@HoneyBadger
Very easy - in my understanding, ZFS is great in theory (and it's a storage geek's wet dream), but in practice it has more downsides than the alternative solutions. Don't get me wrong, I love the concepts behind ZFS, and for a home lab it's fantastic, but not for a production system, for the following reasons:
Here is my experience: do we really need all the fancy features like self-healing and so on? In my 20 years of data and storage experience I have never seen anything like bit rot. I have also been through five total power outages on enterprise servers and no data was ever lost or damaged (all under Linux MDADM RAID or LSI hardware RAID with ext4). In those same 20 years, three hard disks physically died on me, and RAID 5 handled every one of them. Some of our servers had ECC memory and some didn't; again, in 20 years nothing ever went wrong that was caused by memory.
So in my view, ZFS covers some obscure tail risks of highly unlikely data damage, and in return you have to live with the extra complexity and the fact that there are no repair tools. Once something goes wrong, everything is gone. How can that be a sensible risk management strategy?
And here are the specific downsides I see in ZFS:
- first of all, and this hasn't even been mentioned here: zpools are not (!) compatible across different server environments. If you run ZFS on Linux you cannot import your pool into TrueNAS, or vice versa. I am not sure what causes this restriction, but it is a no-go for any serious filesystem. A filesystem is an abstraction layer that should be independent of the rest of the environment: you can mount an ext4 filesystem anywhere, and you can import a Linux MDADM RAID array into any Linux machine (see the command sketch after this list). That is a major factor for data recovery.
- ZFS comes with a bunch of complexity but no really new features (the vdev concept already existed in Linux; it is just called LVM there). And why even use more than one vdev in a zpool if a problem in any single vdev kills the entire zpool? Obviously it is safer to keep a separate zpool per vdev. So: even more complexity for no apparent benefit.
- no recovery tools (!!??) -> this alone would disqualify any filesystem from serious use outside the home lab (which I do appreciate)
- "force-mounting can cause permanent damage" and "plenty of users have lost everything because of wrong commands" -> I dont think MDADM and LVM are that fragile. Its a key concept of a filesystem to not allow damaging actions.
- Scrubs (a key feature of ZFS self-healing) can cause more damage than people realize: "scrubs can completely destroy a zpool that is healthy because of bad RAM". Again, a filesystem feature should not be able to damage the filesystem's own integrity (even on non-ECC machines).
- cache management is overly complicated and apparently doesn't improve anything; reading about it just makes you want to hug your RAID controller with its hardware cache and battery backup :) or simply rely on the normal caching that standard Linux already provides.
- snapshots already existed in Linux storage (via LVM), so again nothing new. And if you rsync a snapshot to another location, you know it is safe and works, because rsync and LVM have been around forever (the workflow is sketched after this list). With zfs send you never really know whether a restore will actually work.
- You cannot add more disks to an existing vdev. With Linux LVM this is not a problem (see the sketch after this list).
- deduplication and compression are nice in theory, but nobody uses them in practice because storage space has become cheap and they are "discouraged when performance is important"; "even the designers of ZFS recommend not using deduplication".
- I am not sure ZFS encryption has any advantage over LUKS, so no benefit here either (compare the commands after this list).
- iSCSI is not working / not recommended. This is a key feature of a storage solution, and any Linux distro can do it (the target is even in the kernel).
- ZFS is apparently not usable as an ESXi datastore -> that would be the perfect use case in an enterprise environment, but again: too complex with ZFS and not advised.
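To make the comparison with the classic Linux stack concrete, here are a few rough command sketches for the points above. Pool, volume group and path names (tank, vg0, /mnt/data, backuphost and so on) are placeholders I made up, not from any real setup. First, moving an MDADM + ext4 array to another Linux box versus moving a zpool:

```
# MDADM + ext4: plug the disks into any Linux machine and reassemble
mdadm --assemble --scan        # detect and assemble existing MD arrays
mount /dev/md0 /mnt/data       # mount the ext4 filesystem on top of it

# ZFS: export on the old system, import on the new one -- and the target
# platform must support every feature flag the pool was created with
zpool export tank
zpool import                   # list pools that are visible for import
zpool import tank              # import the pool by name
```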
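The LVM snapshot plus rsync backup workflow I mean is basically this (again just a sketch):

```
# take a point-in-time snapshot of the logical volume holding the data
lvcreate --size 10G --snapshot --name data_snap /dev/vg0/data

# mount it read-only and copy it off with plain old rsync
mount -o ro /dev/vg0/data_snap /mnt/snap
rsync -aHAX --delete /mnt/snap/ backuphost:/backups/data/

# drop the snapshot again
umount /mnt/snap
lvremove -y /dev/vg0/data_snap
```

Every tool in that chain can be checked on its own, and the copy on the other side is a plain directory tree you can browse and verify.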
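And growing storage by simply throwing another disk at it, which is the kind of thing you cannot do with an existing raidz vdev:

```
# add a new disk to the volume group, then grow the volume and filesystem
pvcreate /dev/sdd
vgextend vg0 /dev/sdd
lvextend -l +100%FREE /dev/vg0/data
resize2fs /dev/vg0/data        # ext4 can be grown online
```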
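For encryption, both approaches boil down to a couple of commands, which is exactly why I don't see the added value of the ZFS-native variant:

```
# LUKS: sits underneath whatever RAID/LVM/filesystem stack you like
cryptsetup luksFormat /dev/md0
cryptsetup open /dev/md0 securedata
mkfs.ext4 /dev/mapper/securedata

# ZFS native encryption: per-dataset, keys handled by ZFS itself
zfs create -o encryption=on -o keyformat=passphrase tank/secure
```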
So I love playing around with ZFS, but I am not sure I would use it on a server where the data matters.