Your Linux Data Center Experts

I've now been running ZFS with FUSE under Linux for around 6 months. I still have my storage server running ZFS, and while it hasn't had too many problems (ZFS has always caused me some level of stability trouble), I will say that my storage box is now the only system I have running ZFS. In other words, I've gotten rid of the Solaris-based ZFS machines but did not replace them with FUSE-based Linux systems as I was hoping to. Read on for more of this saga.

I had tried to migrate our backup servers over to Linux with ZFS and FUSE, but I ended up with FUSE crashing as I was trying to import the historic backup data. Not that unusual, because the OpenSolaris kernel systems we've been running for the last year have also crashed sometimes when we're importing historic data (via “zfs recv”). However, in one of the FUSE crashes I ended up with the ZFS file-system totally trashed.

The message ZFS was giving me was basically “re-format and reload your file-system from backups now”. The unfortunate thing is that reloading from backups is exactly what I was doing when ZFS+FUSE wigged out – using “zfs recv” to re-load the data.

This is the first time I've run into the file-system getting totally trashed with ZFS. So I went ahead and re-loaded the system with the latest Nexenta, going back to ZFS on the OpenSolaris kernel. And I'll be darned if the same thing didn't happen there.

So it's looking like the newer versions of ZFS are just more unstable than the previous ones I have been using. Which, as I've mentioned, isn't saying much. Our ZFS systems running the OpenSolaris kernel have had to be rebooted typically monthly, and usually daily when doing “zfs recv” to load data into them. In fact, in some cases we've just had to re-create the snapshots using rsync instead of “zfs send”/“zfs recv”, because the receives would reliably crash the machine.
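For context, the replication pattern that kept crashing the receivers looks roughly like this – a sketch only, with made-up pool, dataset, and host names:

```shell
# Incremental replication with ZFS -- this is the pattern that would
# reliably crash our receiving machines (names here are hypothetical):
zfs snapshot tank/backups@today
zfs send -i tank/backups@yesterday tank/backups@today | \
    ssh backuphost zfs recv -F tank/backups

# The rsync fallback: copy the live data over, then snapshot on the
# receiver. Slower, but it doesn't take the receiver down.
rsync -aH --delete /tank/backups/ backuphost:/tank/backups/
ssh backuphost zfs snapshot tank/backups@today
```

The rsync variant gives up send/recv's block-level incremental transfer, but it keeps the machine up.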

So I ended up switching our backup servers back to running Linux and ext3. I found some software called “BackupPC” which I had never heard of before, but it works brilliantly. It provides many of the benefits we were getting under ZFS: snapshots are implemented by hard linking into a common pool, and stored files are compressed, even better than with ZFS's compression. It lacks the ability of ZFS snapshots to store only the changed portion of a large file that is being appended to. But it has one ability ZFS does not: storing only a single copy of a file that is identical across many machines…
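BackupPC's pooling boils down to hard links: an identical file, no matter which host or backup run it came from, is one inode with many directory entries. A minimal sketch of the idea (the paths and the hash-style pool name are hypothetical, not BackupPC's actual layout):

```shell
# Minimal sketch of hard-link pooling (hypothetical paths): two hosts'
# backups of an identical file share one inode, so it is stored once.
dir=$(mktemp -d)
mkdir -p "$dir/pool" "$dir/host1" "$dir/host2"

echo "identical config file" > "$dir/pool/abc123"  # pooled copy, named by content hash
ln "$dir/pool/abc123" "$dir/host1/motd"            # host1's backup tree entry
ln "$dir/pool/abc123" "$dir/host2/motd"            # host2's backup tree entry

stat -c %h "$dir/pool/abc123"  # link count: 3 names, one copy on disk
```

This is also why it hammers the file-system: every file in every backup run costs at least an extra link and directory entry, which is a lot of metadata traffic.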

Because of the many hard links, this does end up hammering the file-system much harder than ZFS did. So we are basically going from 3 overloaded backup servers to 4 overloaded backup servers – or requiring about 33% more hardware. This might have been mitigated by using XFS under Linux, but I've had a less than stellar history with XFS in the past.

The plus side is that the Linux setup is rock solid, and not nearly as picky about hardware as the OpenSolaris side. The only reboots the backup servers have had over the last 4 months have been for new kernel updates…

Back to the storage server. It's been working well. About once every 2 months the ZFS FUSE daemon will die and I have to restart it. So far it has not munged the file-system.

One of the real annoyances I've run into, though, is that ZFS+FUSE doesn't allow you to export the file-system via NFS. A real nuisance on a storage server. The other issue I've had is that the performance of the file-system can at times be rather bad. It seems to boil down to ZFS+FUSE not caching data in RAM, so if you do multiple “find” runs over a reasonably sized directory tree, every one of them will seek all around the disk. “locate” is your friend.
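The “locate” workaround amounts to building a database over the ZFS mount once, so repeated lookups never touch the FUSE layer at all. Something like this – mlocate's flags shown, and the paths are examples (GNU findutils' updatedb spells its options differently):

```shell
# Index the ZFS mount once (mlocate syntax; paths are examples):
updatedb -l 0 -U /tank -o /var/tmp/tank.locatedb

# Subsequent lookups hit the database, not the uncached FUSE file-system:
locate -d /var/tmp/tank.locatedb 'somefile*'
```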

So, for some uses ZFS+FUSE isn't really that bad, as long as you can live with the limitations. Back up your data, of course, but that shouldn't be any different advice than for any file-system you run.

It is really unfortunate that ZFS has not been able to get into the Linux kernel. It's got a lot of benefits, and I think both Solaris and Linux would benefit from having ZFS in the kernel. The current thinking seems to be that btrfs will be the next generation of file-system for Linux. But it's got a long way to go before it even reaches the level of maturity that ZFS currently has, let alone the stability of ext3.

I predict we're at least 2 years away from btrfs being really usable in a production environment. Which leaves Linux as the only Unix system without snapshots in its file-systems. Block-device level snapshots just really are not that useful, because you either have to pre-allocate enough space to hold all the deltas, or you have to kludge something together to detect the snapshot filling up and extend the snapshot device.
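To illustrate the block-device snapshot problem, here is what it looks like with LVM (volume-group and LV names are made up): you have to guess the delta space up front, then babysit it afterwards.

```shell
# The snapshot's copy-on-write space must be sized up front (names hypothetical):
lvcreate --snapshot --size 10G --name data-snap /dev/vg0/data

# If the deltas outgrow that 10G the snapshot is invalidated, so you end
# up polling the fill level and extending by hand (the percent field's
# name varies across LVM versions):
lvs -o lv_name,snap_percent vg0
lvextend --size +5G /dev/vg0/data-snap
```

That polling loop is exactly the kludge a file-system-level snapshot (ZFS, or eventually btrfs) makes unnecessary.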

So, the hottest OS on the planet suffers from having a pretty retro file-system behind it. Admittedly though, the file-system it has is rock solid now. I'm just concerned that the btrfs project doesn't remember how long it took to get there…
