Saturday January 24, 2009 at 19:38
Subject: 6 Months of ZFS in Linux
Keywords: Technical, ZFS
Posted by: Sean Reifschneider
Related entries: A month of ZFS under Linux by Sean Reifschneider, Friday August 08, 2008 at 16:57
ZFS dedup Available in ZFS-FUSE by Sean Reifschneider, Wednesday December 09, 2009 at 05:27
18 months of ZFS-FUSE by Sean Reifschneider, Wednesday December 09, 2009 at 06:33
Adding an Audit Interface to BackupPC by Mike Loseke, Thursday March 05, 2009 at 15:29
I've now been running ZFS with FUSE under Linux for around 6 months.
I still have my storage server running ZFS, and while it hasn't had too
many problems (ZFS has always caused me some level of stability problems),
I will say that my storage box is now the only system I have running
ZFS. In other words, I've gotten rid of the Solaris-based ZFS machines but
did not replace them with FUSE-based Linux systems as I was hoping to.
Read on for more of this saga.
I had tried to migrate our backup servers over to Linux with ZFS and
FUSE, but I ended up with FUSE crashing as I was trying to import the
historic backup data. Not that unusual, because the OpenSolaris kernel
systems we've been running for the last year have also crashed sometimes
when we're importing historic data (via "zfs recv"). However, in one of
the FUSE crashes I ended up with the ZFS file-system totally trashed.
The message ZFS was giving me was basically "re-format and reload
your file-system from backups now". The unfortunate thing is that
re-loading data with "zfs recv" is exactly what I was doing at the time
ZFS+FUSE wigged out.
This is the first time I've run into the file-system getting totally
trashed with ZFS. So I went ahead and re-loaded the system with the latest
Nexenta, going back to ZFS on the OpenSolaris kernel. And I'll be darned if the
same thing didn't happen there.
So it's looking like the newer versions of ZFS are just more unstable
than the previous ones I have been using. Which, as I've mentioned, isn't
saying much. Our ZFS systems running the OpenSolaris kernel have had to be
rebooted typically monthly, and usually daily when doing "zfs recv" to load
data into them. In fact, in some cases we've just had to re-create the
snapshots using rsync instead of "zfs send"/"zfs recv", because the
receives would reliably crash the machine.
So I ended up switching our backup servers back to running Linux and
ext3. I found some software called "BackupPC" which I had never heard of
before, but it works brilliantly. It provides many of the benefits we were
getting under ZFS: snapshots (provided by hard linking into a pool) and
compression of stored files (even better than ZFS's compression). It
lacks the ability of ZFS snapshots to store only the changed portion
of a large file that's being appended to. But it does have an ability
that ZFS does not, which is to store only one copy of the same file across
many machines...
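To make the "hard linking into a pool" idea concrete, here is a minimal Python sketch. It is not BackupPC's actual code or on-disk layout; the function names, the flat pool directory, and the choice of SHA-256 as the pool key are all assumptions made for illustration. The point is only that identical content from any number of machines ends up as one file with several hard links.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store_in_pool(pool_dir, src_path, dest_path):
    """Hard link dest_path to a pooled copy of src_path's contents.

    Hypothetical helper: returns True if the content was new to the
    pool, False if an existing pooled copy was reused.
    """
    os.makedirs(pool_dir, exist_ok=True)
    dest_dir = os.path.dirname(dest_path)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    # The pool file is named after the content hash, so identical
    # files from different machines map to the same pool entry.
    pooled = os.path.join(pool_dir, file_digest(src_path))
    is_new = not os.path.exists(pooled)
    if is_new:
        shutil.copyfile(src_path, pooled)
    # The backup tree gets a hard link, not a second copy of the data.
    os.link(pooled, dest_path)
    return is_new
```

Every backed-up file becomes at least one extra directory entry pointing at the pool, which is where the heavy file-system load from all those hard links comes from.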
Because of the many hard links, this does end up hammering the
file-system much harder than ZFS did. So, we are basically going from 3
overloaded backup servers to 4 overloaded backup servers -- or requiring
about 33% more hardware. Using XFS under Linux might have mitigated
this, but I've had a less than stellar history with XFS.
The plus side is that the Linux setup is rock solid, and not nearly as
picky on hardware as the OpenSolaris side. The only reboots the backup
servers have had over the last 4 months were for new kernel
updates...
Back to the storage server. It's been working well. About once every
2 months the ZFS FUSE daemon will die and I have to restart it. So far it
has not munged the file-system.
One of the real annoyances I've run into though is that ZFS+FUSE
doesn't allow you to export the file-system via NFS. A real nuisance on a
storage server. The other issue I've had is that the performance of the
file-system can at times be rather bad. It seems to boil down to
ZFS+FUSE not caching data in RAM, so if you do multiple finds on a
reasonably sized directory, it's going to be seeking around the disc for
all of them. "locate" is your friend.
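The reason "locate" helps is that it walks the tree once, saves the list of paths, and answers later searches from that list instead of from the disc. A rough Python sketch of the same idea, assuming nothing about how locate itself is implemented (the function names here are made up for illustration):

```python
import fnmatch
import os


def build_index(root):
    """Walk root once and return a sorted list of every path under it."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))
    paths.sort()
    return paths


def search_index(index, pattern):
    """Return indexed paths whose basename matches a shell-style pattern."""
    return [p for p in index if fnmatch.fnmatch(os.path.basename(p), pattern)]
```

Only build_index touches the file-system; repeated calls to search_index run entirely from memory. The trade-off is the same as with locate: results are only as fresh as the last time the index was built.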
So, for some uses ZFS+FUSE isn't really that bad, as long as you can
live with the limitations. Back up your data, of course, but that
advice applies to any file-system you run.
It is really unfortunate that ZFS has not been able to get into the
Linux kernel. It's got a lot of benefits, and I think both Solaris and
Linux would benefit from having ZFS in the kernel. The current thinking
seems to be that btrfs will be the next generation of file-system for
Linux. But it's got a long way to go before it even reaches the level of
maturity that ZFS currently has, let alone the stability of ext3.
I predict we're at least 2 years away from btrfs being really usable
in a production environment. Which leaves Linux as the only Unix system
without snapshots in its file-systems. Block-device level snapshots just
really are not that useful because you either have to pre-allocate enough
space to hold all the deltas, or you have to kludge something together to
detect the snapshot filling up and extend the snapshot device.
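For a sense of what that kludge looks like, here is a minimal Python sketch of the decision half of such a watchdog. It assumes you feed it "name,percent" lines, for example from something like `lvs --noheadings --separator , -o lv_name,snap_percent`; the function names, the 80% threshold, and the "+1G" step are arbitrary choices for illustration. It only builds the lvextend command lines and does not run them.

```python
def parse_snapshot_usage(lvs_output):
    """Parse lines of 'name,percent' into {name: percent_full}.

    Lines with an empty percent field (volumes that are not
    snapshots) are skipped.
    """
    usage = {}
    for line in lvs_output.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 2 or not fields[0] or not fields[1]:
            continue
        usage[fields[0]] = float(fields[1])
    return usage


def plan_extensions(usage, vg_name, threshold=80.0, step="+1G"):
    """Return lvextend command lines for snapshots at or above threshold."""
    return [
        ["lvextend", "-L", step, f"{vg_name}/{name}"]
        for name, percent in sorted(usage.items())
        if percent >= threshold
    ]
```

You would still have to run this from cron often enough that a burst of writes cannot fill the snapshot between checks, and make sure the volume group has free space to extend into, which is exactly why it feels like a kludge.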
So, the hottest OS on the planet suffers from having a pretty retro
file-system behind it. Admittedly though, the file-system it has is rock
solid now. I'm just concerned that the btrfs project doesn't remember how
long it took to get there...