Wednesday December 09, 2009 at 05:27
Subject: ZFS dedup Available in ZFS-FUSE
Keywords:
Dedup, Linux, Technical, ZFS
Posted by: Sean Reifschneider
Related entries: 6 Months of ZFS in Linux by Sean Reifschneider, Saturday January 24, 2009 at 19:38
Not to subvert the recent 0.6.0 release of ZFS-FUSE, which I think is
great... But the thing I'm really interested in is the deduplication code
that was released just recently and isn't even in the current OpenSolaris
release yet. Read below the fold for more information on dedup in
ZFS-FUSE.
For those of you who aren't up on it, the ZFS dedup code works at the
block level, which is great news for files that are largely the same but
differ in some small components. Copies of the same file (say, from
rsyncing multiple similar machines to a backup server) will obviously be
deduplicated. But it will also deduplicate (most of) prelinked files that
are largely the same on multiple machines, but may differ slightly in the
pre-linking information.
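To make the block-level idea concrete, here is a toy sketch in Python. It
is not how ZFS implements dedup: the 4096-byte block size, the DedupStore
class and the make_file helper are all made up for illustration, and a real
filesystem keeps reference counts and on-disk tables that this ignores. It
only shows why two files that differ in one small spot end up sharing every
other block.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative only; not a ZFS default


class DedupStore:
    """Toy block store keyed by checksum (hypothetical, not the ZFS format)."""

    def __init__(self):
        self.blocks = {}  # checksum -> block bytes, each unique block stored once

    def write_file(self, data):
        """Split data into fixed-size blocks and return the list of checksums."""
        refs = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            key = hashlib.sha256(block).digest()
            self.blocks.setdefault(key, block)  # store only if not seen before
            refs.append(key)
        return refs

    def read_file(self, refs):
        """Rebuild a file from its list of block checksums."""
        return b"".join(self.blocks[key] for key in refs)


def make_file(seed, n_blocks):
    """Build test data made of n_blocks distinct blocks (hypothetical helper)."""
    return b"".join(bytes([seed + i]) * BLOCK_SIZE for i in range(n_blocks))


store = DedupStore()
file_a = make_file(0, 8)

# file_b is file_a with a single byte changed inside the fourth block,
# standing in for a prelinked binary that differs slightly between machines.
file_b = bytearray(file_a)
file_b[3 * BLOCK_SIZE + 10] ^= 0xFF
file_b = bytes(file_b)

refs_a = store.write_file(file_a)
refs_b = store.write_file(file_b)

logical = len(refs_a) + len(refs_b)
stored = len(store.blocks)
print(f"{logical} logical blocks, {stored} stored")  # 16 logical blocks, 9 stored
```

Only the one modified block is stored a second time; the other seven are
shared between the two files.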
The 0.6.0 ZFS-FUSE release doesn't include dedup, not surprisingly.
I did some digging around and I found this git repository which has a
version of ZFS-FUSE that includes the dedup code:

git clone 'http://rainemu.swishparty.co.uk/git/zfs' zfs-fuse-dedupe

I've installed this on a test system and am currently running some stress
testing of it and some basic testing of dedup, and everything seems to be
working as expected. It'll take weeks or more before I'm ready to try
putting any real data on it though. My first test, which I started Sunday,
was interrupted by what I suspect was a hardware problem with one of my
drive enclosures.

I've been running with "dedup=verify", which I believe should be the
default but isn't. By default, blocks that have the same checksum are
considered to be the same block. "dedup=verify" takes it one step further
and verifies that the contents are indeed the same before deduplicating
the block.

One other thing I've really been impressed with on that version is the
memory consumption. My system running an old version of ZFS-FUSE is
currently consuming 2.6GB of RAM. The new test system running the above
version never went over 600MB. Which is good -- my test system is maxed
out at 2GB of RAM (Atom 330 system with the older 2GB chipset).

I hope to convert my main home storage server over to 0.6.0 at some point
soon. But first I'd like to run some stress testing... It's not a huge
deal if I lose the data on there, it's all backed up. But I'd prefer to
get a bit more comfortable with it first.
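
For anyone wondering what difference "dedup=verify" makes compared to
trusting the checksum alone, here is a rough Python sketch. It is a toy
under loud assumptions: the ToyDedupTable class and the weak_checksum
function are invented for this example, and the 8-bit checksum is
deliberately terrible so that a collision is easy to produce. ZFS dedup
uses a strong checksum, where collisions are vastly less likely; the
sketch only illustrates what goes wrong if one ever does happen.

```python
def weak_checksum(block):
    """Deliberately weak 8-bit checksum (hypothetical) to force a collision."""
    return sum(block) % 256


class ToyDedupTable:
    """Toy dedup table (hypothetical); verify=True mimics the idea of dedup=verify."""

    def __init__(self, verify):
        self.verify = verify
        self.table = {}  # checksum -> list of distinct blocks with that checksum

    def write(self, block):
        """Store a block and return a (checksum, index) reference to it."""
        key = weak_checksum(block)
        bucket = self.table.setdefault(key, [])
        if bucket and not self.verify:
            # Checksum-only mode: a matching checksum is assumed to mean
            # identical contents, so just point at the existing block.
            return (key, 0)
        # Verify mode: only deduplicate if the bytes really match.
        for i, existing in enumerate(bucket):
            if existing == block:
                return (key, i)
        bucket.append(block)
        return (key, len(bucket) - 1)

    def read(self, ref):
        key, index = ref
        return self.table[key][index]


block_a = b"\x01\x02"
block_b = b"\x02\x01"  # different contents, same weak checksum (3)

unsafe = ToyDedupTable(verify=False)
unsafe.write(block_a)
ref_b_unsafe = unsafe.write(block_b)

safe = ToyDedupTable(verify=True)
safe.write(block_a)
ref_b_safe = safe.write(block_b)

print(unsafe.read(ref_b_unsafe) == block_b)  # False: block_a's data comes back
print(safe.read(ref_b_safe) == block_b)      # True: contents were compared first
```

The cost of verifying is an extra read and compare every time a checksum
matches, which is the trade-off for not relying on the checksum alone.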