
The Ultimate Storage Server

By Sean Reifschneider, July 7, 2008

Recently our home backup server (which is used for backups of many of our laptops) had a RAID issue, and because of a slightly quirky setup I got confused and ended up losing all the data on it. As I mentioned in some of my other recent posts, I went through some experimentation, and the end result is an almost entirely new system and configuration.

Below I'll discuss some of my choices for this system. Let's start from the end.

The Final Result

Here is how the system ended up:

Capacity: 6TB usable, with double parity
Encryption: Affirmative
Memory: 2GB (probably will upgrade it to 8GB soon)
CPU: Core 2 Quad, 2.4GHz

Goal

I previously had a tummy.com backup server, which we all backed up our laptops to (along with other things, like the pristine disc copies from when we got the laptops), and also a personal storage server for Evelyn and me. In order to save power and space, I decided to combine these into a single system. I hadn't done this previously simply because of the number of discs it would take to hold it all.

I want ZFS because it goes to extraordinary lengths to ensure that what is written to the file-system is not corrupted months or years down the line. Since many of the things we are storing are infrequently accessed (digital photos, copies of scanned bills, invoices, and other business and personal documents, for example), having checksums and the ability to detect and correct corruption is useful.

However, as this is a central point for storing pretty much every bit of information we have, including sensitive business information and private and confidential personal information, it has to be encrypted. I don't want to have to worry about it getting stolen, but this also protects us in case we were to, say, accidentally sell the hard drives on eBay before wiping them. There's just no reason not to store the data encrypted.

OS

I ended up using Ubuntu Hardy 8.04 64-bit for the system, largely because it is the most recent LTS release, so it has a newer kernel with SATA Port Multiplier support and offers encryption in the installer. I had to use the "Alternate" installer to get the encryption during the install.

Encryption

For the system disc, I just let the Ubuntu Alternate installer do its work and set up the encrypted LVM.

For the individual encrypted partitions, I used "cryptsetup luksFormat" to format each one. I had set up all discs with a /dev/sd*4 partition, and the system disc only has partitions 1 and 2, so I know any disc currently hooked up with a 4th partition is one of the encrypted set.
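For example, formatting one of the data partitions looks roughly like this (the device name is just an example; each luksFormat prompts for a passphrase):

# Format the 4th partition on a data disc as a LUKS volume.  This
# destroys whatever is on it.  /dev/sdb is an example device name.
cryptsetup luksFormat /dev/sdb4
# Print the UUID that the unlock script below uses as the mapper name.
cryptsetup luksUUID /dev/sdb4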

When I "luksOpen" them, I give them a named based on the LUKS UUID, so this name will never change unless I reformat the partition. So I have a script like:

# Load the fuse module (needed for zfs-fuse later), then unlock every
# data partition, naming each mapper device after its LUKS UUID.
modprobe fuse
for file in /dev/sd*4; do
   cryptsetup luksOpen "$file" "$(cryptsetup luksUUID "$file")"
done

# The eSATA enclosure sometimes comes up missing drives after a power
# cycle, so stop here rather than starting ZFS with an incomplete set.
if [ "$(ls /dev/mapper | grep -e '.*-.*-.*-.*' | wc -l)" != 14 ]; then
   echo "Didn't find all 14 devices in /dev/mapper.  Exiting."
   exit 1
fi

The "Didn't find all devices" condition is in there because the eSATA controller/Port Multiplier sometimes doesn't see all 5 of the external drives at boot time. This seems to only happen at boot after a power cycle, and clears up after a reboot, but I want to get a warning before I start the ZFS.

ZFS

As mentioned in a previous journal post on ZFS under Linux, I did some research on the ZFS-on-FUSE mailing list to see how things were progressing. It looked like the project was getting regular updates, but only in the development version. I ran some testing before deciding to go down this path. If it fails, I have a fall-back plan of trying a virtual (or even dedicated) machine running OpenSolaris, accessing the crypto volumes via iSCSI.

So far I've spent almost a month using ZFS under Linux, and it's been going very well.

For ZFS I used the latest Mercurial checkout linked from the ZFS FUSE Wiki. There are still fairly regular updates being made, and the latest release tar-file is over a year old. On the mailing list there were regular bug-fixes going in, so the latest checkout seems to be the only way to go.

I ran the "configure" and installed any of the packages it needed. In particular, I remember needing libfuse-dev and fuse-utils. Once "configure" completed without problems, I ran "make" and "make install".

Then you have to start the zfs daemon, which I did by going into the "src/zfs-fuse" directory and running "nohup ./run.sh </dev/null >/dev/null 2>&1 &". This has to be done after every reboot, once the partitions have been decrypted.
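For example, something like this could be appended to the unlock script above (the checkout location is my assumption; point it at wherever your zfs-fuse source tree lives):

# Start the zfs-fuse daemon once the LUKS volumes are open.  The path to
# the checkout is an assumption; adjust it to your own source tree.
cd /usr/local/src/zfs-fuse/src/zfs-fuse
nohup ./run.sh </dev/null >/dev/null 2>&1 &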

Then you can create the ZFS pool with "zpool create data raidz2 /dev/mapper/*-*-*-*", if you have named your partitions like I have and don't have any other /dev/mapper devices whose names match that pattern. WARNING: Everything existing on those drives will be destroyed.

This creates a "/data" directory which is the root of the ZFS file-system. You can then use "zfs" to create new sub-file-systems, snapshots, and more...
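For example, here is roughly how I'd carve out a file-system for the laptop backups, turn on compression, and take a snapshot (the data-set names are just examples):

# Create a child file-system for backups and enable compression on it.
zfs create data/backups
zfs set compression=on data/backups
# Take a snapshot after a backup run; snapshots are cheap in ZFS.
zfs snapshot data/backups@2008-07-07
# Check pool health and redundancy.
zpool status data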

rsync

The primary use of this system is daily backups of a bunch of other systems. One of the drawbacks of Ubuntu 8.04 is that it shipped with rsync version 2, even though rsync 3 was out when Hardy was released. In fact, Fedora 9, which shipped around the same time as Hardy, included rsync 3...

rsync 3 builds the file list incrementally, which gives a huge memory and speed improvement, so I custom-built a Debian package of rsync 3 and installed it on this system.
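For example, a nightly pull from one of the laptops looks roughly like this (host name and paths are hypothetical):

# Pull a laptop's home directories into the ZFS backup file-system.
# With rsync 3 the file list is built incrementally, so large trees no
# longer need huge amounts of memory on either end.
rsync -aH --delete --numeric-ids \
    laptop.example.com:/home/ /data/backups/laptop/home/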

Hardware: Case

The case I used was one of the few pieces of the existing backup box that I kept. I had purchased it around a year ago; the current version of it seems to be the R5605-BK by Rosewill. I selected this case because it will take 5 drives internally, with a nice fan in front of them, but is relatively compact. There are cases that will take 10 or more drives in a tower configuration, but they tend to be huge. This case also has 4 5.25" bays, so I can put a 5-in-3 SATA enclosure in there and have the case hold 10 drives with good cooling.

When I initially got this case I started with the 5 internal drives, and then about 6 months later I added the SATA enclosure (more on that below) to double the capacity. With the current configuration, I'm at 15 drives. So this case has worked out quite well as far as expandability goes.

Cost: $75.

Power Supply

The power supply I used was a high-efficiency unit from Seasonic. It wasn't an exact fit for my new motherboard, and it seems this exact model isn't available any more. The motherboard I used requires an 8-pin CPU power header, but the Seasonic I had has a 4-pin and a 6-pin. I admit that I kludged it together using both of these, but if I were buying new I'd go with something that had an 8-pin CPU power cable.

For example, the SS-500GM seems like a good choice.

Note that I don't have a beefy graphics card in this system. That's why I can get away with a 330W power supply to drive a 90W CPU and 10 hard drives. I went with an "80+ certified" power supply to reduce wasted energy and heat.

Cost: $100

Internal 5-bay Enclosure

In addition to the 5 internal bays, I added a Supermicro Mobile Rack CSE-M35T-1. This fits in 3 5.25" bays and adds 5 3.5" bays, including a nice fan, removable sliding trays in front, and activity lights.

The downside of this enclosure is that it's pretty long. With the original motherboard I had in the system, it would just barely fit. With the current board, the memory slots keep the enclosure from fitting completely flush in the case.

Cost: $110

External 5-bay Enclosure

I also added a 5-bay external enclosure, the AMS DS-2350S. These are kind of pricey, but they let me add a lot more drives without worrying about the internal power supply or space to put them. If I were starting from scratch I'd consider a smaller case and putting all the drives in external enclosures, but then we're talking over $700 just for the enclosures. However, the current setup allows me to add 3 more enclosures without any other system changes.

This enclosure has a 5-port SATA Port Multiplier in it. I posted in my journal previously about SATA Port Multipliers, so read that if you would like more details. The short form is that it requires a recent kernel, but allows 5 drives to be connected to a single SATA port.

Cost: $240

Internal SATA Card

For connecting the 10 internal hard drives, I use 4 of the motherboard's 6 SATA ports plus a Supermicro AOC-SAT2-MV8 8-port PCI-X SATA card. Note that this card can be used in a normal PCI slot, with speed limited to around 100MB/sec, as long as the slot doesn't have tall components mounted behind it.

Cost: $120

External SATA Card

To connect the external 5-bay enclosure, I selected the Addonics ADS3GX4R5-E. This card has 4 external SATA ports and a PCI-X interface, and it is reported to work under Linux. Most of our servers still have PCI-X interfaces in them, so I figured I'd try this card instead of testing out any of the PCI-Express cards (which tend to be even newer, and I've run into more issues with them under Linux).

Going with a PCI-Express card is tempting, since most motherboards these days have PCI-Express rather than PCI-X or extra PCI slots. My choice of PCI-X moved me into a more expensive motherboard, but I also wanted to get more experience with the remote management card described below, so I was already prepared to pay for a pricier board.

Cost: $80

Motherboard

For the motherboard, I decided to get something with at least 2 PCI-X slots because the SATA cards are PCI-X. I probably don't really need the performance, but the ability to do a "scrub" of the data to verify integrity in a reasonable amount of time is a good feature.
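As an aside, a scrub is started and monitored with the standard zpool commands (using the "data" pool created earlier):

# Start a scrub, which reads back and verifies every block in the pool.
zpool scrub data
# Check on its progress and whether any checksum errors were repaired.
zpool status data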

I selected the Supermicro MBD-X7SBE, which is a Core 2 or Xeon 3000 series capable board with 2 PCI-X 100 slots and 2 PCI-X 133 slots, plus 4x and 8x PCI-Express slots.

Part of the reason I selected this board was that it can do full remote management, and I want to get more experience with that.

Cost: $270.

Remote Management

The above motherboard supports an add-in card called the Supermicro AOC-SIM1U+. This card allows you to power on/off/reset via IPMI, a web interface, or SSH, and also allows remote KVM and emulated media via a Java client. I've played with it before and it worked well; we will probably be making much more extensive use of these in the future, and I wanted more experience with them.

Cost: $120.

CPU

The CPU I got is the Intel Q6600 quad-core 2.4GHz. It supports hardware virtualization and both 32- and 64-bit OSes, and it has a lot of horsepower. With my planned encryption of the discs, plus compression of backups via ZFS, I wanted plenty of CPU to spare. This turned out to be almost a perfect fit: I found out later that all 4 cores run between 85 and 90% utilized when I'm running a full ZFS "scrub".

Cost: $210.

RAM

I forgot to order RAM initially, so I used the old RAM out of the existing box, which is 2GB of DDR2 memory. The system will take up to 8GB, and with RAM being as cheap as it is, I'll probably go that direction before long, mainly to keep the disc cache as big as possible and improve backup performance. The 2GB has performed quite well, though.

Cost: 8GB around $170, 2GB around $40.

Hard Drives

I used an existing 250GB Hitachi Deskstar SATA drive for the system disc, and 14 Hitachi Deskstar 500GB drives for the data. I've been very happy with the Hitachi drives I've used, and have had very few failures (out of hundreds). I probably would have sprung for the Ultrastar version, but we had these drives in inventory, already tested, and for our normal server use we should probably be using the Ultrastars anyway.

The drives are where you have to do the hard math. 500GB drives are well under half the price of 1TB drives right now, but when doing RAID you also give up drives to redundancy. So, for the data storage I could have gotten 6 1TB drives at $200 each ($1,200) or 12 500GB drives at $75 each ($900). Needing the $240 external enclosure and $80 controller card to hold the extra drives pretty much makes them even in price ($1,220 versus $1,200). Add on the 2 drives worth of redundancy I want, though, and I'm only spending $150 on parity with the 500GB drives instead of $400 with the 1TB drives...

Cost: $75 each ($1050 for 500GBx14). The 250 was just a spare drive, but new cost would be around $40.

Conclusions

It's been great that Sun open-sourced the ZFS code. It's an impressive technology, and it's nice that we haven't had to duplicate the effort by building a similar set of functionality under Linux. Obviously, the flexibility of Linux has been a huge help here as well: even though ZFS couldn't be brought into the kernel directly, we were still able to use it via FUSE.

Shameless Plug

tummy.com has smart people who can bring a diverse set of knowledge to augment your Linux system administration and managed hosting needs.
