
At our hosting facility, we have a 1.2TB RAID server which we use to back up our hosted clients' machines as well as our own. This system uses a number of custom scripts along with rsync to do the job, keeping incremental copies in a day/week/month progression. It works really well, but as we've added clients we have been rapidly running out of space. Then Scott came up with the idea that we use a compressing file-system…
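(The post doesn't show the custom scripts themselves, but for readers curious about the general shape of this kind of job, here is a minimal sketch of an rsync snapshot rotation. The host name and paths are made up, and the use of rsync's --link-dest option is an assumption about the approach, not a description of our actual scripts; --link-dest hard-links unchanged files against the previous snapshot so each increment only costs the space of what changed.)

```python
#!/usr/bin/env python3
"""Minimal sketch of a daily rsync snapshot rotation (illustrative, not our scripts)."""
import datetime
import subprocess
from pathlib import Path

BACKUP_ROOT = Path("/backups/clientbox")   # hypothetical destination tree
SOURCE = "clientbox.example.com:/"         # hypothetical source host

def snapshot() -> None:
    BACKUP_ROOT.mkdir(parents=True, exist_ok=True)
    dest = BACKUP_ROOT / datetime.date.today().isoformat()
    previous = sorted(p for p in BACKUP_ROOT.iterdir() if p.is_dir())

    cmd = ["rsync", "-a", "--delete"]
    if previous:
        # Hard-link files that haven't changed against the most recent snapshot.
        cmd += ["--link-dest", str(previous[-1])]
    cmd += [SOURCE, str(dest)]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    snapshot()
```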

Compressing file-systems were all the rage 15 years ago, which is probably the last time I used one on a regular basis. Of course, back then I remember being happy when I was able to get a hard drive for less than $2 per megabyte (yes, megabyte). Now we have hard drives running around half a dollar per gigabyte, so who needs compression? Just add another disk.

A single hard drive may be dirt cheap, but when you start talking about a bunch of them in a rack-mount chassis with a RAID controller and a system to drive them, the cost adds up pretty quickly. Sure, you can fit 3TB in a 2U or 3U case (3.5 or 5.25 inches high), but filling that case isn't cheap. A compressing file-system would be ideal.

The choices for Linux are pretty slim, though. Many of the implementations you run across haven't been touched in years, across many kernel versions, and the ones that have seen recent work are still in the alpha stage. Not looking good.

We threw some ideas around among the team for a while, and finally I realized that we didn't really need a compressing file-system at all. We could just build a job that runs through the incremental backups, compresses any files within them that aren't already compressed, and writes a log recording the details.
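A rough sketch of that kind of compression pass might look like the following. This is not the actual job; the paths, the skip-list of already-compressed extensions, and the log format are all assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a pass that compresses not-yet-compressed files in a backup tree."""
import gzip
import shutil
from pathlib import Path

BACKUP_ROOT = Path("/backups")                   # hypothetical backup tree
LOG_FILE = Path("/var/log/backup-compress.log")  # hypothetical log
ALREADY_COMPRESSED = {".gz", ".bz2", ".zip", ".jpg", ".png", ".mp3"}

def compress_tree(root: Path) -> None:
    with LOG_FILE.open("a") as log:
        for path in root.rglob("*"):
            # Skip directories and files that already look compressed.
            if not path.is_file() or path.suffix.lower() in ALREADY_COMPRESSED:
                continue
            before = path.stat().st_size
            gz_path = path.with_name(path.name + ".gz")
            with path.open("rb") as src, gzip.open(gz_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
            path.unlink()
            # Record the original and compressed sizes for later review.
            log.write(f"{path} {before} -> {gz_path.stat().st_size}\n")

if __name__ == "__main__":
    compress_tree(BACKUP_ROOT)
```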

I set about building just such a job, and have spent much of the last week and a half running it and tracking down the problems that made it stop prematurely. In fact, this weekend I even modified the script so that it could distribute the compression across two other machines with much faster CPUs.

The backup server has only a 1GHz CPU in it, so it was not particularly speedy at doing the compression. Once I changed the script to push larger files across the network and spread the compression across two 2.6GHz machines, it finished compressing the older files fairly quickly.
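The post doesn't describe exactly how the work was handed off, but one simple way to spread compression across faster boxes is to stream each large file through gzip over ssh and write the result back locally, along these lines. The host names and the size threshold below are invented for the example.

```python
#!/usr/bin/env python3
"""Sketch of offloading gzip of large files to faster machines over ssh."""
import itertools
import subprocess
from pathlib import Path

FAST_HOSTS = itertools.cycle(["fast1.example.com", "fast2.example.com"])  # hypothetical
SIZE_THRESHOLD = 10 * 1024 * 1024  # only ship files larger than ~10MB

def remote_gzip(path: Path, host: str) -> None:
    gz_path = path.with_name(path.name + ".gz")
    with path.open("rb") as src, gz_path.open("wb") as dst:
        # Stream the file to the remote CPU, gzip it there, stream the result back.
        subprocess.run(["ssh", host, "gzip", "-c"], stdin=src, stdout=dst, check=True)
    path.unlink()

def compress(path: Path) -> None:
    if path.stat().st_size >= SIZE_THRESHOLD:
        # Round-robin the big files between the two faster machines.
        remote_gzip(path, next(FAST_HOSTS))
    else:
        # Small files aren't worth the network trip; compress locally.
        subprocess.run(["gzip", str(path)], check=True)
```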

We are now at only 55% disk consumption, making much better use of the existing hardware. That's probably given that server another year's worth of life.

All of this because we work as a team. I knew the problem, but was stuck on one solution. Scott suggested a possible approach, Kevin added some input on it, and we eventually arrived at the best solution to the problem.

