tummy.com: we do linux

Recent Entries

Below is a summary of the most recent journal entries. A full index of all entries is also available.

(Sunday August 29, at 15:13) Kevin Fenzi
Subject: Calibre
Keywords: fedora, linux, Reviews, software

A while back I took over work on the Fedora calibre package, and I thought I would share some information about the application for those who haven't used it, along with some thoughts about software release cycles. Read on for more.


(read more | 3 Comments)

(Thursday August 26, at 16:51) Sean Reifschneider
Subject: Getting the program name in scripts
Keywords: bash, NCLUG, Technical

In the (distant) past I've used "`basename $0`" to get the name of the currently running script.

However, several years ago I learned about "${0##*/}", and have switched over to using that. The benefit is that you don't have to fork a new process, and it's also very concise, but admittedly more arcane than "basename".

The way I remember it is that you can do "${var%glob}" "${var%%glob}" "${var#glob}" "${var##glob}", where they strip either at the beginning or the end of the string. The single version is non-greedy match (matches the shortest match) and the double matches greedily. I remember which one is front versus end by thinking that "#" is like a comment, and comments often come at the beginning of the line. :-) So "${0##*/}" means "strip everything before the last / of $0".
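Here is a quick sketch of the four forms side by side. The path and the "strip_dir" helper are made up for illustration and are not from the original post:

```shell
# Made-up example path; any string containing "/" and "." works.
path=/usr/local/bin/backup.tar.gz

echo "${path#*/}"    # shortest match stripped from the front: usr/local/bin/backup.tar.gz
echo "${path##*/}"   # longest match stripped from the front:  backup.tar.gz
echo "${path%.*}"    # shortest match stripped from the end:   /usr/local/bin/backup.tar
echo "${path%%.*}"   # longest match stripped from the end:    /usr/local/bin/backup

# Hypothetical helper doing for any path what "${0##*/}" does for $0.
strip_dir() {
    printf '%s\n' "${1##*/}"
}

strip_dir "$0"       # prints the name this script was run as, minus any directory
```

All of this is done by the shell itself, so no extra process is started.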
(go to article | 1 Comment)


(Tuesday August 24, at 03:47) Sean Reifschneider
Subject: A very useful tool: fping
Keywords: NCLUG, Networking, Technical

Recently a friend was saying that one of his co-workers doesn't really write scripts, and had asked my friend to write a script to ping a bunch of hosts. The use was for inventory in preparation for a big move, to make sure they weren't missing hosts or IPs that needed to be moved.

I immediately recommended "fping". Its default mode will read target names from stdin and ping them, and it times out after a few tries, displaying the results. It also pings them asynchronously, so pinging huge sets of hosts goes pretty fast. In a test here, I ran it on 254 addresses in 35 seconds.

Read on for more details and examples of what fping can do.
(read more | 1 Comment)


(Tuesday August 24, at 03:08) Sean Reifschneider
Subject: Avoiding your SSH daemon being killed during OOM.
Keywords: Linux, NCLUG, Technical

I've often had a love/hate relationship with the OOM killer, only without the love. The OOM killer always seems to look at the processes and decide that "hey, this one process is big, but it's really active, so I'll kill off some other processes that aren't being used right now."

Of course, that leads to the OOM killer terminating things like the SSH daemon to save the multi-gigabyte process that's leaking memory and showing no signs of slowing... In most of those cases, I'd prefer that its decision were reversed.

Through my playing around with zfs-fuse (a process that can get to be quite big, but that you almost never want to OOM kill), I've found that there's a way to immunize processes against the OOM killer.

More recent kernels have an "oom_adj" file under /proc/$PID, into which you can echo values between +15 and -17. If you run:

echo -17 >/proc/`cat /var/run/sshd.pid`/oom_adj

the "-17" value should get set for your SSH daemon process so that it avoids being a candidate for the OOM killer.
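As a rough sketch, the same command can be wrapped in a function that reads the value back to confirm it was accepted. The "oom_protect" name and the optional second argument (an alternate proc root, handy for trying it out without root access) are assumptions for illustration, not part of the original post:

```shell
# Hypothetical helper, not from the original post.
# Usage: oom_protect PIDFILE [PROC_ROOT]
oom_protect() {
    pidfile=$1
    proc_root=${2:-/proc}            # assumption: overridable for dry runs

    pid=$(cat "$pidfile") || return 1
    adj_file="$proc_root/$pid/oom_adj"

    if [ ! -w "$adj_file" ]; then
        echo "cannot write $adj_file" >&2
        return 1
    fi

    echo -17 > "$adj_file" || return 1

    # Read the value back to confirm it was stored.
    [ "$(cat "$adj_file")" = "-17" ]
}
```

On a real system this would be run as root, for example "oom_protect /var/run/sshd.pid".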

You can read more details about this and other memory-handling issues in the excellent LWN article from 2009 Taming the OOM killer.
(go to article | 0 Comments)


(Thursday August 19, at 19:26) Sean Reifschneider
Subject: I avoid "rm *".
Keywords: Linux, NCLUG, Process

I'm worried about what happens if a command from the history is accidentally re-run -- whether by me or by someone else on the system. It happens. So, instead of doing "rm *" I'll usually do "cd .." and then "rm myfiles/*". And for things like "rm -rf mysql" if I'm re-initializing a database or something, I'll often do "mv mysql mysql.old" and then "rm -rf mysql.old". Repeating this command or running it on the wrong server would hopefully never break things.
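The rename-then-delete habit can be tried safely in a scratch directory. This is only a demonstration under made-up conditions: the "mysql" directory here is an empty stand-in, not a real database.

```shell
# Demonstration in a scratch directory; "mysql" is just an empty stand-in.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir mysql

mv mysql mysql.old      # step 1: rename, nothing is deleted yet
rm -rf mysql.old        # step 2: the destructive command names only the .old copy

mkdir mysql             # the re-initialized data directory
rm -rf mysql.old        # accidental re-run from history: a harmless no-op
ls                      # "mysql" is still there
```

The point is that the only destructive command left in the history refers to a name that normally does not exist.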

These are the things that I worry about as a sysadmin. :-)
(go to article | 0 Comments)


(Tuesday August 10, at 15:56) Kevin Fenzi
Subject: Fun with DNSSec
Keywords: DNS, Fedora, Linux, Security

The other day the question came up about how to better provide ssh host keys to end users in a secure manner. Sure, you can publish them somewhere and the user can check them the first time they connect, but that's prone to human error and not very automated. Turns out you can put SSH host key fingerprints into DNS for easy checking. Of course then the problem becomes how can you check that the DNS data is valid and correct? That's where DNSSec comes into play.

Read on for my saga of implementing dnssec on my home domain...


(read more | 0 Comments)

(Saturday July 31, at 22:29) Sean Reifschneider
Subject: Claws-Mail vs. Thunderbird
Keywords: Claws, Linux, NCLUG, Review, Thunderbird

Last month Thunderbird broke horribly for me... Whenever I would try to start it, it would never show the main GUI, it would just grow and grow until it reached 3GB of RAM usage and then die. My laptop is running a 32-bit install, so 3GB is the max single process size. Not being able to get anywhere in the GUI made it hard to do any diagnostics, and starting in "safemode" wasn't helping either. I decided to give Claws-Mail a try.

Read on for my comparison between them.
(read more | 0 Comments)


(Friday July 16, at 16:08) Kyle Anderson
Subject: Setting up a PXE boot Server
Keywords: dhcp, Linux, NCLUG, Netbooting, networking, PXE, technical, tftp

Ever fumbled around your house looking for a Linux CD, but you can't find it? Ever suspected that your RAM was going bad, but you didn't have a way to test it? Ever needed to back up files on a computer that had a dead and broken operating system? A solution to all these problems is a super cool tool called PXE booting. PXE booting allows you to load alternative operating systems over the network, without the need for CDs, CD-ROM drives, etc!

Sound like something that you could benefit from? Setting up your own PXE boot environment is easy and fun! Check out my journal entry and then later my wiki page for notes, commands, and configuration snippets to get your own setup going.
(go to article | 0 Comments)


(Sunday June 20, at 16:08) Sean Reifschneider
Subject: Competing teams and HP BASIC.
Keywords: History

Slashdot has a story about using competing development teams to find the best solution. This reminded me of the HP-BASIC or "Rocky Mountain Basic" project.

The story I heard, in the late '80s when I was working on testing in the Loveland Instrument Division of HP on their port of RMB to Unix, was that HP was looking to create a BASIC variant, and they set up two teams, one in Colorado and one on the east coast. These variants were Rocky Mountain Basic and East Coast Basic.

The developers were given some time to work on their visions, and then presentations were set up to allow a choice between them to be made. Once released, RMB was generally considered to be a pretty great dialect. I programmed in RMB early on, while I was also learning Pascal, and found that RMB was much less painful than the other BASICs I had dealt with, like Microsoft's.

I also remember a meeting where our manager asked a bunch of us for suggestions on what to call the port to HP-UX, and we were pretty much all in favor of "RMB/UX" or "BASIC/UX". But our manager dropped the bomb: "It can't have a slash in it." I remember everyone being pretty annoyed at that. But, in looking at the RMB Wikipedia page, it looks like the slash won out in the long run.
(go to article | 0 Comments)


(Sunday June 06, at 14:39) Sean Reifschneider
Subject: Native ZFS coming to Linux?
Keywords: NCLUG, ZFS

A recent thread on the zfs-fuse mailing list has announced that the long-awaited Lustre project to make a native ZFS module for Linux has made good progress. This was announced as being the future for Lustre probably a year ago, but I haven't heard anything about it until this post on the list.

Things are still pretty early it sounds like -- zfs-fuse is likely to be the best choice for probably the next 6 months at least, but it is a significant step toward getting ZFS native under Linux.

The ZFS license is still CDDL, which means that it won't be included in the kernel.org kernel; instead it'll be an out-of-kernel module like DRBD (until recently) or Xen, etc...

This comes at the same time as the ZFS-FUSE 0.6.9 release, which includes deduplication and many other great features. In my testing of 0.6.9b3, it's been working really great. I've been hammering on it with both "zfsstress" and also running it on a test backup server, and it's been running very solidly.

The deduplication has been working well, though you really do need a lot of memory in the ARC cache if you want it to perform well. For this system with 8x2Tb drives, I figure I'll need to put at least 8, and possibly 16GB in the ARC cache. I currently have 8GB RAM, and a 2GB ARC, which is about as much as I can do in an 8GB system. The host will take up to 32GB RAM though, so I have room to grow. My plan is to upgrade it to 8GB and push the ARC up to 8GB, then see how it works. I blew out the original 800MB ARC with deduplication at around 900GB stored in the pool.

It looks like with compression plus deduplication I'm getting a 1.9:1 space savings. Not sure how this compares to the deduplication+compression in BackupPC, but I'm expecting it to do much better simply because I can do block-level changes (large files that just have small appends/updates to them, like databases or log-files).

Anyway, that's the ZFS news for today. :-)
(go to article | 0 Comments)


(Wednesday June 02, at 00:20) Sean Reifschneider
Subject: cron+xargs: The Scheduler of the Stars
Keywords: Command-line, cron, NCLUG, Technical, Tricks, xargs

I'm working on replacing our BackupPC backup infrastructure (because BackupPC just takes too long), and one of the things I needed to do was schedule backup jobs. In BackupPC you can tell it to run 4 jobs in parallel, and whenever it wakes up if there are slots free and backups to run, it will start some more.

I wanted similar capabilities, but without writing my own scheduler; it's not rocket science, but it's still a complicated bit of code. Ideally, to improve on BackupPC, I'd like to have one job start as soon as another ends, rather than waiting for the next scheduler wake-up.

As I've mentioned before, xargs can manage running multiple jobs. You can specify how many to run in parallel, and it gets the list of arguments to run from stdin. So, what I came up with is a crontab which looks like this:

# Each crontab entry has to be on a single line; cron does not support
# backslash line continuation.
00 22 * * * echo 1.example.com 2.example.com [...] 15.example.com | xargs --max-args=1 --max-procs=4 /path/to/harness
00 09 * * * echo a.example.org b.example.org c.example.org | xargs --max-args=1 --max-procs=1 /path/to/harness

The first line starts at 10pm and runs the harness with the system name to back up as the argument. It runs it for 15 hosts, running 4 in parallel. The second cron entry starts at 9am and runs the 3 example.org backups one at a time (they are hosted off-site and there's no need to hit their network or ours harder than necessary).
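To see the slot behavior outside of cron, here is a small sketch with a stand-in for the harness. The "run_backups" function and the "harness-stub" name are made up for illustration; the real /path/to/harness is not shown in this post.

```shell
# Hypothetical wrapper: run a stand-in "harness" once per host,
# with at most $1 of them running at the same time.
run_backups() {
    procs=$1
    shift
    # -n 1 and -P are the short forms of --max-args=1 and --max-procs.
    # The sh -c body stands in for the harness: it sleeps, then reports.
    printf '%s\n' "$@" |
        xargs -n 1 -P "$procs" sh -c 'sleep 1; echo "backed up $1"' harness-stub
}

run_backups 2 a.example.org b.example.org c.example.org
```

With 2 slots and 3 hosts, the third job starts as soon as one of the first two finishes. The order of the output lines is not guaranteed.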

In the past I would manually add the cron entries for each host at specific times, but sometimes jobs would run long and load would go way up, or sometimes there were idle periods where nothing happened... This is definitely an improvement over that, with minimal additional coding.

Wherever possible: Avoid writing code.
(go to article | 0 Comments)


(Saturday May 29, at 12:31) Kevin Fenzi
Subject: Chromium
Keywords: browser, linux, reviews, web

I have been using midori full time as my browser for a while now. The latest release of the Chromium browser looked interesting, so I decided to run it for a few days and see how it worked. Read on for a review.


(read more | 1 Comment)

(Saturday May 15, at 17:47) Sean Reifschneider
Subject: Getting 95 percentile numbers out of rrdtool
Keywords: NCLUG, Networking, Technical

This morning I figured out how to get rrdtool to report the 95%ile utilization to stdout. It's kind of convoluted: you have to use the "graph" subcommand, but write the graph to /dev/null, and use "PRINT" instead of "GPRINT". For example:

eval `$RRDTOOL graph -f '' -s "$1" /dev/null \
   DEF:in="$2":in:AVERAGE \
   DEF:out="$2":out:AVERAGE \
   CDEF:inbits=in,8,* \
   CDEF:outbits=out,8,* \
   VDEF:95pct_in=inbits,95,PERCENT \
   VDEF:95pct_out=outbits,95,PERCENT \
   PRINT:95pct_in:"IN='%.2lf %Sb'" \
   PRINT:95pct_out:"OUT='%.2lf %Sb'"`

Where "$1" is the period start time (like "-1d" for showing the 95%ile of today), and "$2" is the .rrd file name. I do an "eval" to parse the output (making $IN and $OUT shell variables). The "-f ''" tells it not to write an image size string.

It may also be useful to change the last two lines to use a format something like "IN=%.2lf; IN_MAGNITUDE=%S" (to get something like IN=2.50 and IN_MAGNITUDE=M) or just "IN=%.0lf" (to get the full bits like IN=2500000).
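The "eval" step can be tried without rrdtool or an .rrd file. In this sketch a stand-in function prints the same kind of assignment lines that the PRINT directives would. The function name and the numbers are made up.

```shell
# Stand-in for the rrdtool command; the values are made up.
fake_rrdtool() {
    echo "IN='2.50 Mb'"
    echo "OUT='1.25 Mb'"
}

# eval runs the printed lines as shell code, which sets $IN and $OUT.
eval "$(fake_rrdtool)"

echo "95th percentile in:  $IN"
echo "95th percentile out: $OUT"
```

Since eval runs whatever the command prints, this only makes sense with output you control.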

Again, rrdtool proves to be amazingly flexible, given enough time to wrap your mind around it.
(go to article | 0 Comments)


(Sunday May 09, at 14:14) Kevin Fenzi
Subject: RHEL6 beta and EPEL6 news
Keywords: EPEL, RHEL, Reviews, Linux

Not too long ago, Red Hat released a public Beta of RHEL6 for folks to try out. I've been running it here in a VM since it was released. Read on for my thoughts on the Beta and also news about where EPEL stands in relation to RHEL6.


(read more | 2 Comments)

(Saturday May 01, at 00:26) Sean Reifschneider
Subject: Python syslog patch to log exceptions
Keywords: Patch, Python, Syslog

I've just completed a patch to the Python syslog module to add the method enable_exception_logging(). It sets up a sys.excepthook so that unhandled exceptions are logged to syslog. By default, it chains to the existing excepthook.

So, once this code gets accepted, you will be able to have exceptions logged by doing: "import syslog; syslog.enable_exception_logging()".

For Python software that runs from cron or init or Apache, it can be very useful to capture the exceptions in a persistent location.

I'd appreciate some reviews of the code; it's in Issue8214.
(go to article | 2 Comments)


(Tuesday April 27, at 05:20) Sean Reifschneider
Subject: Initial btrstress results.
Keywords: btrfs, File-system, NCLUG, Stresstest

Just an FYI follow-up on my previous post about having started running some stress testing of btrfs: it seems to have run into a bug in btrfs after around 8 days. I had done a couple of stops of btrstress, removals of the test data, and restarts. I had also enabled compression, and I'm wondering whether that was the cause of the problem. The issue seems to be a NULL pointer dereference (which has been reported to the btrfs mailing list).

I'll look at starting another run, once I've made sure nobody needs any information off the system as it is now.
(go to article | 0 Comments)


(Wednesday April 21, at 23:41) Sean Reifschneider
Subject: Analysis of a data loss event.
Keywords: RAID, Technical

Computer hardware is pretty reliable these days. However, even with good procedures and hardware in place, there is still the possibility of data-loss. As we found out on Monday night... Despite having a well documented and tested workflow, RAID data redundancy, monitoring, top-notch personnel, and operating in a very conservative manner, we had a data-loss event that impacted 6 of our virtual hosting customers.

This is, to the best of my recollection, the first major data loss event we've had related to our hosting, since we began the hosting service over 11 years ago. I wanted to document what happened, both to provide information to the clients that were impacted and also as a lesson to the other readers.

Read on if you are interested in all the gory details. In big-enterprise circles, this is called a Service Outage Analysis (SOA).
(read more | 1 Comment)


(Monday April 19, at 13:17) Sean Reifschneider
Subject: Tricks: Using xargs to feed multiple CPUs.
Keywords: Command-line, Tricks, Unix

xargs is a great command-line tool for parceling huge lists of files to not exceed command-line limits on length or numbers of arguments. However, it also has some arguments that cause it to manage running multiple, parallel jobs. Read on for how I used this to cut one job's execution time by 75%.
(read more | 0 Comments)


(Sunday April 18, at 23:57) Sean Reifschneider
Subject: Patch to Python syslog module to use sys.argv[0] for "ident".
Keywords: Python

I've created a patch for the Python syslog module which:

  • Makes openlog arguments keyword args.
  • Makes openlog ident argument optional.
  • If ident is not passed to openlog, basename(sys.argv[0]) is used.
  • The first call to syslog.syslog() calls openlog() with no options (if it hasn't previously been called).
  • Various related documentation changes.

The patch is in the issue tracker as Issue 8451. If anyone out there has the inclination to review it, I'd appreciate it.
(go to article | 0 Comments)


(Sunday April 18, at 13:09) Sean Reifschneider
Subject: btrstress Program Available
Keywords: btrfs, File-system, Stresstest

Several months ago I wrote a "zfsstress" program. This program emulates the use-case of our old backup servers, which would regularly cause the OpenSolaris systems we were running them on to reboot. As I've mentioned before, zfsstress has shown that the stable release of zfs-fuse is quite good.

Now that btrfs has the ability to delete snapshots, I decided to port zfsstress to btrfs, and the result is available at ftp://ftp.tummy.com/pub/tummy/zfsstress/ along-side the zfsstress program.

I've been running btrstress on a test Fedora 13 Beta system for the last week, and it's been working very well. I am getting "unlinked 1 orphans" in dmesg periodically, but I haven't been able to find anything saying what that is about. So far though, btrfs is looking pretty good.

One kind of curious thing is that I let btrstress fill up a 200GB partition, then I deleted the btrstress subvolume. The delete of 200GB of data returned in around 20 seconds, but df showed the file-system still full. It was cleaning up in the background, which is just great -- I don't have to wait for it before the command returns. It took around 10 minutes for the data to be completely removed.
(go to article | 0 Comments)


(Saturday April 17, at 00:09) Sean Reifschneider
Subject: Improving Deduplication Performance Under ZFS-FUSE
Keywords: Performance, Technical, ZFS

I've been running some tests with real, live data on the ZFS-FUSE devel branch that I previously mentioned I am testing.

My initial tests were performing rather poorly, in the area of 1MB/sec for 250GB. In other words, it took 3 days to rsync 250GB. However, a conveniently-timed thread on the ZFS-FUSE mailing list saved the day. The remainder of this message includes some tools and techniques for determining how big your ARC needs to be to get good performance.
(read more | 0 Comments)


(Sunday April 11, at 15:10) Sean Reifschneider
Subject: ZFS-FUSE Status: Testing with DeDup going well.
Keywords: Linux, Technical, ZFS, ZFS-FUSE

I've been continuing to do a variety of testing with ZFS-FUSE, concentrating largely on the development branches which implement deduplication. I'm using Emmanuel Anne's branch, the 94767eeb512704d673e301eb6c837ee108739bd4 branch. With a change to kernel parameters, this has been running without any problems. Continue reading if you would like more details.
(read more | 0 Comments)


(Tuesday March 30, at 00:37) Sean Reifschneider
Subject: What advice would you give yourself?
Keywords: Advice

For about the last 6 months I've been trying to decide what advice I would have wanted when I was 18 (for me, that was 1988). A recent XKCD combined with a "homework assignment" from my about-to-graduate nephew's English teacher finally combined forces to get me to sit down and write a letter from the past. What advice would you want to have heard as you were about to take another step into the "real world"? I mean besides "Don't Do It (tm)!".

Read my letter from the past here.
(go to article | 1 Comment)


(Sunday March 14, at 22:11) Sean Reifschneider
Subject: @reboot and other cron fun.
Keywords: cron, Technical

swarren and I were just chatting and he mentioned putting something into rc.local. I said that I tended to prefer using "@reboot" with cron over rc.local these days. Partly because I don't like rc.local, partly because it keeps most of my system maintenance stuff in one place on my laptop. Stephen hadn't heard about @reboot though...

The cron daemon supports several "@ nicknames" for use instead of the normal set of time values: @reboot, @hourly, @daily, @weekly, @monthly, and @yearly (or @annually).

I use this in my personal crontab for a job I want run once at boot time:

   @reboot ~/bin/archivevimswap >/dev/null 2>&1

This is a small shell script that moves my ~/.vim-tmp out of the way and creates a new ~/.vim-tmp. I have vim configured to put all the tmp files in there, so that I don't end up with them littered all over the disc.
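The script itself isn't shown here, so the following is only a guess at what such a script might look like. The function name, the timestamped archive name, and the optional directory argument are all assumptions:

```shell
# Hypothetical reconstruction; the real ~/bin/archivevimswap is not shown.
# Usage: archive_vim_tmp [DIR]   (DIR defaults to ~/.vim-tmp)
archive_vim_tmp() {
    dir=${1:-$HOME/.vim-tmp}

    if [ -d "$dir" ]; then
        # Move the old directory aside under a timestamped name.
        mv -- "$dir" "$dir.$(date +%Y%m%d-%H%M%S)" || return 1
    fi

    # Start over with an empty directory.
    mkdir -p -- "$dir"
}
```

Keeping the old directory around, rather than deleting it, means swap files from a crashed session can still be recovered later.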

Just to be canonical, I'll mention that you can also use lists, ranges, and steps, so "*/5 * * * *" is every 5 minutes, "0 9-17 * * *" is the top of the hour from 9am to 5pm, and "15,20 * * * *" runs at 15 and 20 minutes past the hour.

Lately though, I've been thinking about what a next-generation cron would look like. It would be nice to say "I have these 10 jobs that need to be run between midnight and 4am, but I only want 3 of them running at once," rather than having to try to stagger them. I could imagine a use for also having it "kill -STOP" jobs if the load goes above a certain value, or "kill -CONT" when it drops. Or even coordination among machines (I have 10 machines, they all need to run CPU-intensive jobs after midnight, but I don't want to cause a spike in power consumption or drop in responsiveness among all of them).

So many possibilities...
(go to article | 1 Comment)


(Tuesday March 02, at 01:04) Sean Reifschneider
Subject: mkpkg: Helper to create setup.py for your projects.
Keywords: Packaging, Python

At the PyCon sprints I mostly worked on my write-up for the conference networking, and other administratrivia related to PyCon 2010 and getting ready for 2011. However, I did achieve one thing, and that's "mkpkg".

During the Language Summit we were speaking about packaging, and Guido said that he usually doesn't create packages. And I felt his pain, because I usually put it off as long as possible too. But from that I decided that I wanted to build a helper to make setting up the package files a no-brainer.

A couple of days into the sprint, I had something that was a good start. Continue reading for more details.
(read more | 2 Comments)