Sean Reifschneider's Journal Recent Entries
Below is a summary of the most recent journal entries by this user. A full index of all entries is also available. Also available as: RSS
Thursday August 26, at 16:51
Subject: Getting the program name in scripts
Keywords:
bash, NCLUG, Technical
In the (distant) past I've used "`basename $0`" to get the name of the
currently running script.
However, several years ago I learned about "${0##*/}", and have
switched over to using that. The benefit is that you don't have to fork a
new process, and it's also very concise, but admittedly more arcane than
"basename".
The way I remember it is that you can do "${var%glob}" "${var%%glob}"
"${var#glob}" "${var##glob}", where they strip either at the beginning or
the end of the string. The single version is a non-greedy match (matches
the shortest match) and the double matches greedily. I remember which
one is front versus end by thinking that "#" is like a comment, and
comments often come at the beginning of the line. :-) So "${0##*/}"
means "strip everything before the last / of $0".
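As a quick illustration of the four forms described above, here is a minimal sketch. The variable name "path" and its value are made up purely for the example:

```shell
# Hypothetical example value; any path-like string works.
path="/usr/local/bin/backup.tar.gz"

echo "${path#*/}"    # shortest match stripped from the front: usr/local/bin/backup.tar.gz
echo "${path##*/}"   # longest match stripped from the front:  backup.tar.gz
echo "${path%.*}"    # shortest match stripped from the end:   /usr/local/bin/backup.tar
echo "${path%%.*}"   # longest match stripped from the end:    /usr/local/bin/backup

# The script-name idiom from above, with no fork of basename:
progname="${0##*/}"
echo "running as: $progname"
```

All four forms are plain POSIX parameter expansion, so this should behave the same in bash, dash, and ksh.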
(go to article | 1 Comment)
Tuesday August 24, at 03:47
Subject: A very useful tool: fping
Keywords:
NCLUG, Networking, Technical
Recently a friend was saying that one of his co-workers doesn't really
write scripts, and had asked my friend to write a script to ping a bunch of
hosts. The use was for inventory in preparation for a big move, to make
sure they weren't missing hosts or IPs that needed to be moved.
I immediately recommended "fping". Its default mode will read target
names from stdin and ping them, and it times out after a few tries,
displaying the results. It also pings them asynchronously, so pinging huge
sets of hosts goes pretty fast. In a test here, I ran it on 254
addresses in 35 seconds.
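A rough sketch of how a sweep like that could be fed to fping. The helper name "gen_hosts" and the 192.168.1 network are assumptions made up for illustration; the only fping behaviour relied on is the read-targets-from-stdin mode mentioned above:

```shell
# Print the 254 host addresses of a /24 network, one per line.
# "prefix" is the first three octets, e.g. 192.168.1 (a made-up example network).
gen_hosts() {
    prefix=$1
    i=1
    while [ "$i" -le 254 ]; do
        echo "$prefix.$i"
        i=$((i + 1))
    done
}

# Usage (needs fping installed and network access, so it is left commented out):
#   gen_hosts 192.168.1 | fping
```

The same pipe works with any list of names, for example a file of hostnames exported from an inventory spreadsheet.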
Read on for more details and examples of what fping can do.
(read more | 1 Comment)
Tuesday August 24, at 03:08
Subject: Avoiding your SSH daemon being killed during OOM.
Keywords:
Linux, NCLUG, Technical
I've often had a love/hate relationship with the OOM killer, only
without the love. The OOM killer always seems to look at the processes and
decide that "hey, this one process is big, but it's really active, so I'll
kill off some other processes that aren't being used right now."
Of course, that leads to the OOM killer terminating things like the
SSH daemon to save the multi-gigabyte process that's leaking memory and
showing no signs of slowing... In most of those cases, I'd prefer that
its decision were reversed.
Through my playing around with zfs-fuse (a process that can get to be
quite big, but that you almost never want to OOM kill), I've found that
there's a way to immunize processes against the OOM killer.
More recent kernels have an "oom_adj" file under /proc/$PID which you
can echo values between +15 and -17 into. If you run:
echo -17 >/proc/`cat /var/run/sshd.pid`/oom_adj
the "-17" value should get set for your SSH daemon process so that it
avoids being a candidate for the OOM killer. You can read more details
about this and other memory-handling issues in the excellent 2009 LWN
article "Taming the OOM killer".
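If you want this in a boot script rather than typed by hand, a small wrapper along these lines might do. The function name "set_oom_adj" and its third argument are my own invention; the third argument exists only so the function can be tried against a scratch directory without root. Note also that later kernels replaced oom_adj with oom_score_adj (range -1000 to 1000), so check which file your kernel provides:

```shell
# Write an oom_adj value for the process whose PID is stored in a pidfile.
# proc_root defaults to /proc; it is a parameter only so the function can be
# exercised against a scratch directory without root privileges.
set_oom_adj() {
    pidfile=$1
    value=${2:--17}
    proc_root=${3:-/proc}

    [ -r "$pidfile" ] || { echo "cannot read $pidfile" >&2; return 1; }
    pid=$(cat "$pidfile")
    target="$proc_root/$pid/oom_adj"
    [ -w "$target" ] || { echo "cannot write $target" >&2; return 1; }
    printf '%s\n' "$value" > "$target"
}

# Usage (as root):
#   set_oom_adj /var/run/sshd.pid
```

The adjustment applies only to the process that is running right now, so it has to be repeated whenever the daemon is restarted.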
(go to article | 0 Comments)
Thursday August 19, at 19:26
Subject: I avoid "rm *".
Keywords:
Linux, NCLUG, Process
I'm worried about what happens if a command from the history is
accidentally re-run -- whether by me or by someone else on the system. It
happens. So, instead of doing "rm *" I'll usually do "cd .." and then "rm
myfiles/*". And for things like "rm -rf mysql" if I'm re-initializing a
database or something, I'll often do "mv mysql mysql.old" and then "rm -rf
mysql.old". Repeating this command or running it on the wrong server
would hopefully never break things.
These are the things that I worry about as a sysadmin. :-)
(go to article | 0 Comments)
Saturday July 31, at 22:29
Subject: Claws-Mail vs. Thunderbird
Keywords:
Claws, Linux, NCLUG, Review, Thunderbird
Last month Thunderbird broke horribly for me... Whenever I would
try to start it, it would never show the main GUI, it would just grow and
grow until it reached 3GB of RAM usage and then die. My laptop is running
a 32-bit install, so 3GB is the max single process size. Not being able
to get anywhere in the GUI made it hard to do any diagnostics, and starting
in "safemode" wasn't helping either. I decided to give Claws-Mail a try.
Read on for my comparison between them.
(read more | 0 Comments)
Sunday June 20, at 16:08
Subject: Competing teams and HP BASIC.
Keywords:
History
Slashdot has a story about using competing
development teams to find the best solution. This reminded me of the
HP-BASIC or "Rocky Mountain Basic" project.
The story I heard, in the late '80s when I was working on testing
in the Loveland Instrument Division of HP on their port of RMB to Unix, was
that HP was looking to create a BASIC variant, and they set up two teams,
one in Colorado and one on the east coast. These variants were Rocky
Mountain Basic and East Coast Basic.
The developers were given some time to work on their visions, and then
presentations were set up to allow a choice between them to be made. Once
released, RMB was generally considered to be a pretty great dialect. I
programmed in RMB early on, while I was also learning Pascal, and found
that RMB was much less painful than the other BASICs I had dealt with, like
Microsoft's.
I also remember a meeting where our manager asked a bunch of us for
suggestions on what to call the port to HP-UX, and we were pretty much all
in favor of "RMB/UX" or "BASIC/UX". But our manager dropped the bomb "It
can't have a slash in it." I remember everyone being pretty annoyed at
that. But, in looking at the RMB Wikipedia page, it looks like the slash
won out in the long run.
(go to article | 0 Comments)
Sunday June 06, at 14:39
Subject: Native ZFS coming to Linux?
Keywords:
NCLUG, ZFS
A recent thread on the zfs-fuse mailing list has announced that the
long-awaited Lustre
project to make a native ZFS module for Linux has made good progress.
This was announced as being the future for Lustre probably a year ago, but
I haven't heard anything about it until this post on the list.
Things are still pretty early it sounds like -- zfs-fuse is likely to
be the best choice for probably the next 6 months at least, but it is a
significant step toward getting ZFS native under Linux.
The ZFS license is still CDDL, which means that it won't be included
in the kernel.org kernel; instead it'll be an out-of-kernel module like
DRBD (until recently) or Xen, etc...
This comes at the same time as the ZFS-FUSE 0.6.9 release, which
includes deduplication and many other great features. In my testing of
0.6.9b3, it's been working really great. I've been hammering on it with
both "zfsstress" and also running it on a test backup server, and it's been
running very solidly.
The deduplication has been working well, though you really do need a
lot of memory in the ARC cache if you want it to perform well. For this
system with 8x2TB drives, I figure I'll need to put at least 8, and
possibly 16GB in the ARC cache. I currently have 8GB RAM, and a 2GB ARC,
which is about as much as I can do in an 8GB system. The host will take up
to 32GB RAM though, so I have room to grow. My plan is to upgrade it to
8GB and push the ARC up to 8GB, then see how it works. I blew out the
original 800MB ARC with deduplication at around 900GB stored in the pool.
It looks like with compression plus deduplication I'm getting a 1.9:1
space savings. Not sure how this compares to the deduplication+compression
in BackupPC, but I'm expecting it to do much better simply because I can do
block-level changes (large files that just have small appends/updates to
them, like databases or log-files).
Anyway, that's the ZFS news for today. :-)
(go to article | 0 Comments)
Wednesday June 02, at 00:20
Subject: cron+xargs: The Scheduler of the Stars
Keywords:
Command-line, cron, NCLUG, Technical, Tricks, xargs
I'm working on replacing our BackupPC backup infrastructure (because
BackupPC just takes too long), and one of the things I needed to do was
schedule backup jobs. In BackupPC you can tell it to run 4 jobs in
parallel, and whenever it wakes up if there are slots free and backups to
run, it will start some more.
I wanted similar capabilities, but without writing my own scheduler;
it's not rocket science, but it's still a complicated bit of code.
Ideally, to improve on BackupPC, I'd like to have one job start as soon as
another ends, rather than waiting for the next scheduler wake-up.
As I've mentioned before, xargs can manage running multiple jobs. You
can specify how many to run in parallel, and it gets the list of arguments
to run from stdin. So, what I came up with is a crontab which looks like
this:
00 22 * * * echo 1.example.com 2.example.com [...] \
15.example.com | xargs --max-args=1 --max-procs=4 /path/to/harness
00 09 * * * echo a.example.org b.example.org c.example.org \
| xargs --max-args=1 --max-procs=1 /path/to/harness
The first line starts at 10pm and runs the harness with the system
name to back up as the argument. It runs it for 15 hosts, running 4 in
parallel. The second cron entry starts at 9am and runs the 3 example.org
backups one at a time (they are hosted off-site and there's no need to hit their
network or ours harder than necessary).
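The scheduling behaviour is easy to try outside of cron. In this sketch the harness is a throwaway stand-in that just sleeps and reports the host name it was given; the host names are made up, and -n and -P are the short forms of --max-args and --max-procs:

```shell
# Stand-in for /path/to/harness: a hypothetical script that "backs up" one
# host. Here it only sleeps briefly and reports the host name it was given.
harness=$(mktemp)
cat > "$harness" <<'EOF'
sleep 1
echo "backed up $1"
EOF

# Five hosts, at most two harness processes at a time; xargs starts the next
# one as soon as a slot frees up. Output order depends on timing, hence sort.
results=$(echo h1.example.com h2.example.com h3.example.com h4.example.com h5.example.com \
    | xargs -n 1 -P 2 sh "$harness" | sort)
echo "$results"
rm -f "$harness"
```

With a one-second job and two slots, the five hosts should finish in roughly three seconds rather than five.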
In the past I would manually add the cron entries for each host at
specific times, but sometimes jobs would run long and load would go way up,
or sometimes there were idle periods where nothing happened... This is
definitely an improvement over that, with minimal additional coding.
Wherever possible: Avoid writing code.
(go to article | 0 Comments)
Saturday May 15, at 17:47
Subject: Getting 95 percentile numbers out of rrdtool
Keywords:
NCLUG, Networking, Technical
This morning I figured out how to get rrdtool to report the 95%ile
utilization to stdout. It's kind of convoluted how you do it: you have to
use the "graph" subcommand, but write the graph to /dev/null, and use
"PRINT" instead of "GPRINT". For example:
eval `$RRDTOOL graph -f '' -s "$1" /dev/null \
    DEF:in="$2":in:AVERAGE \
    DEF:out="$2":out:AVERAGE \
    CDEF:inbits=in,8,* \
    CDEF:outbits=out,8,* \
    VDEF:95pct_in=inbits,95,PERCENT \
    VDEF:95pct_out=outbits,95,PERCENT \
    PRINT:95pct_in:"IN='%.2lf %Sb'" \
    PRINT:95pct_out:"OUT='%.2lf %Sb'"`
Where "$1" is the period start time (like "-1d" for showing the 95%ile of
today), and "$2" is the .rrd file name. I do an "eval" to parse the output
(making $IN and $OUT shell variables). The "-f ''" tells it not to write an
image size string.
It may also be useful to change the last two lines to use a format
something like "IN='%.2lf'; IN_MAGNITUDE=%S" (to get something like IN=2.50
and IN_MAGNITUDE=M) or just "IN=%.0lf" (to get the full bits like
IN=2500000).
Again, rrdtool proves to be amazingly flexible, given enough time to wrap
your mind around it.
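The eval step may be the least obvious part, so here it is in isolation. The function below is a stand-in, not rrdtool: it just emits two lines in the shape the PRINT formats above are written to produce, with made-up numbers:

```shell
# Stand-in for the rrdtool call: emits two lines in the same shape the PRINT
# directives above are formatted to produce. The numbers are made up.
fake_rrdtool_output() {
    echo "IN='2.50 Mb'"
    echo "OUT='1.25 Mb'"
}

# eval turns each NAME='value' line into a shell variable assignment.
eval "$(fake_rrdtool_output)"
echo "95th percentile in:  $IN"
echo "95th percentile out: $OUT"
```

Since eval runs whatever text it is handed, this pattern is only reasonable when the command output is fully under your control, as it is with a format string you wrote yourself.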
(go to article | 0 Comments)
Saturday May 01, at 00:26
Subject: Python syslog patch to log exceptions
Keywords:
Patch, Python, Syslog
I've just completed a patch to the Python syslog module to add the
method enable_exception_logging(). It sets up a sys.excepthook so that
unhandled exceptions are logged to syslog. By default, it chains to the
existing excepthook.
So, once this code gets accepted, you will be able to have exceptions
logged by doing: "import syslog; syslog.enable_exception_logging()".
For Python software that runs from cron or init or Apache, it can be
very useful to capture the exceptions in a persistent location.
I'd appreciate some reviews of the code; it's in Issue8214.
(go to article | 2 Comments)
Tuesday April 27, at 05:20
Subject: Initial btrstress results.
Keywords:
btrfs, File-system, NCLUG, Stresstest
Just an FYI follow-up on my previous post about having started running
some stress testing of btrfs: it seems to have run into a bug in btrfs
after around 8 days. I had done a couple of stops of btrstress, removals
of the test data, and restarts. I had also enabled compression, and I'm
wondering if that was the cause of the problem. The issue seems to be a
NULL pointer dereference (which has been reported to the btrfs mailing
list).
I'll look at starting another run, once I've made sure nobody needs
any information off the system as it is now.
(go to article | 0 Comments)
Wednesday April 21, at 23:41
Subject: Analysis of a data loss event.
Keywords:
RAID, Technical
Computer hardware is pretty reliable these days. However, even with
good procedures and hardware in place, there is still the possibility of
data-loss. As we found out on Monday night... Despite having a well
documented and tested workflow, RAID data redundancy, monitoring,
top-notch personnel, and operating in a very conservative manner, we
had a data-loss event that impacted 6 of our virtual hosting customers.
This is, to the best of my recollection, the first major data loss
event we've had related to our hosting, since we began the hosting service
over 11 years ago. I wanted to document what happened, both to provide
information to the clients that were impacted and also as a lesson to the
other readers.
Read on if you are interested in all the gory details. In
big-enterprise circles, this is called a Service Outage Analysis (SOA).
(read more | 1 Comment)
Monday April 19, at 13:17
Subject: Tricks: Using xargs to feed multiple CPUs.
Keywords:
Command-line, Tricks, Unix
xargs is a great command-line tool for parceling huge lists of files
to not exceed command-line limits on length or numbers of arguments.
However, it also has some arguments that cause it to manage running
multiple, parallel jobs. Read on for how I used this to cut the execution
time of one of my jobs by 75%.
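The parceling behaviour described above can be seen with a toy list. The file names are made up, and echo stands in for whatever command would really process them:

```shell
# Seven hypothetical file names, handed to echo at most three at a time.
batches=$(printf '%s\n' f1 f2 f3 f4 f5 f6 f7 | xargs -n 3 echo)
echo "$batches"
# f1 f2 f3
# f4 f5 f6
# f7
```

Each output line is one invocation of the command. Adding "-P N" lets up to N of those invocations run at the same time, which is where the multi-CPU speedup comes from.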
(read more | 0 Comments)
Sunday April 18, at 23:57
Subject: Patch to Python syslog module to use sys.argv[0] for "ident".
Keywords:
Python
I've created a patch for the Python syslog module which:
- Makes openlog arguments keyword args.
- Makes openlog ident argument optional.
- If ident is not passed to openlog, basename(sys.argv[0]) is used.
- The first call to syslog.syslog() calls openlog() with no options
  (if it hasn't previously been called).
- Various related documentation changes.
(go to article | 0 Comments)
Sunday April 18, at 13:09
Subject: btrstress Program Available
Keywords:
btrfs, File-system, Stresstest
Several months ago I wrote a "zfsstress" program. This program
emulates the use-case of our old backup servers, which would regularly
cause the OpenSolaris systems we were running them on to reboot. As I've
mentioned before, zfsstress has shown that the stable release of zfs-fuse
is quite good.
Now that btrfs has the ability to delete snapshots, I decided to port
zfsstress to btrfs, and the result is available at
ftp://ftp.tummy.com/pub/tummy/zfsstress/
along-side the zfsstress program.
I've been running btrstress on a test Fedora 13 Beta system for the
last week, and it's been working very well. I am getting "unlinked 1
orphans" in dmesg periodically, but I haven't been able to find anything saying
what that is about. So far though, btrfs is looking pretty good.
One kind of curious thing is that I let btrstress fill up a 200GB
partition, then I deleted the btrstress subvolume. The delete of 200GB of
data returned in around 20 seconds, but df showed the file-system still
full. It was cleaning up in the background, which is just great -- I don't
have to wait for it before the command returns. It took around 10 minutes
for the data to be completely removed.
(go to article | 0 Comments)
Saturday April 17, at 00:09
Subject: Improving Deduplication Performance Under ZFS-FUSE
Keywords:
Performance, Technical, ZFS
I've been running some tests with real, live data on the ZFS-FUSE
devel branch that I previously mentioned I am testing.
My initial tests were performing rather poorly, in the area of 1MB/sec
for 250GB. In other words, it took 3 days to rsync 250GB. However, a
conveniently-timed thread on the ZFS-FUSE mailing list saved the day.
The remainder of this message includes some tools and techniques for
determining how big your ARC needs to be to get good performance.
(read more | 0 Comments)
Sunday April 11, at 15:10
Subject: ZFS-FUSE Status: Testing with DeDup going well.
Keywords:
Linux, Technical, ZFS, ZFS-FUSE
I've been continuing to do a variety of testing with ZFS-FUSE,
concentrating largely on the development branches which implement
deduplication. I'm using Emmanuel Anne's branch, the
94767eeb512704d673e301eb6c837ee108739bd4 branch. With a change to kernel
parameters, this has been running without any problems. Continue reading
if you would like more details.
(read more | 0 Comments)
Tuesday March 30, at 00:37
Subject: What advice would you give yourself?
Keywords:
Advice
For about the last 6 months I've been trying to decide what
advice I would have wanted when I was 18 (for me, that was 1988). A
recent XKCD combined with a
"homework assignment" from my about-to-graduate nephew's English teacher
finally combined forces to get me to sit down and write a letter from the
past. What advice would you want to have heard as you were about to take
another step into the "real world"? I mean besides "Don't Do It (tm)!".
Read my letter from the past here.
(go to article | 1 Comment)
Sunday March 14, at 22:11
Subject: @reboot and other cron fun.
Keywords:
cron, Technical
swarren and I were just chatting and he mentioned putting something
into rc.local. I said that I tended to prefer using "@reboot with cron"
over rc.local these days. Partly because I don't like rc.local, partly
because it keeps most of my system maintenance stuff in one place on my
laptop. Stephen hadn't heard about @reboot though...
The cron daemon supports several "@ nicknames" for use instead of the
normal set of time values: @reboot, and @hourly (daily, weekly, monthly,
yearly/annually).
I use this in my personal crontab for a job I want run once at boot
time:
@reboot ~/bin/archivevimswap >/dev/null 2>&1
This is a small shell script that moves my ~/.vim-tmp out of the way and
creates a new ~/.vim-tmp. I have vim configured to put all the tmp files in
there, so that I don't end up with them littered all over the disc.
Just to be canonical, I'll mention that you can also use lists, ranges, and
steps, so "*/5 * * * *" is every 5 minutes, "0 9-17 * * *" is the top of
the hour from 9am to 5pm, and "15,20 * * * *" runs at 15 and 20 minutes
past the hour.
Lately though, I've been thinking about what a next-generation cron would
look like. It would be nice to say "I have these 10 jobs that need to be
run between midnight and 4am, but I only want 3 of them running at once,"
rather than having to try to stagger them. I could imagine a use for also
having it "kill -STOP" jobs if the load goes above a certain value, or
"kill -CONT" when it drops. Or even coordination among machines (I have 10
machines, they all need to run CPU-intensive jobs after midnight, but I
don't want to cause a spike in power consumption or drop in responsiveness
among all of them). So many possibilities...
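Going back to the @reboot job above: the script itself isn't shown, but something with the described behaviour could look roughly like this. This is a guess at the shape, not the actual archivevimswap; the function name and the timestamp suffix are assumptions:

```shell
# Hypothetical reconstruction, not the author's actual script: move a vim swap
# directory aside under a timestamped name and create a fresh, empty one.
archive_vim_swap() {
    dir=${1:-$HOME/.vim-tmp}
    if [ -d "$dir" ]; then
        mv "$dir" "$dir.$(date +%Y%m%d-%H%M%S)" || return 1
    fi
    mkdir -p "$dir"
}

# Usage, e.g. from a script run by the @reboot entry:
#   archive_vim_swap
```

Keeping the old directory around rather than deleting it means any swap files left by a crash can still be recovered later.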
(go to article | 1 Comment)
Tuesday March 02, at 01:04
Subject: mkpkg: Helper to create setup.py for your projects.
Keywords:
Packaging, Python
At the PyCon sprints I mostly worked on my write-up for the conference
networking, and other administratrivia related to PyCon 2010 and getting
ready for 2011. However, I did achieve one thing, and that's "mkpkg".
During the Language Summit we were speaking about packaging, and Guido
said that he usually doesn't create packages. And I felt his pain, because
I usually put it off as long as possible too. But from that I decided that
I wanted to build a helper to make setting up the package files a
no-brainer.
A couple of days into the sprint, I had something that was a good
start. Continue reading for more details.
(read more | 2 Comments)
Friday February 26, at 03:31
Subject: PyCon 2010 Networking Wrap-up
Keywords:
Networking, PyCon, Python
I've completed my wrap-up of the networking
at PyCon 2010. I hope you enjoy reading about it as much as I enjoyed
working on it.
(go to article | 0 Comments)
Wednesday February 24, at 22:27
Subject: Python developers are the best!
Keywords:
Python
The hotel engineer comes over and is looking for someone in charge...
"What can I help you with?" I ask him. He explains that they have to set
up these rooms, which they've taken from us and given to another
conference, and there are power cords taped down all over.
I call out "Python Developers Activate! Form of a righteous swarm!"
5 minutes later the rooms are picked clean like the bones of some
(particularly tasty) carrion.
(go to article | 1 Comment)
Monday February 22, at 08:46
Subject: The story behind the 4.2.2.2 DNS server.
Keywords:
DNS
I've hunted down some of the story behind the DNS server that runs at
4.2.2.2. I've taken some discussions I've found around the web, and some
responses I got on NANOG, into a story on the
history behind the popular 4.2.2.2 DNS server.
(go to article | 0 Comments)
Thursday February 04, at 06:11
Subject: Firefox Weave: Syncing Between Machines
Keywords:
Firefox, Plugin, Synchronization
Evelyn pointed out the Weave plugin for Firefox on Sunday. There's a
server that you can, allegedly, install on your own machine. I tried
that, but it's in pretty rough shape (40+ errors are all reported as a
generic "database failure" message, then I eventually got to the point
where it was just responding "https:///"). Or you can use a server
provided by the Mozilla foundation.
Everything is, apparently, encrypted for transit and storage, with a
key that you select. So there shouldn't be a security concern. You can
select what you want it to synchronize, including bookmarks, passwords,
preferences, history, and tabs.
One thing that kind of threw me is that the tabs show up under
"History -> Tabs from Other Computers".
Weave seems to work fine so far. You just install the weave plugin,
create an account, set the encryption password, and off you go. It's real
easy, man.
(go to article | 0 Comments)
Tuesday February 02, at 21:04
Subject: Django snippet for automating templates.
Keywords:
Django, Python
I just posted a Django snippet that (ab)uses a decorator to change
how you call templates in Django views. For example, it makes my view
code something like this:
####################################
@with_template('friends/index.html')
def friends(request, context, username):
    context['user'] = User.objects.get(username = username)
And the template:
{% extends "base.html" %}
{% block content %}
<h1>{{ user.username }}'s Friends</h1>
(go to article | 0 Comments)