Wednesday June 02, at 00:20
Subject: cron+xargs: The Scheduler of the Stars
Keywords: Command-line, cron, NCLUG, Technical, Tricks, xargs
Posted by: Sean Reifschneider
Related entries: Tricks: Using xargs to feed multiple CPUs. by Sean Reifschneider, Monday April 19, at 13:17
I'm working on replacing our BackupPC backup infrastructure (because
BackupPC just takes too long), and one of the things I needed to do was
schedule backup jobs. In BackupPC you can tell it to run 4 jobs in
parallel, and whenever it wakes up, if there are free slots and backups
to run, it starts some more.
I wanted similar capabilities, but without writing my own scheduler;
it's not rocket science, but it's still a complicated bit of code.
Ideally, to improve on BackupPC, I'd like to have one job start as soon as
another ends, rather than waiting for the next scheduler wake-up.
As I've mentioned before, xargs can manage running multiple jobs. You
can specify how many to run in parallel, and it reads the list of
arguments to hand to the command from stdin. So, what I came up with is a
crontab which looks like this:
00 22 * * * echo 1.example.com 2.example.com [...] 15.example.com | xargs --max-args=1 --max-procs=4 /path/to/harness
00 09 * * * echo a.example.org b.example.org c.example.org | xargs --max-args=1 --max-procs=1 /path/to/harness
The first entry starts at 10pm and runs the harness once per host, with
the name of the system to back up as its argument. It covers 15 hosts,
running 4 in parallel. The second entry starts at 9am and runs the 3
example.org backups one at a time (they are hosted off-site, and there is
no need to hit their network or ours harder than necessary).
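If you want to watch the behavior before trusting it to cron, something
like the following rough sketch should do. The harness here is a stand-in
I made up for the test (it just sleeps and logs the host name), the
h1..h4 host names and the temp directory are placeholders, and -n and -P
are the short forms of --max-args and --max-procs.

```shell
#!/bin/sh
# Scratch area for a throwaway stand-in harness and its log (both hypothetical).
workdir=$(mktemp -d)
harness="$workdir/harness"
log="$workdir/run.log"

# The stand-in harness: pretend to back up the host given as $1.
cat > "$harness" <<EOF
#!/bin/sh
sleep 1
echo "done \$1" >> "$log"
EOF
chmod +x "$harness"

# Same shape as the crontab entries: one host per invocation, 2 at a time.
# xargs starts the next host as soon as one of the running jobs exits.
echo h1.example.com h2.example.com h3.example.com h4.example.com \
    | xargs -n 1 -P 2 "$harness"

# Completion order can vary between runs, so sort before looking at it.
sort "$log"
```

With 4 hosts, 2 slots and a 1 second sleep, the whole run should take
about 2 seconds rather than 4, which is an easy way to confirm the jobs
really are overlapping.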
In the past I would manually add the cron entries for each host at
specific times, but sometimes jobs would run long and load would go way up,
or sometimes there were idle periods where nothing happened... This is
definitely an improvement over that, with minimal additional coding.
Wherever possible: Avoid writing code.