Last weekend Paul Hummer and I had a sprint on building packages of Python software from PyPI. Paul made good progress on the patch for building Debian packages, which is sorely out of date. I refactored my existing build code to build untrusted packages more safely. Read on for more about it.

Thanks to Paul I have been publishing this code via a Bazaar branch up to Launchpad. Feel free to check it out via “bzr checkout lp:~jafo/+junk/pypi-builderator”.

One of the things I had planned was to use a chroot, but I was worried about the time it would take to set up a chroot for the build environment. Paul had pointed me at the Debian pbuilder (in addition to my previous look at the Fedora Mock), and they mentioned using LVM snapshots for the chroots. What a great idea.

So I made a small script to create a chroot out of the system into an LVM logical volume. Then I modified my build script to make a uniquely-named snapshot of this file-system and mount it. For example:

lvcreate -L 5G -n pypibuild-centos-5-32-001de0a34d35 \
      -s /dev/pypibuildvg/centos-5-32-chroot
mkdir /pypi/builderator/build/pypibuild-centos-5-32-001de0a34d35
mount /dev/pypibuildvg/pypibuild-centos-5-32-001de0a34d35 \
      /pypi/builderator/build/pypibuild-centos-5-32-001de0a34d35

Of course, the source file-system for this needs to be unmounted before doing the snapshot.

With this, it takes a fraction of a second to create a new chroot, and just as little to get rid of it. In comparison, creating the chroot file-system initially takes around a minute (including the mke2fs, but that is a small fraction of the time).
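
As a rough sketch of how the snapshot step could be driven from Python: this is not the actual builderator code, and the function names and the "label" parameter are made up for illustration. It builds the same three commands shown above with a random 12-character suffix, passing the size with lvcreate's -L option, and keeps command construction separate from execution so the former can be checked without root:

```python
import subprocess
import uuid


def snapshot_commands(vg, origin_lv, label, build_root, size="5G"):
    """Return (mountpoint, commands) for a uniquely named snapshot.

    Hypothetical helper: nothing is executed here.
    """
    # A random suffix keeps concurrent builds from colliding on a name.
    name = "pypibuild-%s-%s" % (label, uuid.uuid4().hex[:12])
    mountpoint = "%s/%s" % (build_root, name)
    commands = [
        ["lvcreate", "-L", size, "-n", name, "-s",
         "/dev/%s/%s" % (vg, origin_lv)],
        ["mkdir", mountpoint],
        ["mount", "/dev/%s/%s" % (vg, name), mountpoint],
    ]
    return mountpoint, commands


def create_snapshot(vg, origin_lv, label, build_root):
    """Run the commands in order (needs root); stops at the first failure."""
    mountpoint, commands = snapshot_commands(vg, origin_lv, label, build_root)
    for command in commands:
        subprocess.check_call(command)
    return mountpoint
```

Calling create_snapshot("pypibuildvg", "centos-5-32-chroot", "centos-5-32", "/pypi/builderator/build") would then correspond to the example above, with a different suffix each time.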

The other concern I have is making sure that anything started in the chroot gets killed off at the end of the build, in case any processes accidentally (or maliciously) leave something around that they shouldn't.

I boned up on Linux process groups and sessions, which turned out to be just what I needed. So, I start the build in a new process session with code like:

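import os
import sys

def builderChild():
   # Become the leader of a new session (and of a new process group).
   os.setsid()
   # getsid() needs a PID argument; 0 means "the calling process".
   if os.getpid() != os.getsid(0):
      raise ValueError('Expected session ID to be %s' % os.getpid())
   [DO THE BUILD HERE]

childPid = os.fork()
if childPid == 0:
   builderChild()
   sys.exit(1)
[WAIT FOR OR KILL THE CHILD]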

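The fork() combined with the setsid() is what puts the builder process in its own session. Because setsid() also makes the child the leader of a new process group, the group ID equals childPid, so you can use “os.killpg(childPid, signal.SIGTERM)” to send a TERM signal to the child and all its subprocesses (at least those that have not moved themselves into another process group).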
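
Here is one way the "kill the child" step might look, as a sketch only. The helper names and the grace period are my own assumptions, and it is written for Python 3. It sends TERM to the group, reaps the session leader, and falls back to KILL if the group has not emptied out in time:

```python
import os
import signal
import time


def group_alive(pgid):
    """True if any process (including an unreaped zombie) is in the group."""
    try:
        os.killpg(pgid, 0)  # signal 0 only probes, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # something is there, we just may not signal it
    return True


def stop_build(child_pid, grace=5.0):
    """TERM the build's process group; return True if the grace period expired.

    Hypothetical helper: child_pid must be our own child and the leader
    of its process group (which setsid() in the child guarantees).
    """
    os.killpg(child_pid, signal.SIGTERM)
    reaped = False
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        if not reaped:
            pid, _status = os.waitpid(child_pid, os.WNOHANG)
            reaped = (pid == child_pid)
        if reaped and not group_alive(child_pid):
            return False
        time.sleep(0.05)
    try:
        os.killpg(child_pid, signal.SIGKILL)  # no more Mr. Nice Guy
    except ProcessLookupError:
        pass  # the group emptied out right at the deadline
    if not reaped:
        os.waitpid(child_pid, 0)
    return True
```

Reaping the leader matters because a zombie still counts as a member of its process group, so the probe would otherwise keep reporting the group as alive.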

I can also check whether any processes from the build environment are still running by looking at the output of “ps ax --format pgrp=” and seeing whether “childPid” appears in it.
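
A small sketch of that check, with function names of my own invention and assuming Python 3.7 or later for the text=True argument. It parses the ps output into integers instead of searching the raw text, so that a group ID of 12 is not mistaken for a match inside 312:

```python
import subprocess


def pgids_from_ps(output):
    """Turn the output of `ps ax --format pgrp=` into a set of group IDs."""
    return {int(field) for field in output.split()}


def group_has_processes(pgid):
    """True if ps lists at least one process in process group pgid."""
    output = subprocess.check_output(
        ["ps", "ax", "--format", "pgrp="], text=True)
    return pgid in pgids_from_ps(output)
```

After the build, group_has_processes(childPid) returning True would mean something from the build session is still hanging around.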

So far this has worked quite well, though admittedly I haven't done anything to try to break it. It seems like it's doing all the right stuff to make a fairly robust environment for building packages.

One slightly comic event happened when I added code to finish cleaning up the chroot. As part of the umount I added an “fuser -m -k -9” on the mount-point. But I was also running out of disc space to create the LVM snapshots, so eventually a snapshot would fail to mount. The cleanup would then run “fuser -m -k -9” on a mount-point that was just a plain directory on the root file-system, which targets every process using the root file-system. And my shell and SSH would get killed.

Once I tracked it down, I added some code to check that the directory is not a part of the root file-system:

return os.stat(mountpoint).st_dev != os.stat('/').st_dev
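
To show how that check might gate the cleanup, here is a sketch. The function names are hypothetical and this is not the exact code from the branch. The guard raises before fuser is ever run if the mount-point turns out to live on the root file-system:

```python
import os
import subprocess


def on_separate_filesystem(mountpoint):
    """True if mountpoint is on a different device than '/'."""
    return os.stat(mountpoint).st_dev != os.stat('/').st_dev


def check_safe_to_clean(mountpoint):
    """Raise instead of letting fuser loose on the root file-system."""
    if not on_separate_filesystem(mountpoint):
        raise RuntimeError(
            '%s is on the root file-system, refusing to clean up' % mountpoint)


def kill_users_and_unmount(mountpoint):
    """Kill anything using the snapshot, then unmount it (needs root)."""
    check_safe_to_clean(mountpoint)
    # fuser exits non-zero when it finds nothing to kill, so don't check it.
    subprocess.call(['fuser', '-m', '-k', '-9', mountpoint])
    subprocess.check_call(['umount', mountpoint])
```

With this in place, a failed mount results in an exception during cleanup rather than a dead SSH session.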