tummy.com: we do linux

Recent Journal Comments

Below are the most recent comments on the journal entries on tummy.com's website. This page is also available as an RSS feed; simply feed this URL to your RSS news reader.

Date: Wednesday December 07, 2011 at 13:42
In Reply To: Report on 2011 Code Retreat by Sean Reifschneider
Subject: How I finished in 35 minutes...
Author: Sean Reifschneider

We didn't have any forced constraints like your sessions did, Chip. So that may have been part of it. I was doing some things myself to help keep it interesting though.

To answer your question, I don't recall that I've ever implemented a game of life before that day, though that day we had done 2 implementation attempts before the successful one. I had certainly known about the GoL, and played with it before, but if I did an implementation it must have been more than 20 years ago.

The first attempt was done in Ruby, and we spent around 5 minutes tracking down an issue that in the end was because one of the files was in a different directory than the others. In that attempt we were using a static array and we implemented the tests and the code for rules 1 through 3 of the GoL, and were just starting on the 4th. We were surprisingly close. This was basically one class storing both the game logic and the world.

The second attempt was in Python and we were going to try implementing it as a torus world. We didn't make a lot of progress in that one, but we did have quite a discussion about what the appropriate level of abstraction for the neighbors counting function should be. I mentioned that in my post above. I don't recall if we even implemented any of the rules of the GoL in that session, but I don't think we did. This attempt used a class for the game logic and another for the world, something that felt more natural than the first attempt.

In the third attempt we decided to try an infinite world, again in Python. Greg wanted to see Python, and I'm very familiar with it and was happy to show it off. I decided to use a dictionary keyed by the (x, y) position as a tuple, with a boolean value saying whether that cell was alive.

Another thing that felt strained in the previous attempts was the level of tests. For this attempt, I tried making tests for the API. So the first thing implemented was "test_Set()" which tested setting a cell at a position as being alive. Then we implemented that by creating the "Game" and "World" classes: the game initializer created a world, the world created the dictionary, and the set() function on it was just a simple dictionary call: "self.life_map[(x, y)] = is_alive".

The test for that succeeded, so we went on to the next API function with "test_Get()", which returns whether the cell at the specified position is alive. The implementation of that was initially a "return self.life_map[(x, y)]", but the tests were failing for cells that had never been set. So I added an "if (x, y) not in self.life_map: return False". In retrospect, that should have just been a "return self.life_map.get((x, y), False)". But the first attempt at it got it working, so we went on to the next test.
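The storage described above can be sketched roughly as follows. This is a reconstruction from the description, not the saved code: the attribute name "life_map" and the "set()"/"get()" calls come from the comment, everything else is assumed.

```python
class World:
    """Sparse, unbounded world: only cells that were ever set are stored."""

    def __init__(self):
        # Maps an (x, y) tuple to a boolean: True if the cell is alive.
        self.life_map = {}

    def set(self, x, y, is_alive):
        self.life_map[(x, y)] = is_alive

    def get(self, x, y):
        # dict.get() returns the default for cells that were never set,
        # which avoids the KeyError the first version ran into.
        return self.life_map.get((x, y), False)


world = World()
world.set(2, 3, True)
print(world.get(2, 3))    # a cell that was set alive
print(world.get(-50, 7))  # a cell that was never set
```

Because nothing is stored for untouched cells, negative or very large coordinates cost nothing, which is what makes the "infinite world" cheap here.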

At around this point I realized that in the tests we were duplicating world setup code, so I refactored that code to have a "setUp()" that created the world, modified the tests, and then ran them to make sure they worked.
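For readers who haven't used Python's unittest module, the "setUp()" refactoring might look something like the sketch below. The test names follow the comment; the small "World" stand-in and the assertions are assumptions made so the example runs on its own.

```python
import unittest


class World:
    # Minimal stand-in so the test case below runs on its own.
    def __init__(self):
        self.life_map = {}

    def set(self, x, y, is_alive):
        self.life_map[(x, y)] = is_alive

    def get(self, x, y):
        return self.life_map.get((x, y), False)


class WorldTests(unittest.TestCase):
    def setUp(self):
        # unittest calls setUp() before every test method, so each test
        # starts from a fresh, empty world.
        self.world = World()

    def test_Set(self):
        self.world.set(1, 1, True)
        self.assertEqual(self.world.life_map[(1, 1)], True)

    def test_Get(self):
        self.assertFalse(self.world.get(0, 0))  # never set
        self.world.set(0, 0, True)
        self.assertTrue(self.world.get(0, 0))


# Run the tests explicitly instead of via unittest.main(), which would
# exit the interpreter when it finishes.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(WorldTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```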

The next test was for the API function "count_neighbors()", so we did a bunch of tests, set a cell, did those tests again with the new values, set another cell, did those tests again with the new new values...

The implementation of that was fairly simple, using the pattern Scott had come up with in the initial attempt in Ruby of two for loops walking through offsets of -1, 0, and 1, then checking if that cell was alive and incrementing the counter if so. Before running the test I realized that offset x=0,y=0 needed to be skipped in the check for life, so I put a conditional in there.
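The two-loop pattern described above could be sketched like this. It is written as a standalone function over a plain dictionary rather than as a method of the "World" class, which is an assumption made to keep the example short.

```python
def count_neighbors(life_map, x, y):
    """Count live cells among the eight neighbors of (x, y)."""
    count = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the cell itself
            if life_map.get((x + dx, y + dy), False):
                count += 1
    return count


life_map = {(0, 0): True, (1, 0): True, (1, 1): True}
print(count_neighbors(life_map, 0, 0))  # neighbors of a live cell
print(count_neighbors(life_map, 0, 1))  # neighbors of a dead cell
```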

The test for that worked. The "world" class ended up being 16 lines of code, but two of those were compound "if X: Y". So, not a lot of code.

Now that we had the API working, we implemented a test for rule 1. Basically, set a cell, run the "next_generation()" method of the game, and check to see that the cell died. The code for this walked the x,y locations in the dictionary, and if that cell was alive it would check the number of neighbors and if it was less than 1 it would "set(x,y,False)". Test ran successfully, so off we went...

Next were the tests for rule 2, for which I created one test with 2 neighbors and one with 3 neighbors. We ran the tests before implementing the code, as we did with every attempt, and realized that this was already handled by the code we had written: if a cell didn't match any of the rules, it was propagated from one generation to the next. So we got rule 2 for free.

Next we added the test for rule 3 by creating a cell with 4 neighbors and then seeing that it died. The code for that was fairly simple, just kill the cell if it was alive and it had more than 3 neighbors. We ran the tests and now were seeing failures for tests that previously were working. So I looked at the new code and realized that where I intended to have added the new rule as an "elif", I had instead added it as an "if". Fixed that and now all my tests were passing again.

So finally we added a test for rule 4: if a cell is dead and it has exactly 3 neighbors, it comes alive. The implementation for this was added, and we ran the tests and they failed... I thought about it for a second and realized that we were only checking the live cells, since we were checking the cells in the dictionary. I realized that we needed to check the cells around the live cells too, so I used code similar to the "count_neighbors()" code described above, and ran the rules on all the neighbors of the live cells. I realized that this would check some cells multiple times, but my goal was the simplest, most obvious code; there were no performance constraints. With this code in place, the tests succeeded.
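Putting the four rules together, a generation step along these lines would behave as described. This is a sketch under assumptions, not the code from the session: it uses the standard Conway thresholds, builds a new dictionary rather than modifying the old one, and collects candidate cells in a set so that each one is checked only once (the version described above checked some cells more than once).

```python
def count_neighbors(life_map, x, y):
    return sum(
        1
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0) and life_map.get((x + dx, y + dy), False)
    )


def next_generation(life_map):
    """Return a new dict holding the next generation of live cells."""
    # Candidates: every live cell plus all of its neighbors, since a dead
    # cell can only come alive next to a live one.
    candidates = set()
    for (x, y), alive in life_map.items():
        if alive:
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    candidates.add((x + dx, y + dy))

    new_map = {}
    for (x, y) in candidates:
        alive = life_map.get((x, y), False)
        n = count_neighbors(life_map, x, y)
        if alive and n < 2:
            continue                  # rule 1: too few neighbors, dies
        elif alive and n > 3:
            continue                  # rule 3: too many neighbors, dies
        elif not alive and n == 3:
            new_map[(x, y)] = True    # rule 4: dead cell comes alive
        elif alive:
            new_map[(x, y)] = True    # rule 2: survives with 2 or 3
    return new_map


bar = {(0, 1): True, (1, 1): True, (2, 1): True}  # horizontal bar of 3
after_one = next_generation(bar)
after_two = next_generation(after_one)
print(sorted(after_one))
print(sorted(after_two))
```

The "elif" chain matters: with independent "if" statements a cell can match more than one branch, which is the kind of mistake described for rule 3 above.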

At this point we looked at each other and decided it passed all the rules tests, so it must be done. I said to Bill "Remember how you said there's no way to finish it?" He asked if we could show him a glider. "Yeah, what are the points that make up the glider?" The copy he had on his screen was moving so it was hard to tell, but he then suggested a bar of 3 cells that just oscillates...

So we wrote a test that set a bar, checked the world for the bar, generated and tested for the line (and that the cells that should have died did die) and then generated again and tested live+dead cells again. We agreed that it was working...

So, I plugged in a "main()" part of the code, set a bar of 3 cells live, and added a function to the Game class to print out part of the world. My first attempt at that used dynamic min and max for the x and y coordinates, so it would pretty much just print "XXX" or "X\nX\nX\n", not super exciting.

So I just hard-coded it to print the first 5 rows and columns. We ran 3 generations of that and it looked good. So then we tried to figure out the coordinates of the glider. Bill took a screen print of the animated GIF on the Wikipedia page and from that we were able to poke in the coordinates. We ran 4 generations of that, and sure enough we had a glider being printed out.
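A fixed-window printer of the kind described could look roughly like this. The function name "render", the "." used for dead cells, and the particular glider orientation are all assumptions; the coordinates are one standard phase of the glider, not necessarily the ones poked in that day.

```python
def render(life_map, size=5):
    """Return the top-left size x size window of the world as text."""
    rows = []
    for y in range(size):
        rows.append(
            "".join(
                "X" if life_map.get((x, y), False) else "."
                for x in range(size)
            )
        )
    return "\n".join(rows)


# One standard phase of the glider, placed in the top-left corner.
glider = {(x, y): True for (x, y) in [(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)]}
print(render(glider))
```

A fixed window keeps the frame of reference still, so movement between generations is visible; a window computed from the min and max live coordinates follows the pattern around and hides that movement.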

Around this point, "time" was called. Unlike previous attempts, I did end up saving off a tar of this for later analysis (which is how I was able to recreate the attempt above).

In the end, the "World" class was 16 lines, the "Game" class was 19 lines plus another 9 lines to print the world, and the tests were 82 lines (many of them just repetitions of "assertEquals" lines for various locations). There were another 25 lines in my "main" function, which did 3 generations of the "bar" and then 5 generations of the glider and printed them out.

In retrospect, I'm quite certain that having the tests allowed me to develop the code more quickly. There were several errors I made, as outlined above, but having the tests allowed us to very quickly determine exactly where the problem was introduced, and look at exactly that code to determine the problem.

Without tests I would have *HAD* to implement the display() function earlier, and basically do test code that would spit out the world and eyeball that it was doing the right thing. I also likely would have been running testing in my head as I wrote the code, which would have slowed it down. And when I ran into the problems, I would have had to have reviewed the bulk of the code rather than exactly the last set of code I wrote.

So, those are the details as I remember them.
(go to entry which this is a comment to)


Date: Wednesday December 07, 2011 at 10:20
In Reply To: Report on 2011 Code Retreat by Sean Reifschneider
Subject: Thanks
Author: Chip Camden

Thanks for your account. One of the other participants in your session tells me that you actually had a working implementation from one of your iterations. Can you describe that one further?
(go to entry which this is a comment to)

Date: Sunday December 04, 2011 at 12:53
In Reply To: Lucid hangs after "Begin: Running /scripts/init-bottom". by Sean Reifschneider
Subject: Rescue mode...
Author: Sean Reifschneider

The "rescue mode" I mentioned in the original post was either booting from the install media (in my case a PXE server on the network, but for most that's likely the optical media), or possibly one of the "rescue" options in the GRUB boot menu. You can get to GRUB by holding down the shift key at the beginning of the Linux boot, which will bring up that menu. It also often comes up on its own if the previous boot failed.

I would suspect that I used the boot media option, but I don't honestly remember.
(go to entry which this is a comment to)


Date: Sunday December 04, 2011 at 11:37
In Reply To: Lucid hangs after "Begin: Running /scripts/init-bottom". by Sean Reifschneider
Subject: Ubuntu hang on startup
Author: Aaron

How did you get to the shell? I seem to get a box that says something about devices connected, and it won't let me say OK.
(go to entry which this is a comment to)

Date: Tuesday November 29, 2011 at 12:16
In Reply To: "Word" Doc Authoring with pandoc by Sean Reifschneider
Subject: I have used RST...
Author: Sean Reifschneider

I have used RST in the past. It's a very similar idea and pandoc supports it as well, so the same approach applies. The reason I chose markdown for this is just that I've been doing more markdown recently, because github will serve up a README.markdown as a nice page. Oh, and stackoverflow and serverfault use it for markup as well, which I guess is where I first started using it.

So, I'd call them fairly equivalent, but in my world I just had more markdown happening recently.
(go to entry which this is a comment to)


Date: Tuesday November 29, 2011 at 08:49
In Reply To: "Word" Doc Authoring with pandoc by Sean Reifschneider
Subject: reStructuredText
Author: Davide Del Vento

Interesting. I wonder if you have used reStructuredText too, and which one you prefer.
(go to entry which this is a comment to)

Date: Friday November 18, 2011 at 09:34
In Reply To: Using FreeDOS CD for BIOS updates. by Sean Reifschneider
Subject: Thanks
Author: Jeff

Used this to upgrade the BIOS on a Dell R710 & a Dell R610. Worked great!
(go to entry which this is a comment to)

Date: Thursday November 17, 2011 at 01:47
In Reply To: Switching from hardware to software RAID. by Sean Reifschneider
Subject: backup your LVM config
Author: Bob Blanchett

From hard experience: recovering LVM2 on software RAID is hard without the LVM config files. Back them up, along with your partition tables, whenever you change your physical or LVM geometry. You won't regret it.
(go to entry which this is a comment to)

Date: Tuesday November 15, 2011 at 01:35
In Reply To: ineedpy2: Library to run newer Python from a system-installed Python. by Sean Reifschneider
Subject: That's a solution to a different problem.
Author: Sean Reifschneider

As I had mentioned, our systems with really old Python versions have another Python available, it's just a matter of determining which one is available and using it. For most systems we don't need 2.7, so it'd be a shame to have to maintain a custom built Python 2.7 just to give a consistent path, when the system python there will meet the needs of most programs we are running on it.

The issue really comes in when we have a system with Python 2.1 as the default, and we're ok with the already installed python 2.4 or 2.5, we just need to swap over to it from the stock Python, without having to have a different package for every type of system.
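The "determine which one is available" step could be sketched as below. This is not ineedpy2's code or API: the function name and the candidate list are hypothetical, and the sketch uses modern Python ("shutil.which" needs Python 3.3 or later) purely to illustrate the idea, so it would not run on the old interpreters in question.

```python
import shutil


def find_python(candidates=("python2.7", "python2.6", "python2.5", "python2.4")):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


# Prints a path if one of the candidates is installed, otherwise None.
print(find_python())
```

Once a path is found, a script could hand itself over to that interpreter with something like "os.execv(path, [path] + sys.argv)", guarded by a version check so the newer interpreter doesn't re-exec itself forever.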

We're in a somewhat different situation because we have a lot of systems, but we also have a lot of unique administrative domains. So using something like Puppet to have all the machines be the same isn't going to fly.
(go to entry which this is a comment to)


Date: Tuesday November 15, 2011 at 01:13
In Reply To: ineedpy2: Library to run newer Python from a system-installed Python. by Sean Reifschneider
Subject: My solution to the same problem
Author: Michael Dillon

My solution to this same problem was to write a build script that would build a portable Python 2.7.2 distro for Linux that will run on any new enough Linux distro or even on Solaris and FreeBSD.

https://github.com/wavetossed/pybuild

Currently the build script will run on Debian/Ubuntu, but the resulting tarball contains everything needed including private copies of shared libraries, so it will run on SUSE or Redhat distros without installing any binary libraries using the system tools.

Also, it includes a lot of 3rd party libraries that I happen to use, or might need in the near future. They also provide examples of how many other third party libraries could similarly be bundled along with dependent shared libraries.
(go to entry which this is a comment to)


Date: Sunday October 02, 2011 at 21:55
In Reply To: Proxmox VE versus VMWare ESXi by Sean Reifschneider
Subject: We've tried 1.9, seems fine.
Author: Sean Reifschneider

We recently upgraded to 1.9, and it's working well. Except that you can't live migrate from 1.8 hosts to 1.9 hosts, so we had to hard reboot those machines. Then there was another update a week or so later that would also require a hard reboot for migration. Unfortunate, but livable. Proxmox has been working extremely well for us.
(go to entry which this is a comment to)

Date: Sunday October 02, 2011 at 20:43
In Reply To: Proxmox VE versus VMWare ESXi by Sean Reifschneider
Subject: PVE 1.8
Author: IGnatius T Foobar

I've been running a production data center workload under PVE 1.8 for about six months now, and as you might expect it's even better this time around. 1.9 is available but I haven't upgraded yet. There are still some things that VMware can do which PVE can't, but PVE is rapidly closing the gap and I would wholeheartedly recommend it for production workloads.
(go to entry which this is a comment to)

Date: Sunday October 02, 2011 at 18:25
In Reply To: Introducing nanomon: Extremely light-weight monitoring. by Sean Reifschneider
Subject: Fixed.
Author: Sean Reifschneider

Ok, I've pushed up a new version that does not require the "with" statement in Python.
(go to entry which this is a comment to)

Date: Sunday October 02, 2011 at 15:10
In Reply To: Introducing nanomon: Extremely light-weight monitoring. by Sean Reifschneider
Subject: Python version
Author: Fred

On Python 2.4.3 (CentOS 5) I get this:

    nanomon status
      File "/root/bin/nanomon", line 128
        with open(config.statusfile, 'r') as fp:
        ^
    SyntaxError: invalid syntax

This is on Python 2.4.3. I attempted to rewrite "with open(config.statusfile, 'r') as fp:" as "fp = open(config.statusfile, 'r')" but got an error on the following "try:" statement. Not a Python programmer ;-) What version of Python does this target?
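For context: the "with" statement arrived in Python 2.5 (behind a "from __future__" import) and became standard in 2.6, so 2.4 cannot parse it. The usual replacement is "try"/"finally", sketched below. The function name "read_status" is hypothetical and this is not nanomon's actual code. Simply swapping the "with" line for an assignment tends to fail because the block that was indented under "with" is left indented with nothing to belong to.

```python
import os
import tempfile


def read_status(statusfile):
    """Read a file without the "with" statement (works on Python 2.4)."""
    fp = open(statusfile, 'r')
    try:
        return fp.read()
    finally:
        # Runs whether or not read() raised, like leaving a "with" block.
        fp.close()


# Demonstration with a throwaway file standing in for the status file.
handle, path = tempfile.mkstemp()
os.close(handle)
out = open(path, 'w')
try:
    out.write("OK\n")
finally:
    out.close()
status = read_status(path)
os.remove(path)
print(status)
```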
(go to entry which this is a comment to)

Date: Tuesday September 27, 2011 at 12:54
In Reply To: Introducing nanomon: Extremely light-weight monitoring. by Sean Reifschneider
Subject: Listing under the GPL.
Author: Sean Reifschneider

Thanks for the reply, I've listed it under the GPL license.
(go to entry which this is a comment to)

Date: Monday September 26, 2011 at 22:50
In Reply To: Introducing nanomon: Extremely light-weight monitoring. by Sean Reifschneider
Subject: License...
Author: Me

Will you be open sourcing it? The copyright says all rights reserved.
(go to entry which this is a comment to)

Date: Sunday July 31, 2011 at 21:48
In Reply To: Getting the expiration time of an SMTP certificate. by Sean Reifschneider
Subject: HEAD vs GET
Author: Daniel

Using "HEAD" vs "GET" should generate less load/bandwidth
(go to entry which this is a comment to)

Date: Monday July 25, 2011 at 08:13
In Reply To: The burden of data. by Sean Reifschneider
Subject: DFSMShsm
Author: Ken Whitesell

Ok, so the idea was sufficiently intriguing to get me to do some digging on my own. And sure enough, IBM does sell an HSM product for Linux, along with a Linux version of the Tivoli Storage Manager.

Unfortunately, the pricing fits into the standard IBM category of "If you have to ask, you can't afford it."

A brief search reveals some open source options - no idea as to the quality, maturity, or stability of any of them though. Still, if it's something you're seriously interested in, it may be worth some time researching.
(go to entry which this is a comment to)


Date: Monday July 25, 2011 at 07:47
In Reply To: The burden of data. by Sean Reifschneider
Subject: DFHSM
Author: Ken Whitesell

So it sounds to me like you're looking for something similar to IBM's DFHSM "Data Facility - Hierarchical Storage Manager" (? - I'm doing this off the top of my head)

So it's obviously been done before - It would be interesting to see something like this in the Linux / FreeBSD world.

Although, with a high-speed infrastructure (SANs, NASs, SCSI over IP, etc), it's probably less an issue of getting it built into the kernel than something used to manage a file system. I could see having a dedicated Linux system acting as the "Storage Manager" that would migrate data as necessary - with all data access being channeled through it.
(go to entry which this is a comment to)


Date: Monday July 04, 2011 at 10:14
In Reply To: A Review of Linux-based SIP Phones by Sean Reifschneider
Subject: gizmo
Author: arondir

sipphone was destroyed by the google kraken.
(go to entry which this is a comment to)

Date: Monday June 27, 2011 at 15:32
In Reply To: Using FreeDOS CD for BIOS updates. by Sean Reifschneider
Subject: Super recipe
Author: Zhi-Wei Lu

This is the simplest recipe for Linux geeks to make a BIOS upgrade bootable CD. I have wasted many hours elsewhere.
(go to entry which this is a comment to)

Date: Monday June 13, 2011 at 15:27
In Reply To: PyCon 2011 Networking Preliminary Information by Sean Reifschneider
Subject: Write-up for 2011's networking
Author: Sean Reifschneider

I had been saving this to reply once I had some time, but it's becoming clear that won't be happening. :-) So, I think I'm going to pass on putting up anything about the networking this year. Partly that's due to lack of time, partly because I wasn't really involved in the networking this year (I didn't attend PyCon because of a business emergency), and partly because the statistics I have are fairly close to what we saw last year. So the easy route is to say that it was like last year. :-)
(go to entry which this is a comment to)

Date: Monday June 13, 2011 at 13:56
In Reply To: Selecting input/output devices in PulseAudio by Sean Reifschneider
Subject: That volume control looks good.
Author: Sean Reifschneider

That gnome-volume-control looks very good as well. Sometimes the one I mentioned above acts weird, so I may try this too. Unfortunately, on my system, right-clicking the volume application brought up a mixer that was totally unhelpful for doing this, probably because I'm running KDE instead of Gnome; possibly this could be fixed by switching to another volume control application.
(go to entry which this is a comment to)

Date: Thursday June 02, 2011 at 22:17
In Reply To: Selecting input/output devices in PulseAudio by Sean Reifschneider
Subject: Gnome makes it easy
Author: Stephen Warren

Gnome includes a simple application for this: gnome-volume-control. You can access it by (left-) clicking on the speaker/volume icon in the notification area (typically top/right of the screen) and selecting "Sound Preferences".
(go to entry which this is a comment to)

Date: Thursday June 02, 2011 at 10:47
In Reply To: Manually checking a RAID array for consistency. by Sean Reifschneider
Subject: RAID-5 can detect it, not correct.
Author: Sean Reifschneider

My thinking was that RAID-5 could see that the data stripes didn't match the checksum, but it couldn't tell if one of the data stripes or the checksum were what had been corrupted.
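A toy model of XOR parity illustrates the "detect but not locate" point. This is a simplified sketch, not how the Linux md driver is written: the block contents and helper name are made up, and real arrays rotate parity across drives.

```python
from functools import reduce


def parity(blocks):
    """Byte-wise XOR of equal-length blocks (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]
stored_parity = parity(data)

# Corrupt one data block: the stripe no longer matches its parity.
bad_data = [b"AAAA", b"BxBB", b"CCCC"]
print(parity(bad_data) == stored_parity)

# Corrupt the parity instead: the same kind of mismatch appears.
bad_parity = bytes([stored_parity[0], stored_parity[1] ^ 0x3A]) + stored_parity[2:]
print(parity(data) == bad_parity)

# Rebuilding any one block from the others always yields a consistent
# stripe, so consistency alone cannot say which block was wrong.
for i in range(len(bad_data)):
    others = bad_data[:i] + bad_data[i + 1:]
    rebuilt = parity(others + [stored_parity])
    candidate = bad_data[:i] + [rebuilt] + bad_data[i + 1:]
    print(i, parity(candidate) == stored_parity)
```

With a single parity block there is one equation per stripe, so any one block can be adjusted to satisfy it. A second independent check (as in RAID-6) or a per-block checksum (as in ZFS) is what supplies the extra information needed to point at the bad block.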

I guess that as long as it's not the checksum that was corrupted you could build the checksum with each of the combinations of a simulated single drive failure, and see if any of those matched the checksum, and if you found a match use that combination as the solution.

With RAID-6 you can do that even if one of the checksum stripes gets corrupted.

ZFS stores checksums with each of the data stripes so it can tell which stripe has failed and rebuild that particular block as if an I/O failure had happened.

You are aware that software RAID-1 on Linux will normally show an increasing mismatch_count due to the way that the buffer cache works? This is also something Stephen found. The situation arises when you write a file and then delete it: the blocks can get invalidated in the buffer cache when one drive has written the buffer but the other has not. So running a verify on software RAID-1 only serves to make sure that neither drive is having read errors.
(go to entry which this is a comment to)