Sunday December 04, 2011 at 16:28
Subject: Report on 2011 Code Retreat
Keywords:
Programming, Python
Posted by: Sean Reifschneider
Yesterday Bill Tucker and Matt Rose organized a Code Retreat in Fort
Collins. It was "Global Day of Code
Retreat", and there were events happening all around the world. The
basic idea was that you paired up and for 45 minutes used best practices
(as if you had all the time in the world to do it) to work on a programming
problem. Then everyone deleted their code, stood up and discussed the previous
round. Then you paired up with someone new and started over on the same
problem.
It was a great event, extremely rewarding on many levels. Read on for
more of my thoughts on the event.
The problem we were working on was
Conway's Game
of Life. This was a good choice because it presented plenty of
opportunity for writing tests as part of the development process, was small
enough that you could make plenty of progress during the 45 minutes, and
had opportunities for variants that could keep it interesting (representing
the field as an array, allowing it to be infinitely sized, a torus
world)...
One of the nice things about the event was the exposure to various
languages and environments. I ended up doing most of them in Python, but
did start with an attempt in Ruby, and also made a vain attempt at
implementing it in Prolog. That was in vain because neither of us had used
Prolog at all recently.
I really enjoyed the opportunity to do a number of iterations using
test-driven development.. I've never really gotten comfortable with
building the testing scaffolding (remember, I'm not a programmer, I'm lucky
if I get to program 5% of the time). This was an opportunity to address
that, and I really did get comfortable with it.
Another thing I really liked was the "pair programming" aspect of it.
It's something I haven't really done extensively before (we mostly use
reviews rather than pair programming, which are very effective). I'm not
sure I'd promote pair programming as the default, but I could see it being
very beneficial in situations where the problem or solution aren't well
defined.
One interesting thing was that last week I had reviewed my personal
Wiki because of some vandalism, so I knew there was a recipe in there for
setting up unit tests in Python. That let me get going very quickly with
the testing, and prevented any spinning of wheels there. Having the right
references available is a huge productivity enhancement, that was one of
the big barriers in the attempt at Prolog -- we were struggling to just
express what we needed, despite knowing roughly "the Prolog way".
Another thing I really liked was the opportunity to just throw away
the code and start over from scratch. I've had experience with re-writing
code from scratch and making huge advancements because of that (one time I
had a day's worth of code go away when I had an argument with my version
control system, before backups for that day had run).
However, I did save the code of one attempt and I analyzed the process
we used on it last night and found that process extremely rewarding.
At one point during a later attempt I did have an urge to go back and
review that saved off previous attempt, but I was able to restrain myself.
I'm glad I didn't in retrospect, that would have been negatively impacted
my experience.
One of the things that was really reinforced for me was testing at the
right level. The one attempt that we were able to succeed on, the testing
consisted of testing the API ("set(x,y)", "get(x,y)",
"count_neighbors(x,y)") which allowed us to define the API. Then we added
tests for the 4 rules of the game of life, implementing each rule after
building the test.
One of the neat things about this was that it revealed that our
implementation of rule 1 also implemented rule 2. So when we added the
test and it succeeded we could say "oh, yeah, I guess this part of the rule
1 implementation causes rule 2 to pass as well". So building the test and
running it before we started coding it ended up saving the time of thinking
through the code or possibly implementing some code that was redundant.
The implementing of rule 3 introduced a bug which testing caught right
away and we were able to address. Then in our implementation of rule 4, we
realized that we had a bug in our algorithm, and we were able to address it
at that time.
This process of having the right tests, no more and no less, I'm
confident significantly increased the speed of development of that round.
A later round I did which Jesse, which was basically the same code but
implemented with a Python set() rather than dictionary, did not succeed and
I believe part of that was due to not having the right testing. In that
attempt we were just testing the game of life rules, not the API, but we
also were not building tests, running tests, and then coding, we were
running in different directions at various times for various reasons...
One thing I thought would be very interesting to try is having a
"backseat" position in the groups, allowing for more involvement of more
introductory-level programmers, allowing them to see how development goes
without having to necessarily be at the skill level where they can drive.
I know several people who are interested in learning programming right now
that I think would have been at the level they could have benefited from
watching, but they may not have been prepared to be more actively involved
in the process.
Another thing that was reinforced for me is that a good solution you
are familiar with is better in many ways than a better solution you aren't
familiar with. In the "infinite world" implementation, I picked
dictionaries over sets initially because I was just more comfortable with
dictionaries. I know sets are fairly simple, but I did run into a couple
of road-blocks that I had to clear when doing the set() implementation that
I didn't with dictionaries.
I think that sets are the right data-type for the implementation I was
working on, but the best way to implement that would have been to implement
it with dictionaries, get all the tests and code running, and then change
the implementation to use sets. In that way I would have had basic code
working and known the issues were in the set implementation, rather than
have been debugging the basic code and the sets and
refactoring of another part of the code all at the same time.
That reminds me of a programming class I took back in high school
where we had to implement a maze solving program, but we couldn't use
recursion. I spent quite a while trying to implement it without recursion,
and eventually I gave up and implemented it recursively (which really made
sense to me), and then converted the code to not be recursive.
Bill at one point fairly early on, in the break between attempts,
asked the question "what changes would you have to do to support an
infinitely sized world?" This was a wonderful question, because up until
that point I had been using a two dimensional array, and that made me think
about the problem differently. Because of that, for a later attempt I
tried a dictionary of alive cells, and that ended up being dramatically
simpler than the array of arrays. Without Bill asking that question, I
likely would have been stuck on the arrays.
In one of the groups we had this great discussion about testing. We
were going to return a list of neighbors of a cell. I was thinking we
would return the coordinates of the cell neighbors, Jen was thinking we
would return whether the cells were alive or not. This turned into a
lively discussion about what was the right level of abstraction for this
function, how that impacted testing, whether we should encapsulate more or
less logic in that function (less logic in the function allows for more
accurate pin-pointing with the testing), etc...
This was one instance I really wish we had more time for, because we
were basically at the tail end of the discussion when time ran out. And, I
had conceded the point in the interest of making progress, where the most
valuable thing would have been to have pursued that discussion to the best
possible solution.
There was some discussion of having the time be longer so you could
explore other things. For me, I think 45 minutes was about right. The
goal was that you wouldn't be able to complete it in that time, which I
think is a good target. Because if you get into the mentality that you can
finish it, then you really need to have more than enough time for everyone
to complete it. I say this because if you have just enough time to solve
it if you cut corners or you don't do anything inventive, then you probably
won't try those other things. And trying those other things is the whole
point.
So I'm firmly in the camp of "not so much time". Though I understand
that not every group programs at the same speed, so if you were only
getting 10% of the way through it's going to be a different experience than
if you're getting 80% of the way through.
There was also some discussion of keeping the same pairing, and in
retrospect I am very glad that that didn't happen. Part of the whole
reason for the Code Retreat, in my mind, is to get experience with
different pairings, so you can get new experiences of how people are
working on code. If we had kept the same pairings, I might not have seen
the windowed vim setup, the Eclipse development environment, or worked with
Ben on a wild-assed attempt to get Prolog working. ("Ok, I'm building a
Prolog environment now." :-).
Those are my thoughts about the code retreat. It was an extremely
educational and entertaining experience. I'd highly recommend it to anyone
who wants to program.
(Post Reply)
(Post Reply)
| Comment |
Author:
Chip Camden Subject: Thanks |
Thanks for your account. One of the other participants in your session tells me that you actually had a working implementation from one of your iterations. Can you describe that one further?
| Comment |
Author:
Sean Reifschneider Subject: How I finished in 35 minutes... |
We didn't have any forced constraints like your sessions did, Chip. So
that may have been part of it. I was doing some things myself to help keep
it interesting though.
To answer your question, I don't recall that I've ever implemented a
game of life before that day, though that day we had done 2 implementation
attempts before the successful one. I had certainly known about the GoL,
and played with it before, but if I did an implementation it must have been
more than 20 years ago.
The first attempt was done in Ruby, and we spent around 5 minutes
tracking down an issue that in the end was because one of the files was in
a different directory than the others. In that attempt we were using a
static array and we implemented the tests and the code for rules 1 through
3 of the GoL, and were just starting on the 4th. We were surprisingly
close. This was basically one class storing both the game logic and the
world.
The second attempt was in Python and we were going to try implementing
it as a torus world. We didn't make a lot of progress in that one, but we
did have quite a discussion about what the appropriate level of abstraction
for the neighbors counting function should be. I mentioned that in my post
above. I don't recall if we even implemented any of the rules of the GoL
in that session, but I don't think we did. This attempt used a class for
the game logic and another for the world, something that felt more natural
than the first attempt.
In this attempt we decided to try an infinite world, again in Python.
Greg wanted to see Python, and I'm very familiar with it and was happy to
show it off. I decided to use a dictionary using the x,y position as a
tuple and the value as a boolean whether that cell was alive.
Another thing that felt strained in the previous attempts was the
level of tests. For this attempt, I tried making tests for the API. So
the first thing implemented was "test_Set()" which tested setting a cell at
a position as being alive. Then we implemented that by creating the "Game"
and "World" classes, the game initializer created a world, and the world
created teh dictionary and the set() function on it was just a simple
dictionary call: "self.life_map[(x, y)] = is_alive".
The test for that succeeded. so we went on to the next API function
with "test_Get()", which returns if the cell at the specified position is
alive. The implmentation of that was initially a "return
self.life_map[(x,y])", but the tests were failing for cells that had never
been set. So I added a "if (x,y) not in self.life_map: return False".
However, that in retrospect should have just been a "return
self.life_map.get((x, y), False)". But, the first attempt at it got it
working so we went on to the next test.
At around this point I realized that in the tests we were duplicaing
world setup code so I refactored that code to have a "setUp()" that created
the world, and modified the tests, then ran them to make sure they worked.
This test was for the API function "count_neighbors()", so we did a
bunch of tests, set a cell, did those tests again with the new values, set
another cell, did those tests again with the new new values...
The implementation of that was fairly simple, using the pattern Scott
had come up with in the initial attempt in Ruby of two for loops walking
through offsets of -1, 0, and 1, then checking if that cell was alive and
incrementing the counter if so. Before running the test I realized that
offset x=0,y=0 needed to be skipped in the check for life, so I put a
conditional in there.
The test for that worked. The "world" class ended up being 16 lines
of code, but two of those were compound "if X: Y". So, not a lot of code.
Now that we had the API working, we implemented a test for rule 1.
Basically, set a cell, run the "next_generation()" metod of the game, and
check to see that the cell died. The code for this walked the dictionary
of x,y locations in the dictionary and if that cell was alive it wouldcheck
the number of neighbors and if it was less than 1 it would
"set(x,y,False)". Test ran successfully, so off we went...
Next was tests for Rule 2, which I created one for 2 neighbors and one
for 3 neighbors. We ran the test before implementing the code, as we did
with every attempt, and realized that this was already implemented by the
code we had written, which if a cell didn't match any of the rules, it was
propagated from one generation to the next. So we got rule 2 for free.
Next we added the test for rule 3 by creating a cell with 4 neighbors
and then seeing that it died. The code for that was fairly simple, just
kill the cell if it was alive and it had more than 3 neighbors. We ran the
tests and now were seeing failures for tests that previously were working.
So I looked at the new code and realized that where I intended to have
added the new rule as an "elif", I had instead added it as an "if". Fixed
that and now all my tests were passing again.
So finally we added a test for rule 4, if a cell is dead and it has
exactly 3 neighbors, it comes alive. The implementation for this was
added, and we ran the tests and they failed... I thought about it for a
second and realized that we were only checking the live cells, since we
were checking the cells in the dictionary. I realized that we needed to
check the cells around the live cells too, so I used code similar to
the "count_neighbor" code described above, and ran the rules on all the
neighbors of the live cells. I realized that this would check some cells
multiple times, but my goal was the simplest, most obvious code, there were
no performance constraints. With this code in place, the tests succeeded.
At this point we looked at each-other and decided it passed all the
rules tests, so it must be done. I said to Bill "Remember how you said
there's no way to finish it?" He asked if we could show him a glider.
"Yeah, what are the points that make up the glider? The copy he had on his
screen was moving so it was hard to tell, but he then suggested a bar of 3
cells that just oscilates...
So we wrote a test that set a bar, checked the world for the bar,
generated and tested for the line (and that the cells that should have died
did die) and then generated again and tested live+dead cells again. We
agreed that it was working...
So, I plugged in a "main()" part of the code, set a bar of 3 cells
live, and added a function to the Game class to print out part of the
world. My first attempt at that used dynamic min and max for the x and y
coordinates, so it would pretty much just print "XXX" or "X\nX\nX\n", not
super exciting.
So I just hard-coded it to print the first 5 rows and columns. We ran
3 generations of that and it looked good. So then we tried to figure out
the coordinates of the glider. Bill too a screen print of the animated GIF
on the Wikipedia page and from that we were able to poke in the
coordinates. We ran 4 generations of that, and sure enough we had a glider
being printed out.
Around this point, "time" was called. Unlike previous attempts, I did
end up saving off a tar of this for later analysis (which is how I was able
to recreate the attempt above).
In the end, the "World" class was 16 lines, the "Game" class was 19
lines plus another 9 lines to print the world, and the tests were 82 lines
(many of them just repetitions of "assertEquals" lines for various
locations. Another 25 lines in my "main" function with did 3 generations
of the "bar" and then 5 generations of the glider and printed them out.
In retrospective, I'm quite certain that having the tests allowed me
to develop the code more quickly. There were several errors I made, as
outlined above, but having the tests allowed us to very quickly determine
exactly where the problem was introduced, and look at exactly that code to
determine the problem.
Without tests I would have *HAD* to implement the display() function
earlier, and basically do test code that would spit out the world and
eyeball that it was doing the right thing. I also likely would have been
running testing in my head as I wrote the code, which would have slowed it
down. And when I ran into the problems, I would have had to have reviewed
the bulk of the code rather than exactly the last set of code I wrote.
So, those are the details as I remember them.