Yesterday was the second day of the Python Need for Speed sprint. Getting 15 Python geeks together for a day can do amazing things. Lots of progress has been made. Read on for more.
Probably one of the bigger successes yesterday was that Fredrik Lundh and Andrew Dalke went over strings with a fine-tooth comb. They were able to improve Unicode string performance by a huge amount. The overall Unicode test they were using went from 3.8 seconds to 1.3 seconds, which is pretty impressive. Similar changes can be made to regular strings, but yesterday's work concentrated on Unicode. This work will continue on Wednesday.
Early in the day, I went to Tim Peters and said “In appreciation for all the things you do for Python, I'd like to present you with the second largest can of coke in the world.” Steve Holden then took the second best photograph:
Richard Jones and Steve Holden spent much of the day working on pybench. Because of the problems we were having on Monday, and how little people trusted pybench, it was pretty important to make it more reliable. One of the bigger changes was to use the best time out of N runs, instead of averaging the runs. This tends to reduce jitter. Unlike taking the best performance of a pro athlete, your CPU isn't likely to be on steroids for a test. The best time is more an indicator of having fewer things detracting from the test, like interrupts and other processes. The result seems to be a totally usable pybench for performance analysis.
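To make the best-of-N idea concrete, here is a minimal sketch using the standard `timeit` module. It is not pybench's actual code; the helper name `best_and_mean` and the statement being timed are made up for illustration.

```python
import timeit


def best_and_mean(stmt, setup="pass", repeat=5, number=10000):
    """Time `stmt` in `repeat` separate runs; return (best, mean) in seconds.

    Hypothetical helper for illustration only, not part of pybench.
    """
    runs = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    # The minimum is the run least disturbed by interrupts and other
    # processes; the mean mixes that noise into the result.
    return min(runs), sum(runs) / len(runs)


best, mean = best_and_mean("'-'.join(words)", setup="words = ['a'] * 100")
print("best: %.6fs  mean: %.6fs" % (best, mean))
```

The best time can never be larger than the mean, and on a busy machine the gap between the two gives a rough feel for how much noise averaging would have folded into the number.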
Georg Brandl, Jack Diederich, and Christian Tismer worked on the decimal module.
Runar Petursson and Tim Peters worked on string to integer conversion tweaks. Part of what Runar was working on was allowing integer conversions from strings to be given a start offset and length for the sub-string to convert. A common int conversion use-case is to slice a string and hand off the slice, which requires creating a new object. If you are doing this in a tight loop, creating that new object every time is costly, so getting rid of it is a win.
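The slice-then-convert pattern described above looks roughly like the sketch below. The record layout and the `parse_field` helper are hypothetical, and the offset/length conversion Runar was working on is deliberately not shown, since its final form isn't described here.

```python
# Hypothetical fixed-width record: date (8 digits), a space, a 4-digit count.
record = "20060523 0042"


def parse_field(text, start, length):
    # text[start:start + length] builds a brand-new string object,
    # which int() consumes and which is then thrown away.
    return int(text[start:start + length])


year = parse_field(record, 0, 4)
month = parse_field(record, 4, 2)
day = parse_field(record, 6, 2)
count = parse_field(record, 9, 4)
print(year, month, day, count)
```

Every call allocates a temporary string just to feed `int()`. In a loop over millions of records, those short-lived objects are the cost an offset and length argument would avoid.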
Martin Blais started working on the buffer object, particularly for use with sockets. John Benediktsson showed us the Java byte buffer, which has a feature that he'd really like to see in Python. You can take the buffer and temporarily give it a new starting-point and length. Currently in Python this is done by slicing an existing string, which creates a new object. So for parsing a string it can be very expensive. However, if you can just keep the original string and set it to each of the sub-fields as they are processed, it's much faster.
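As a rough illustration of the "window onto an existing buffer" idea, here is a sketch using `memoryview`, which is available in current Python versions. It is not the buffer work described above, and the packet contents are invented for the example.

```python
# Hypothetical packet: a 4-byte header, a 13-byte body, and a trailer.
packet = bytearray(b"HDR:payload-bytes:END")

view = memoryview(packet)
body = view[4:17]  # a window onto packet; no bytes are copied

# Change the underlying buffer after the window was taken.
packet[4] = ord("P")

# The window sees the change, which shows it shares memory with packet.
body_text = bytes(body)  # an explicit copy, made only when we ask for one
print(body_text)
```

Slicing a `memoryview` only adjusts a start point and length over the same memory, which is close in spirit to the Java byte buffer feature: the sub-fields can be visited one after another without allocating a copy of each.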
Bob Ippolito made the struct module compile the conversion rules and got a 20% speed improvement. Yay!
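For a feel of what compiling the conversion rules means in practice, here is a small sketch using `struct.Struct` as it exists in current Python versions. That this matches Bob's change in detail is an assumption on my part, and the format string is just an example.

```python
import struct

# Parse the format string once and reuse the compiled object.
# "<IHH" is an example layout: little-endian uint32 plus two uint16s.
HEADER = struct.Struct("<IHH")

packed = HEADER.pack(1, 2, 3)
fields = HEADER.unpack(packed)

# The module-level functions take the format string on every call.
same_bytes = struct.pack("<IHH", 1, 2, 3)

print(HEADER.size, fields, packed == same_bytes)
```

The compiled object and the module-level functions produce the same bytes. The difference is that the format string is interpreted once, up front, rather than on each call.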
The zombie frames from Monday were committed by Richard Jones, and then he continued working on the frame objects by bringing in a patch from the tracker.
Richard M Tew and I took some time staring at the 60% slow-down in raising a new-style class exception. We seem to have tracked it down to new-style class creation time; it's not caused by changes to the exception machinery. However, new-style classes are only 10% slower to create than old-style, so that doesn't fully explain it. I bugged Brett Cannon and he got on IRC and helped talk it over with us. That's going to have to continue to be worked on.
Richard Emslie worked on getting John Shipman's skiplist implementation working in rpython.
We ended up working pretty late and then eating dinner at the hotel. It was a good, solid day of work with great results. Yay Python!