Kurt McKee

Hey there! This article was written in 2011.

It might not have aged well for any number of reasons, so keep that in mind when reading (or clicking outgoing links!).

Predictions and facts

Posted 28 March 2011 in performance, profiling, programming, and software

Since porting feedparser to Python 3, I've consistently tested every change on Python 2.4 through Python 3.1. That's four versions of Python 2 and two versions of Python 3. I've also started creating coverage reports for each test run, so I can ensure that the tests are reasonably thorough. Unfortunately, each version of Python 3 takes at least five minutes to run on my computer, so after installing Python 3.2, a full test run takes almost 20 minutes!

Dissatisfied, I sat down and used the inexorably awesome cProfile module to get information on what's taking forever in Python 3.0. The profile showed pprint() being called almost 3 million times. Wtf?! Where is pprint() even being used?
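For anyone who hasn't used cProfile, here is a minimal sketch of how to get call counts out of it. It is not the actual feedparser test harness: run_tests() and profile_report() are hypothetical names made up for illustration, and the fake test run just calls pprint.pformat() once per "test".

```python
import cProfile
import io
import pprint
import pstats


def run_tests(count=1000):
    """Hypothetical stand-in for a test run that formats a message per test."""
    for i in range(count):
        pprint.pformat({"case": i, "expected": [i, i + 1]})


def profile_report(func, pattern):
    """Profile func() and return the stats rows matching pattern, sorted by call count."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # The string passed to print_stats() is a regex filter on "file:line(function)".
    stats.sort_stats("calls").print_stats(pattern)
    return stream.getvalue()


if __name__ == "__main__":
    # Show only the rows that come from the pprint module.
    print(profile_report(run_tests, "pprint"))
```

The ncalls column in the report is where a suspiciously large number, like a few million calls to a formatting function, would jump out.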

Well. It turns out that pretty error messages were being preemptively created for every test, regardless of whether the test failed or not. You know -- just in case. I moved two lines of code, and Python 3.0 through 3.2 are now running faster than Python 2. I never would have guessed that pprint() would cause me so much watch-checking aggravation, but thank goodness I had tools to tell me precisely where to look for a solution!
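The general pattern looks something like the sketch below. This is an illustration of the idea, not the real feedparser test code: check_eager() and check_lazy() are hypothetical helpers, and the message format is invented.

```python
import pprint


def check_eager(expected, actual):
    """Builds the pretty message for every test, whether it passes or fails."""
    message = "expected:\n%s\ngot:\n%s" % (
        pprint.pformat(expected),
        pprint.pformat(actual),
    )
    if expected != actual:
        raise AssertionError(message)


def check_lazy(expected, actual):
    """Builds the pretty message only after the comparison has failed."""
    if expected != actual:
        message = "expected:\n%s\ngot:\n%s" % (
            pprint.pformat(expected),
            pprint.pformat(actual),
        )
        raise AssertionError(message)


if __name__ == "__main__":
    check_lazy({"title": "a"}, {"title": "a"})  # passes, pformat() never runs
    try:
        check_lazy({"title": "a"}, {"title": "b"})
    except AssertionError as error:
        print(error)
```

Both helpers report the same failure, but only the eager one pays for pformat() on every passing test.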

With that Aesop fresh in my mind, I've been really frustrated by a situation I've found myself in. I've been interacting with a guy who's been telling me that I need to use foo to fix all of my problems. I've expressed doubt multiple times since there's no profiling information, no performance numbers, and definitely no facts to support his case. Unfortunately, this guy keeps waving his hands and confidently predicting that foo will solve all of my problems and make me a cup of hot cocoa when I come in from the cold. Recently, however, I found out that foo might actually make it more difficult to get performance numbers. The stupid thing might actually insulate itself from profiling and reporting tools! When I brought this to his attention he replied (paraphrasing and emphasis my own) that it "shouldn't be an issue. I predict that we'll be able to predict where the problems are."

Frankly, I don't need a prophet. I need an application profile. I need performance numbers. I need call counts and execution plans and logs. I won't be snookered by confidence, and you shouldn't be either.
