Kurt McKee

lessons learned in production

Archive

Hey there! This article was written in 2009.

It might not have aged well for any number of reasons, so keep that in mind when reading (or clicking outgoing links!).

listparser v0.10 - "Internet-ready"

Posted 12 December 2009 in listparser

It's been over two months since the last listparser release, but believe me: the wait has been worth it!

Python 3 support

listparser is now 100% Python 3 compatible! I can prove it, too: it passes all 159 unit tests! This is in addition to the existing support for Python 2.4, 2.5, and 2.6. To convert listparser to Python 3 format, run the following command:

$ 2to3 -w listparser.py lptest.py

After a little churning, 2to3 will write out the necessary changes, and you can run the unit tests by typing:

$ python3 lptest.py

(On that note: I noticed a weird issue where Python 3 would occasionally fail a single test the first time the suite was run. Re-running the test suite, however, made the problem go away. I have no idea what's causing that...)

Support for undeclared character references

I've known for a long time that listparser would eventually need to handle undeclared character references to be robust, so I finally took the time to whip up DOCTYPE injection code. The basic idea is that if listparser encounters an undeclared character reference (such as &aelig;) it will inject the necessary declarations and re-parse the document. Thus, &aelig; would be correctly transformed to æ. Sweet.
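To make the inject-and-re-parse idea concrete, here is a minimal Python 3 sketch using only the standard library. It is not listparser's actual code: the helper names (build_doctype, inject_doctype, parse_with_injection) are hypothetical, the parser is ElementTree rather than whatever listparser uses internally, and it assumes the document does not already carry a DOCTYPE of its own.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# XML predefines these five entities, so the sketch does not redeclare them.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def build_doctype(root_name):
    """Build a DOCTYPE whose internal subset declares the HTML named entities."""
    declarations = "".join(
        '<!ENTITY %s "&#%d;">' % (name, codepoint)
        for name, codepoint in sorted(name2codepoint.items())
        if name not in PREDEFINED
    )
    return "<!DOCTYPE %s [%s]>" % (root_name, declarations)


def inject_doctype(document, root_name):
    """Insert the DOCTYPE right after the XML declaration, if there is one."""
    match = re.match(r"<\?xml[^>]*\?>", document)
    split_at = match.end() if match else 0
    return document[:split_at] + build_doctype(root_name) + document[split_at:]


def parse_with_injection(document, root_name="opml"):
    """Parse normally; if that fails, inject declarations and re-parse once."""
    try:
        return ET.fromstring(document)
    except ET.ParseError:
        # Assumption: the failure was caused by an undeclared entity.
        # Any other error will simply be raised again by the second parse.
        return ET.fromstring(inject_doctype(document, root_name))


document = (
    '<?xml version="1.0"?>'
    "<opml><head><title>Encyclop&aelig;dia</title></head></opml>"
)
root = parse_with_injection(document)
print(root.findtext("head/title"))
```

Parsing first and injecting only on failure keeps well-formed documents on the fast path, at the cost of parsing the troublesome ones twice.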

Crasher bugs, et cetera

I identified several places where listparser might crash and patched those problems right up. Additionally, thanks to coverage and cProfile, I identified several places that hadn't been tested thoroughly, and even improved some very stupid code that made listparser take over five minutes to parse the Planet KDE FOAF file (which, by the way, listparser can now parse just fine).
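For anyone who hasn't used cProfile to hunt for this kind of slowdown, the sketch below shows the general workflow. The slow_dedupe function is a made-up stand-in for a quadratic hot spot, not the code that was actually fixed in listparser, and profile_call is a hypothetical helper written for this example.

```python
import cProfile
import io
import pstats


def slow_dedupe(urls):
    """Hypothetical hot spot: membership tests on a list are linear time."""
    seen = []
    for url in urls:
        if url not in seen:
            seen.append(url)
    return seen


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


urls = ["http://example.com/feed/%d" % (i % 500) for i in range(5000)]
unique_urls, report = profile_call(slow_dedupe, urls)
print(report)
```

The report lists each function with its call count and cumulative time, which is usually enough to spot the one routine eating the whole run.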

Cheeseshop support

listparser has really grown over the past six months, and I figured it was time to make it available at the Python Package Index. Assuming you have easy_install on your system, you can now install listparser by typing:

$ easy_install listparser

easy_install will automatically go out and get the latest version for you. By the way, I discovered that the Python Package Index can host documentation, so may I recommend that you go check out the listparser documentation? It looks great!

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.

[ homepage | download | repository | documentation | bugs ]
