Kurt McKee



Hey there! This article was written in 2015.

It might not have aged well for any number of reasons, so keep that in mind when reading (or clicking outgoing links!).

feedparser 5.2.0

Posted 24 April 2015 in feedparser and release

I'm pleased to announce the release of feedparser 5.2.0!

It's available only on the Python Package Index (PyPI), because Google Code is shutting down its services. The update incorporates over two years of work and patches from multiple authors.

Release highlights include what will hopefully be a significant boost to performance: all in-HTML microformat parsing has been removed, the BeautifulSoup dependency has been removed, HTML Tidy support has been removed, and chardet use is now lazy -- it runs only when it is needed...IF it is needed. Additionally, date-time parsing has been revamped to use procedural code, and many additional HTML5, MathML, and feed elements have been added, including Podlove Simple Chapters, MediaRSS, Dublin Core, GeoRSS, and GML. Finally, there have been a number of bug fixes.
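To illustrate what "lazy" means here, below is a minimal sketch of the general idea, not feedparser's actual code. The names decode_lazily and expensive_detect are hypothetical, and the stand-in detector always guesses latin-1; in real code it might wrap chardet.detect(data)["encoding"] instead. The cheap candidates are tried first, and the costly detector is only called if they all fail.

```python
from typing import Optional, Tuple


def expensive_detect(data: bytes) -> str:
    """Hypothetical stand-in for a slow statistical detector such as chardet."""
    expensive_detect.calls += 1  # count calls so the laziness is observable
    return "latin-1"


expensive_detect.calls = 0


def decode_lazily(data: bytes, declared: Optional[str] = None) -> Tuple[str, str]:
    """Decode data, running the expensive detector only as a last resort.

    Returns a (text, encoding_used) pair.
    """
    # Cheap candidates: an encoding declared by the server or document
    # (if any), then UTF-8.
    for encoding in (declared, "utf-8"):
        if not encoding:
            continue
        try:
            return data.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding name: try the next candidate.
            continue

    # Every cheap candidate failed, so detection is paid for only now.
    guessed = expensive_detect(data)
    return data.decode(guessed, errors="replace"), guessed
```

With this shape, well-formed UTF-8 input never touches the detector at all, which is where the hoped-for performance gain would come from.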

The official repository has moved to GitHub in advance of Google Code shutting down services. Additionally, to support faster development and release cycles, feedparser now follows the Successful Git Branching Model. Basic summary: all development occurs in feature branches, which are merged into the develop branch with --no-ff. When a new release is ready, a release branch is created off of develop for prep work, merged into master with --no-ff, and marked with an annotated tag. The release branch is then merged back into develop.

tl;dr -- always create a new branch off of the develop branch when making changes to the code and submitting pull requests. =)
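The branching model summarized above can be sketched as the sequence of git commands each step involves. This is only an illustration: the feature/ and release/ branch prefixes, the tag name, and the tag message are assumptions, not something the project prescribes. The helper functions are hypothetical and only build and print the commands; they do not run git.

```python
import shlex
from typing import List


def feature_commands(name: str) -> List[List[str]]:
    """Commands for one feature branch (the "feature/" prefix is an assumption)."""
    branch = f"feature/{name}"
    return [
        ["git", "checkout", "-b", branch, "develop"],  # branch off of develop
        # ...commits happen on the feature branch here...
        ["git", "checkout", "develop"],
        ["git", "merge", "--no-ff", branch],  # always record a merge commit
    ]


def release_commands(version: str) -> List[List[str]]:
    """Commands for one release (the "release/" prefix and tag name are assumptions)."""
    branch = f"release/{version}"
    return [
        ["git", "checkout", "-b", branch, "develop"],  # release prep branch
        # ...version bumps and changelog edits happen here...
        ["git", "checkout", "master"],
        ["git", "merge", "--no-ff", branch],
        ["git", "tag", "-a", version, "-m", f"Release {version}"],  # annotated tag
        ["git", "checkout", "develop"],
        ["git", "merge", "--no-ff", branch],  # merge the release back into develop
    ]


if __name__ == "__main__":
    for command in feature_commands("example-change") + release_commands("5.2.0"):
        print(shlex.join(command))
```

The --no-ff flag matters because it forces a merge commit even when a fast-forward is possible, so each feature and release stays visible as a unit in the history.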

Future development is going to focus on a number of core issues, some of which have made feedparser a difficult body of code to approach and contribute to. For instance:

  • Pruning the list of supported Python interpreter versions
  • Splitting feedparser into a manageable set of function-specific files
  • Resolving differences between the various XML and SGML parsers
  • Reducing the amount of Unicode encode/decode operations
  • Modernizing the syntax, code idioms, and best practices used throughout the code
  • Leveraging the vibrant Python package ecosystem that has grown in the 13 years since feedparser was started

I'm excited to see how feedparser will improve in 2015!
