Other articles

  1. The Debris Cathedral

    For a long time I've been writing code at my job that automates some very tedious tasks. It's not Python code, though! It's a macro language that has saved the company a lot of time and energy but suffers from severe limitations:

    • No scoping -- everything is in a global …

    read more
  2. Feedparser porting status

    I spent some time this weekend working on porting feedparser to Python 3, and found that it will be difficult because there are two separate parsers included (a strict parser and a loose parser), and while each works differently, both use the same core machinery in feedparser.

    With the strict …

    read more
  3. Subclassing Python types

    I recently found a need to subclass the builtin type unicode and add some additional properties. To instantiate I wanted to pass in a big, ugly object and get a unicode object back. After trying fruitlessly to override __init__, I finally read up on the Python data model. Turns out …

    read more
  4. WordPress.com and comment feeds

    A few days ago I added support for tracking WordPress.com comments, but the implementation leaves a lot to be desired, as I'm merely relying on comment feeds.

    The first problem is that WordPress.com limits the number of comments in feeds to 10. Thus, it is very …

    read more
  5. Revisiting comment scraping

    Earlier this month I wrote about scraping LiveJournal comments. What was I thinking?

    While I was able to account for a number of variables in the page by tweaking my XPath statements, it became obvious early on that screen scraping for comments should be a last resort. So I decided …

    read more
  6. Scraping LiveJournal comments

    As a first attempt at expanding my comment tracking software, I did a little testing with scraping LiveJournal comments. Having written some uncomfortably convoluted XSL transformations before, I've become familiar with XPath. While BeautifulSoup has served me well in the past for quick excursions into the …

    read more
  7. Comments elsewhere

    I've been working very hard to track down and save all of my comments on others' websites. Although some are still out there, I've manually added over 300, which has been a serious chore.

    Tonight I slapped a little CSS together so that the comment viewing …

    read more
  8. Python's time module

    So I've been working like a fiend to import all of my comments from around the internet. It has been a herculean effort because almost everything has to be done manually. One comment here, another there, and no uniform way to extract those comments.

    Two sites, however, made comment retrieval …

    read more