Kurt McKee

lessons learned in production

Hey there! This article was written in 2009.

It might not have aged well for any number of reasons, so keep that in mind when reading (or clicking outgoing links!).

Normalizing years

Posted 22 June 2009 in listparser

I've been implementing the RFC 822 date and time specification in listparser in order to support the dateCreated and dateModified tags, and while doing so I've found a fun problem to mull over.

RFC 822, in classic shortsighted fashion, calls for years to be represented using only two digits. OPML calls for years to be either two digits or four digits. Thus, the following two dates are equivalent:

Sun, 21 Jun 09 19:22:00 CDT
Sun, 21 Jun 2009 19:22:00 CDT

The problem is ensuring that the year always has four digits by the time parsing finishes. One simple way to do this is the following, which arbitrarily assumes that two-digit years from 90 to 99 belong to the 1990s and that everything else belongs to the 2000s:

if year < 100:
    if year >= 90:
        year += 1900
    else:
        year += 2000

After writing that code, I was so disgusted that I rewrote it as a single line with no if statements at all:

year += (year < 100) * (19 + (year < 90)) * 100

I'm abusing the fact that False and True evaluate to 0 and 1, respectively, when used in arithmetic. Unfortunately, this is magic, unmaintainable code. I lamented that Python didn't have a ternary operator, then realized that I had never actually checked. Sure enough:

if year < 100:
    year += 1900 if year >= 90 else 2000

Much more maintainable, but conditional expressions (Python's ternary operator) were only introduced in Python 2.5, which not everyone might have. Happily, Wikipedia's entry on ternary operators introduced me to using booleans as sequence indices (which hadn't occurred to me, despite knowing that they evaluated to 0 and 1 in arithmetic). Here's the final, reasonable, readable code:

if year < 100:
    year += (1900, 2000)[year < 90]
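Both the arithmetic one-liner and the tuple lookup lean on the same property: in Python, bool is a subclass of int, so a comparison can be used in arithmetic or as a sequence index. Here's a small sketch of that idea, using 95 and 9 as arbitrary sample years; the name pivot_table is made up for illustration and isn't part of listparser:

```python
# bool is a subclass of int, so True and False behave as 1 and 0.
assert issubclass(bool, int)
assert True + True == 2

# Trace the arithmetic one-liner for two sample two-digit years.
for year, expected in ((95, 1995), (9, 2009)):
    is_short = year < 100           # True for both sample years
    century = 19 + (year < 90)      # 19 for 95, 20 for 9
    assert year + is_short * century * 100 == expected

# The same kind of comparison also works as a tuple index.
pivot_table = (1900, 2000)          # hypothetical name, for illustration only
assert pivot_table[95 < 90] == 1900   # False selects index 0
assert pivot_table[9 < 90] == 2000    # True selects index 1
```

The tuple form reads better to me mostly because the two possible offsets are spelled out in full rather than assembled from 19, a boolean, and 100.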

I expect to have another listparser release out next week.

UPDATE

Heh, I guess I could remove the if statement again by extending the above concept:

year += (0, 1900, 2000)[(year < 90) + (year < 100)]
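With this many rewrites of the same rule, it seems worth checking that they all agree. The following is only a sketch: the function names are invented for this comparison and aren't listparser's API, and the range of years tested (0 through 9999) is an arbitrary choice.

```python
def with_if(year):
    # The original nested-if version.
    if year < 100:
        if year >= 90:
            year += 1900
        else:
            year += 2000
    return year

def with_arithmetic(year):
    # The "magic" one-liner built on boolean arithmetic.
    return year + (year < 100) * (19 + (year < 90)) * 100

def with_conditional(year):
    # The conditional-expression version (needs Python 2.5 or later).
    if year < 100:
        year += 1900 if year >= 90 else 2000
    return year

def with_tuple(year):
    # The tuple lookup indexed by a boolean.
    if year < 100:
        year += (1900, 2000)[year < 90]
    return year

def with_single_lookup(year):
    # The version from the update, with no if statement at all.
    return year + (0, 1900, 2000)[(year < 90) + (year < 100)]

variants = (with_if, with_arithmetic, with_conditional,
            with_tuple, with_single_lookup)

def all_agree(years):
    # True if every variant returns the same result for every year.
    return all(len(set(f(y) for f in variants)) == 1 for y in years)

print(all_agree(range(0, 10000)))   # True
print([with_single_lookup(y) for y in (9, 89, 90, 99, 2009)])
# [2009, 2089, 1990, 1999, 2009]
```

The boundary values 89 and 90 are the interesting ones, since that's where the arbitrary pivot between the 1900s and the 2000s sits.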