Normalizing years
Posted 22 June 2009 in listparser
I've been implementing the RFC 822 date and time specification in listparser in order to support the dateCreated and dateModified tags, and while doing so I've found a fun problem to mull over.
RFC 822, in classic shortsighted fashion, calls for years to be represented using only two digits. OPML calls for years to be either two digits or four digits. Thus, the following two dates are equivalent:
Sun, 21 Jun 09 19:22:00 CDT
Sun, 21 Jun 2009 19:22:00 CDT
The problem is ensuring that the year is always four digits long by the end of the program. One simple way to do this is to use the following, assuming arbitrarily that two-digit years from 90 to 99 belong to the 1990s and every other two-digit year belongs to the 21st century:
if year < 100:
    if year >= 90:
        year += 1900
    else:
        year += 2000
After writing that code, I was so disgusted that I rewrote it as a single line with no if statements at all:
year += (year < 100) * (19 + (year < 90)) * 100
I'm abusing the fact that False and True evaluate to 0 and 1, respectively. Unfortunately, this is magic and unmaintainable code. I lamented that Python didn't have a ternary operator, then realized that I had never actually checked. Sure enough:
if year < 100:
    year += 1900 if year >= 90 else 2000
Much more maintainable, but ternary operators were only introduced in Python 2.5, which not everyone might have. Happily, Wikipedia's entry on ternary operators introduced me to using booleans as list indices (which hadn't occurred to me, despite knowing that they evaluated to 0 and 1 in arithmetic). Here's the final, reasonable, readable code:
if year < 100:
    year += (1900, 2000)[year < 90]
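For anyone who wants to poke at it, here's a minimal sketch that wraps the snippet above in a function and runs it on a few boundary values. The name normalize_year is made up for illustration; it isn't necessarily what listparser calls anything.

```python
def normalize_year(year):
    """Expand a two-digit year to four digits (hypothetical helper name)."""
    if year < 100:
        # (year < 90) is False or True, which index the tuple as 0 or 1:
        # 90-99 selects 1900, anything below 90 selects 2000.
        year += (1900, 2000)[year < 90]
    return year


# Try the values on either side of each cutoff, plus a four-digit year.
for sample in (9, 89, 90, 99, 2009):
    print(sample, "->", normalize_year(sample))
```

Note that 89 comes out as 2089, which is exactly the arbitrary cutoff described earlier at work.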
I expect to have another listparser release out next week.
UPDATE
Heh, I guess I could remove the if statement again by extending the above concept:
year += (0, 1900, 2000)[(year < 90) + (year < 100)]
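With this many rewrites of the same logic, it seems worth checking that they really agree. The following is only a sketch: the function names are invented for the comparison, and it checks non-negative years only.

```python
def nested_if(year):
    # The original, spelled-out version; used as the reference.
    if year < 100:
        if year >= 90:
            year += 1900
        else:
            year += 2000
    return year


def arithmetic(year):
    # The "magic" one-liner that multiplies by booleans.
    return year + (year < 100) * (19 + (year < 90)) * 100


def conditional(year):
    # The conditional-expression version (Python 2.5 and later).
    if year < 100:
        year += 1900 if year >= 90 else 2000
    return year


def tuple_index(year):
    # The version with no if statement: the two comparisons sum to 0, 1, or 2.
    return year + (0, 1900, 2000)[(year < 90) + (year < 100)]


variants = (arithmetic, conditional, tuple_index)

# Collect every (year, variant name) pair that disagrees with the reference.
mismatches = [
    (year, func.__name__)
    for year in range(3000)
    for func in variants
    if func(year) != nested_if(year)
]
print(mismatches)  # an empty list means all the variants agree
```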