Kurt McKee

lessons learned in production


Hey there! This article was written in 2010.

It might not have aged well for any number of reasons, so keep that in mind when reading (or clicking outgoing links!).

Normalizing URLs

Posted 1 June 2010 in urlnorm

Last weekend I started coding up a URL normalizer. The idea is that it takes a URL and converts it to its canonical form, so that http://KURTMCKEE.ORG:80 becomes http://kurtmckee.org/. Most of that first weekend went into coding up some proof-of-concept functions. This weekend I added IP address conversion math to accommodate all of the degenerate forms that an IP address can take. I also wrote some unit tests and figured it was time to push the code to GitHub. Check it out.
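To make the idea concrete, here is a rough sketch of that kind of normalization using only Python's standard library. It is not the code in the repository: the names `parse_ipv4`, `normalize`, and `DEFAULT_PORTS` are made up for illustration, and the IP handling assumes the classic `inet_aton` conventions (one to four dot-separated parts, each decimal, octal with a leading 0, or hex with a leading 0x). It also ignores userinfo and IPv6 hosts.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table of default ports; only the two common web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_ipv4(host):
    """Return dotted-quad text for a degenerate IPv4 host, or None.

    Assumes inet_aton-style rules: 1 to 4 parts, where the last part
    fills all of the remaining bytes of the 32-bit address.
    """
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    numbers = []
    for part in parts:
        # Reject empty parts, signs, underscores, and non-ASCII digits.
        if not (part.isascii() and part.isalnum()):
            return None
        try:
            if part.lower().startswith("0x"):
                numbers.append(int(part[2:], 16))
            elif len(part) > 1 and part.startswith("0"):
                numbers.append(int(part, 8))
            else:
                numbers.append(int(part, 10))
        except ValueError:
            return None  # not a number, so treat the host as a name
    *head, last = numbers
    if any(n > 255 for n in head) or last >= 256 ** (4 - len(head)):
        return None
    value = last
    for i, n in enumerate(head):
        value += n << (8 * (3 - i))
    return ".".join(str((value >> shift) & 255) for shift in (24, 16, 8, 0))


def normalize(url):
    """Lowercase scheme and host, drop a default port, ensure a path."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    host = parse_ipv4(host) or host
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    return urlunsplit((scheme, netloc, parts.path or "/", parts.query, parts.fragment))


print(normalize("http://KURTMCKEE.ORG:80"))  # http://kurtmckee.org/
print(normalize("http://2130706433/"))       # http://127.0.0.1/
```

The last-part-fills-the-rest rule is what lets a single integer like 2130706433, or a two-part form like 0x7f.1, collapse to the same dotted quad as 127.0.0.1.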

This isn't a release announcement; the software is sketchy at best right now and won't function properly. The only thing it can competently do is manipulate IP addresses! However, my vision is ultimately to have a tool that can also strip out query variables that add visit-tracking cruft to the URL, such as those added by FeedBurner.
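As a sketch of what that query cleanup might look like (again, not something the tool does yet), the snippet below drops the `utm_*` campaign variables that FeedBurner typically appends. The `TRACKING_PARAMS` set and the `strip_tracking` name are assumptions for illustration, and the list of variables is surely incomplete.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical blocklist; a real tool would probably make this configurable.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term"}


def strip_tracking(url):
    """Remove known tracking variables while keeping the other query pairs in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # urlencode re-encodes the surviving pairs, so their escaping may change form.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("http://example.com/post?id=7&utm_source=feedburner&utm_medium=feed"))
# http://example.com/post?id=7
```

When every variable is tracking cruft, the query comes back empty and the trailing question mark disappears with it.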
