Kurt McKee

lessons learned in production

Sitemap now included!

Posted 25 April 2020 in pelican and website

After I imported all of my oldest content I knew that it would be tough for search engines to find the content. I figured a sitemap would be a good start (though it's not a panacea). However, of the two Pelican plugins that I found, neither of them seemed to fit my needs.

First, neither plugin automatically generates (or updates) a robots.txt file. That puts me in the unenviable position of generating a sitemap that search engines never find or use. Any plugin that generates a sitemap needs to automatically modify robots.txt for me, or it should notify me that I need to do additional work.
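
For context, pointing crawlers at a sitemap only takes one extra line in robots.txt (the domain below is just a placeholder):

```text
User-agent: *
Sitemap: https://example.com/sitemap.xml
```

That single `Sitemap:` directive is all a plugin would need to add, which is why its absence is so frustrating.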

Second, the existing plugins add unnecessary whitespace to the XML. Yes, I am complaining about 4KB of wasted bytes.
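
To illustrate what I mean, Python's standard library already produces compact XML with no effort at all. This is only a sketch (the URL list is made up; a real plugin would pull URLs from Pelican's article metadata), but it shows that whitespace-free output is the default behavior:

```python
import xml.etree.ElementTree as ET

# Hypothetical URLs; a real plugin would get these from Pelican.
urls = [
    "https://example.com/post-1/",
    "https://example.com/post-2/",
]

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", NS)

urlset = ET.Element(f"{{{NS}}}urlset")
for url in urls:
    loc = ET.SubElement(ET.SubElement(urlset, f"{{{NS}}}url"), f"{{{NS}}}loc")
    loc.text = url

# tostring() emits no indentation or newlines between elements.
xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

No pretty-printing, no wasted bytes.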

Let's be clear: I'm not an expert on sitemaps. But my third concern is that both existing plugins appear to include tag and category index pages, which doesn't seem useful to me. Search engines can find those index pages trivially once they know the URLs of my actual content, so it seems unnecessary to include them.
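
The filtering I have in mind is simple. Here's a sketch (the records and the `kind` labels are hypothetical; a real plugin would inspect Pelican's generated objects instead):

```python
# Hypothetical page records; a real plugin would get these from Pelican.
pages = [
    {"url": "https://example.com/first-post/", "kind": "article"},
    {"url": "https://example.com/tag/python/", "kind": "tag"},
    {"url": "https://example.com/category/website/", "kind": "category"},
    {"url": "https://example.com/about/", "kind": "page"},
]

# Only actual content belongs in the sitemap; crawlers will discover
# tag and category indexes by following links from the content itself.
CONTENT_KINDS = {"article", "page"}
sitemap_urls = [page["url"] for page in pages if page["kind"] in CONTENT_KINDS]
```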

With all of that in mind, I created a sitemaps plugin that meets my needs. I'll eventually publish it so others can benefit, but my standards are high so it needs unit tests and documentation before it sees the light of day. Like my pelican-precompress plugin, I'll announce the sitemaps plugin if and when I publish it.

Further reading: Pelican, sitemaps, robots.txt