Self-documenting code

Posted 26 December 2019 in programming and python

The problem

At work I maintain both code and narrative documentation so that the requirements of the environment the software runs in are well understood. However, this increases the maintenance burden: every time I change the code, I also have to update the documentation separately.

For example, if the software must connect to several servers in a specific order, that's a requirement that has to be documented. If the list of servers changes for any reason, I have to update the code and the documentation. What a waste of time.

It would be far better if I could update either the code or the documentation and have the other update automatically. Although it's possible to parse the documentation into usable code at runtime, I think it's easier to generate the documentation from the code, which is always up to date.

To do this, I'm going to modify the module's docstring at import time so that the Sphinx autodoc extension extracts the generated text.

The current situation

Let's say I have an importable module named server.py. It currently looks like this:

# server.py
# ---------

servers = (
    ('Boston', '192.168.1.10'),
    ('Liverpool', '192.168.2.20'),
    ('Moscow', '192.168.3.30'),
    ('Seoul', '192.168.4.40'),
)

# A bunch of additional code connects to the servers,
# collects information, and logs it.

The process is documented in a ReST file named process.rst. It currently looks like this:

..  process.rst
..  -----------

The following data centers will be accessed, in this order:

=============== ============
Data center     IP address
=============== ============
Boston          192.168.1.10
Liverpool       192.168.2.20
Moscow          192.168.3.30
Seoul           192.168.4.40
=============== ============

Maintaining the same list in two places is inefficient, and a little more code can fix it.

The solution

First, I'm going to add a docstring to server.py that contains a placeholder line. The code after the server list runs automatically when Python imports the module and replaces the placeholder with a table:

# server.py
# ---------

"""The following data centers will be accessed, in this order:

#SERVERMARKER

"""

servers = (
    ('Boston', '192.168.1.10'),
    ('Liverpool', '192.168.2.20'),
    ('Moscow', '192.168.3.30'),
    ('Seoul', '192.168.4.40'),
)

# Adjust the docstring.
lines = []
for line in __doc__.splitlines():
    if not line.startswith('#SERVERMARKER'):
        lines.append(line)
        continue

    lines.append('..  csv-table::')
    lines.append('    :header: "Data center", "IP address"')
    lines.append('')
    for location, ip_address in servers:
        lines.append(f'    "{location}", "{ip_address}"')

__doc__ = '\n'.join(lines)

# The `del` keyword could be used here to remove the temporary names
# (`lines`, `line`, `location`, and `ip_address`) from the module namespace.
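
As an alternative to cleaning up with `del`, the docstring adjustment can be moved into a function so that the temporary variables never reach the module namespace. The following is a minimal sketch, not part of the original module: the names `render_server_table` and `SERVER_MARKER` are hypothetical. It also returns early when the docstring is `None`, which is what Python provides when it runs with `-OO` and strips docstrings.

```python
# Sketch only: `render_server_table` and `SERVER_MARKER` are hypothetical names.

SERVER_MARKER = '#SERVERMARKER'

servers = (
    ('Boston', '192.168.1.10'),
    ('Liverpool', '192.168.2.20'),
    ('Moscow', '192.168.3.30'),
    ('Seoul', '192.168.4.40'),
)


def render_server_table(doc, servers, marker=SERVER_MARKER):
    """Return *doc* with the marker line replaced by a csv-table directive."""
    if doc is None:
        # Docstrings are stripped under `python -OO`; nothing to adjust.
        return None

    lines = []
    for line in doc.splitlines():
        if not line.startswith(marker):
            lines.append(line)
            continue

        lines.append('..  csv-table::')
        lines.append('    :header: "Data center", "IP address"')
        lines.append('')
        for location, ip_address in servers:
            lines.append(f'    "{location}", "{ip_address}"')

    return '\n'.join(lines)


# Stand-in for the module docstring, so the sketch runs on its own.
template = (
    'The following data centers will be accessed, in this order:\n'
    '\n'
    '#SERVERMARKER\n'
    '\n'
)
adjusted = render_server_table(template, servers)
```

In server.py itself, the last lines would become a single assignment, `__doc__ = render_server_table(__doc__, servers)`, and only the function and the marker constant would remain in the namespace.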

When this code runs, server.__doc__ will contain the following text:

The following data centers will be accessed, in this order:

..  csv-table::
    :header: "Data center", "IP address"

    "Boston", "192.168.1.10"
    "Liverpool", "192.168.2.20"
    "Moscow", "192.168.3.30"
    "Seoul", "192.168.4.40"

Then, I can update process.rst with this autodoc directive:

..  process.rst
..  -----------

..  automodule:: server
    :no-members:

This renders the same narrative documentation as before, but the server table is now generated from the code, so the documentation no longer needs a separate update when the list changes.
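
For the `automodule` directive to work, Sphinx has to load the autodoc extension and be able to import the module. The following is a minimal sketch of the relevant part of the Sphinx configuration file, assuming that server.py sits in the same directory as conf.py; the path will differ for other project layouts.

```python
# conf.py (sketch; assumes server.py is in the same directory as this file)

import os
import sys

# autodoc imports the module to read its docstring, so the directory
# containing server.py must be on the import path.
sys.path.insert(0, os.path.abspath('.'))

extensions = [
    'sphinx.ext.autodoc',
]
```

Because autodoc imports server.py, any module-level code in it runs during the documentation build. If the code that connects to the servers runs at import time, it is probably worth moving it behind an `if __name__ == '__main__':` guard or into a function.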

Kurt McKee
