Monkeypatching: Retrieving data from the parent scope
Posted 22 October 2020 in monkeypatching, pelican, programming, and python
I've previously covered the basics of monkeypatching, and now it's time to dig deeper.
Let's begin with a quick review of monkeypatching.
A simple example
Let's say that you're working with an app that adds the first two numbers in a list and prints the result. Our goal is to modify the app so that it adds all of the numbers in the list. The app's code could look like this:
# app.py
# ------

def add_numbers(numbers):
    """Add the first two numbers in *numbers*."""
    return sum(numbers[:2])


def main():
    """The app runs from here."""
    numbers = [1, 2, 3]
    print(add_numbers(numbers))
A simple solution
This is the ideal case. Remember that our goal is to make app.add_numbers() add all of the numbers (not just the first two). Because all of the numbers are available in the numbers parameter, it's easy to monkeypatch the function and change its behavior.
# my_code.py
# ----------

import app


def add_all_numbers(numbers):
    """Add all of the numbers."""
    return sum(numbers)


# Monkeypatch app.add_numbers()
app.add_numbers = add_all_numbers
app.main()
Now when app.main() calls add_numbers(), the monkeypatched function adds all of the numbers together.
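To try the patch without creating two files, here is a minimal sketch that builds a stand-in for app.py in memory. The use of types.ModuleType and exec() is only scaffolding for the demonstration, and main() returns the sum instead of printing it so the before and after values can be compared.

```python
import types

# Build a stand-in for app.py in memory (scaffolding for this demo only).
app = types.ModuleType("app")
exec(
    "def add_numbers(numbers):\n"
    "    return sum(numbers[:2])\n"
    "\n"
    "def main():\n"
    "    numbers = [1, 2, 3]\n"
    "    return add_numbers(numbers)\n",
    app.__dict__,
)


def add_all_numbers(numbers):
    """Add all of the numbers."""
    return sum(numbers)


before = app.main()                 # original behavior: 1 + 2
app.add_numbers = add_all_numbers   # monkeypatch
after = app.main()                  # patched behavior: 1 + 2 + 3

print(before, after)  # 3 6
```

The patch takes effect because main() looks up the name add_numbers in the module's namespace each time it runs, so rebinding that name changes what main() calls.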
A complex example
What happens if add_numbers() is written so that it doesn't accept a list of numbers? What if it only accepts two numbers as individual parameters?
# app.py
# ------

def add_numbers(x, y):
    """Add two numbers."""
    return x + y


def main():
    """The app runs from here."""
    numbers = [1, 2, 3]
    print(add_numbers(numbers[0], numbers[1]))
In this example, add_numbers() doesn't have access to the full list of numbers in its scope. The full list of numbers is only available in the local scope of main(). I would have to rewrite main() and reimplement all of its functionality just to change one small bit of its behavior.
Is it possible for add_numbers() to reach into the local scope of main()?
Hecks yeah it is!
A complex solution
To reach out of the called function's scope and into the caller's scope, let's use the inspect module.
The inspect module allows a function to interact with the entire call stack. This includes accessing variables in the other functions' local scopes. In our case, we just want to access the variables one level up in the call stack. Here's how to do it.
# my_code.py
# ----------

import inspect

import app


def add_all_numbers(x, y):
    """Add all numbers. *x* and *y* will be ignored."""

    # *add_all_numbers()* is the first frame in the stack (index 0).
    # Its parent frame is at index 1 in the stack.
    parent_frame = inspect.stack()[1].frame

    # Retrieve *numbers* from the parent's local scope.
    all_numbers = parent_frame.f_locals['numbers']

    return sum(all_numbers)


app.add_numbers = add_all_numbers
app.main()
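The version above raises a KeyError if the caller has no local variable named numbers. As a hedged variation, the sketch below falls back to the original behavior in that case. It also uses inspect.currentframe().f_back, which reaches the same parent frame without building a list for the whole call stack. The main() and other_caller() functions here are stand-ins written for this example, not part of the app.

```python
import inspect


def add_all_numbers(x, y):
    """Sum the caller's local *numbers*, or fall back to x + y."""
    # currentframe() can return None on Python implementations
    # that don't support stack frames.
    frame = inspect.currentframe()
    caller = frame.f_back if frame is not None else None

    numbers = caller.f_locals.get("numbers") if caller is not None else None
    if numbers is None:
        # The caller has no *numbers* variable: behave like the original.
        return x + y
    return sum(numbers)


def main():
    """Stand-in caller that has a local list named *numbers*."""
    numbers = [1, 2, 3]
    return add_all_numbers(numbers[0], numbers[1])


def other_caller():
    """Stand-in caller with no local named *numbers*."""
    return add_all_numbers(10, 20)


print(main())          # 6
print(other_caller())  # 30
```

Either way, the patched function depends on the name of a local variable in someone else's code, so it can break silently if the caller is refactored.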
When is this useful?
I've been reading content through feed aggregators for over 15 years. During that time I've observed several classes of problems with feed generators and feed aggregators. One of the most noticeable problems is the dreaded Post Flood (TM). This occurs when somebody changes their domain, or changes their site URL structure, or changes their blog software. These changes cause their tag URIs to change, and as a result developer planets and individual aggregators get flooded with duplicate content.
When I began importing my old content into Pelican I tried to avoid this problem. Unfortunately, Pelican calls a function named get_tag_uri() whose sole parameters are the post's link and the post's date. As in the complex example above, I needed to reach up into the calling function's local scope to access all of the post metadata. In my case, I had stored all of the tag URIs in a metadata field named uri.
Using the technique shown above, I monkeypatched get_tag_uri() so that it would reach into the calling function's scope, retrieve the value of the uri metadata field, and return that value.
Here's the code that I added to my Pelican configuration file:
# pelicanconf.py
# --------------

import inspect

import feedgenerator
import pelican.writers


def new_get_tag_uri(*args, **kwargs):
    """Use an existing tag URI, if any exists in the item metadata."""

    # The calling function holds the post in a local variable named *item*.
    parent_frame = inspect.stack()[1].frame
    uri = getattr(parent_frame.f_locals.get('item'), 'uri', '')
    if uri:
        return uri

    # No stored URI: fall back to the original generator.
    return feedgenerator.get_tag_uri(*args, **kwargs)


pelican.writers.get_tag_uri = new_get_tag_uri
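The same lookup-and-fallback logic can be tried without Pelican installed. The sketch below is only an illustration: default_tag_uri(), add_item_to_feed(), the URI format, and the sample posts are all hypothetical stand-ins, not Pelican's or feedgenerator's actual code.

```python
import inspect
from types import SimpleNamespace


def default_tag_uri(link, date):
    """Hypothetical stand-in for the library's URI generator."""
    return f"tag:example.com,{date}:{link}"


def new_get_tag_uri(*args, **kwargs):
    """Use an existing tag URI, if any exists in the caller's *item*."""
    parent_frame = inspect.stack()[1].frame
    uri = getattr(parent_frame.f_locals.get("item"), "uri", "")
    if uri:
        return uri
    return default_tag_uri(*args, **kwargs)


def add_item_to_feed(item):
    """Hypothetical caller: only the link and date are passed down."""
    return new_get_tag_uri(item.link, item.date)


# An imported post that carries its old URI, and a new post without one.
old_post = SimpleNamespace(link="/old-post/", date="2010-01-01",
                           uri="tag:old.example,2010:/1")
new_post = SimpleNamespace(link="/new-post/", date="2020-10-22")

print(add_item_to_feed(old_post))  # tag:old.example,2010:/1
print(add_item_to_feed(new_post))  # tag:example.com,2020-10-22:/new-post/
```

Posts that carry a stored uri keep it, and every other post falls through to the default generator.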
Note that Pelican has a pretty good plugin architecture, so monkeypatching is only one way to solve this particular problem.