Source

Re-construct source code and program state during runtime

To get started, we can take a look at inspect.currentframe, which will get us the current frame we’re in.

Problem Definition

When writing test libraries, it can be incredibly useful to understand what the call site source code looks like.

For instance, let’s say you call equal(x in [1, 2, 3], should_exist). When we’re in equal, the arguments have already been evaluated and we simply have two booleans as arguments, each True or False. If this test fails, it likely isn’t that helpful to see False != True.

Instead, it is likely much more helpful to see something like x in [1, 2, 3] != should_exist where x = 4 and should_exist=True.
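To make the problem concrete, here is a minimal sketch of a naive helper. The name equal comes from the example above, but this body is only an assumption for illustration: by the time it runs, all it can report are the two evaluated values.

```python
def equal(a, b):
    """Naive assertion helper (illustrative only): it sees values, not source."""
    if a != b:
        # Only the already-evaluated arguments are available here.
        raise AssertionError(f"{a!r} != {b!r}")


x = 4
should_exist = True

try:
    equal(x in [1, 2, 3], should_exist)
except AssertionError as e:
    message = str(e)

print(message)
```

The message carries no trace of x or should_exist, which is the gap the rest of this page tries to close.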

Investigation

inspect.currentframe

curr_fm = inspect.currentframe()
curr_fm
<frame at 0x75b502a9da80, file '/tmp/ipykernel_2991/4207490087.py', line 1, code <module>>

Return the frame object for the caller’s stack frame.

CPython implementation detail: This function relies on Python stack frame support in the interpreter, which isn’t guaranteed to exist in all implementations of Python. If running in an implementation without Python stack frame support this function returns None.

We only target CPython, so it doesn’t matter to us whether other implementations are supported. Practically speaking, we can assume inspect.currentframe never returns None for our use cases.

public_attrs(curr_fm)
['clear',
 'f_back',
 'f_builtins',
 'f_code',
 'f_globals',
 'f_lasti',
 'f_lineno',
 'f_locals',
 'f_trace',
 'f_trace_lines',
 'f_trace_opcodes']
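public_attrs is not defined on this page. A minimal sketch of what such a helper might look like, assuming it simply lists the attribute names that do not start with an underscore:

```python
import inspect


def public_attrs(obj):
    """Hypothetical helper: attribute names that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


attrs = public_attrs(inspect.currentframe())
print(attrs)
```

The exact list can differ between Python versions, since newer releases may add frame attributes.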

What does each of the frame attributes mean?

🤖Reply🤖

Here’s a quick rundown:

Attribute        Meaning
f_back           The previous (caller’s) frame; lets you walk up the call stack
f_code           The code object being executed; contains function name, filename, etc.
f_locals         Local variables in this frame
f_globals        Global variables in this frame’s module
f_builtins       Built-in names available in this frame
f_lineno         Current line number being executed
f_lasti          Index of last attempted instruction (bytecode level)
f_trace          Tracing function for this frame (used by debuggers)
f_trace_lines    Whether to call the trace function on each new line
f_trace_opcodes  Whether to call the trace function on each opcode
clear            A method to clear local variables (helps break reference cycles)

For your use case (reconstructing call-site source), the most relevant ones will likely be f_back, f_code, and f_lineno. Which would you like to explore first?
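As a small sketch of f_back, f_code, and f_lineno working together, the snippet below walks up the call stack and records each function name and current line. The function names stack_summary, inner, and outer are made up for the demonstration.

```python
import inspect


def stack_summary():
    """Walk up the stack via f_back, collecting (function name, line number)."""
    summary = []
    frame = inspect.currentframe()
    while frame is not None:
        summary.append((frame.f_code.co_name, frame.f_lineno))
        frame = frame.f_back  # None once we move past the outermost frame
    return summary


def inner():
    return stack_summary()


def outer():
    return inner()


summary = outer()
names = [name for name, _ in summary]
print(names[:3])
```

The first three names are the functions defined here; anything after them depends on how the code is being run (script, notebook, test runner).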

inspect.getouterframes

Get a list of FrameInfo objects for a frame and all outer frames. These frames represent the calls that lead to the creation of frame. The first entry in the returned list represents frame; the last entry represents the outermost call on frame’s stack.

Okay, so we have the current frame. But we want to know the caller frame.

Let’s create a function ‘f’ which retrieves the outer frame for where we’re calling ‘f’ from.

When calling ‘f’, we’ll pass args with expressions, constants, function calls, and kwargs. We’ll also indent the call to ‘f’ inside a context manager.

def f(x,y,z,a=1):
    return inspect.getouterframes(inspect.currentframe())[1]
with open("/tmp/tmp.txt", mode="a") as _:
    caller_fm = f(
        1+1,
        2,
        max(5, 2),
        a=2,
    )
caller_fm
FrameInfo(frame=<frame at 0x75b503589970, file '/tmp/ipykernel_2991/653877691.py', line 1, code <module>>, filename='/tmp/ipykernel_2991/653877691.py', lineno=2, function='<module>', code_context=['    caller_fm = f(\n'], index=0, positions=Positions(lineno=2, end_lineno=7, col_offset=16, end_col_offset=5))

From this, we can see that “code_context” holds the first line of the statement where we call “f”. Furthermore, note that “positions” has all the information we need to extract only our call to “f” from this block of code.
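For orientation, index 1 of the getouterframes list wraps the same frame object that f_back points to; the FrameInfo just adds the filename, context line, and positions on top. A quick sketch to check that, with callee and caller as made-up names:

```python
import inspect


def callee():
    current = inspect.currentframe()
    caller_info = inspect.getouterframes(current)[1]
    # FrameInfo.frame is the raw frame object; compare it with f_back.
    return caller_info.frame is current.f_back, caller_info.function


def caller():
    return callee()


same_frame, caller_name = caller()
print(same_frame, caller_name)
```

If only the frame itself is needed, f_back avoids building a FrameInfo (and looking up source context) for every frame on the stack.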

inspect.getsourcelines

Return a list of source lines and starting line number for an object. The argument may be a module, class, method, function, traceback, frame, or code object. The source code is returned as a list of the lines corresponding to the object and the line number indicates where in the original source file the first line of code was found. An OSError is raised if the source code cannot be retrieved. A TypeError is raised if the object is a built-in module, class, or function.

While we could read the source file directly using the “positions” information we have, it is more robust to use inspect.getsourcelines because the code may not live in a file on disk (e.g. a notebook cell whose source is only held in memory).

src, _ = inspect.getsourcelines(caller_fm.frame)
src
['with open("/tmp/tmp.txt", mode="a") as _:\n',
 '    caller_fm = f(\n',
 '        1+1,\n',
 '        2,\n',
 '        max(5, 2),\n',
 '        a=2,\n',
 '    )\n',
 'caller_fm\n']
caller_fm.positions.lineno, caller_fm.positions.end_lineno
(2, 7)
srcl = src[caller_fm.positions.lineno-1:caller_fm.positions.end_lineno]
srcl
['    caller_fm = f(\n',
 '        1+1,\n',
 '        2,\n',
 '        max(5, 2),\n',
 '        a=2,\n',
 '    )\n']

Great! So now we have the lines of source code we’re interested in. However, we also need to look at the column offsets of the first and last line to get only our call to “f”.

srcl[0][caller_fm.positions.col_offset]
'f'
srcl[-1][caller_fm.positions.end_col_offset]
'\n'
fn_src = "".join(
    [srcl[0][caller_fm.positions.col_offset:]] +
    srcl[1:-1] +
    [srcl[-1][:caller_fm.positions.end_col_offset]]
)
fn_src
'f(\n        1+1,\n        2,\n        max(5, 2),\n        a=2,\n    )'
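The slicing above can be wrapped in a small helper. This is a sketch rather than part of the library: extract_span and its first_lineno parameter are hypothetical names. It also handles two cases the inline version does not: a call that fits on one line, and non-ASCII source, since CPython documents the column values from co_positions as UTF-8 byte offsets rather than character indices.

```python
def extract_span(lines, lineno, end_lineno, col_offset, end_col_offset, first_lineno=1):
    """Cut the text between two (line, column) positions out of source lines.

    Line numbers are 1-indexed; columns are 0-indexed UTF-8 byte offsets.
    `first_lineno` is the line number of `lines[0]` (hypothetical parameter).
    """
    start = lineno - first_lineno
    stop = end_lineno - first_lineno + 1
    if start < 0 or stop > len(lines) or start >= stop:
        raise ValueError("positions fall outside the given lines")
    # Work on bytes so the column offsets line up even with non-ASCII text.
    selected = [line.encode("utf-8") for line in lines[start:stop]]
    if len(selected) == 1:
        span = selected[0][col_offset:end_col_offset]
    else:
        span = (
            selected[0][col_offset:]
            + b"".join(selected[1:-1])
            + selected[-1][:end_col_offset]
        )
    return span.decode("utf-8")


# The same lines and positions as in the walkthrough above.
src = [
    'with open("/tmp/tmp.txt", mode="a") as _:\n',
    "    caller_fm = f(\n",
    "        1+1,\n",
    "        2,\n",
    "        max(5, 2),\n",
    "        a=2,\n",
    "    )\n",
    "caller_fm\n",
]
fn_src = extract_span(src, lineno=2, end_lineno=7, col_offset=16, end_col_offset=5)
print(repr(fn_src))
```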

ast.parse and ast.unparse

Now that we have the specific segment of raw source code we’re interested in, we can parse it into an AST via the ast module, pick out the node for each passed-in argument, and unparse each node back into source code. Note that ast.unparse regenerates the code from the tree, so formatting is normalized (1+1 comes back as 1 + 1).

fn_src_args = [ast.unparse(arg) for arg in ast.parse(fn_src).body[0].value.args]
fn_src_args
['1 + 1', '2', 'max(5, 2)']
fn_src_kwargs = {kw.arg: ast.unparse(kw.value) for kw in ast.parse(fn_src).body[0].value.keywords}
fn_src_kwargs
{'a': '2'}
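Two edge cases are worth knowing about when reading a call node this way: a starred argument such as *xs shows up in args as a Starred node, and a ** unpacking shows up in keywords with arg set to None, so it would land under a None key in the dict comprehension above. A sketch that separates these out, with split_call_source as a made-up name:

```python
import ast


def split_call_source(call_src):
    """Split the source of a single call into positional, keyword, and ** parts."""
    # mode="eval" parses one expression, so .body is the expression node itself.
    call = ast.parse(call_src.strip(), mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError(f"not a call expression: {call_src!r}")
    args = [ast.unparse(arg) for arg in call.args]
    kwargs = {kw.arg: ast.unparse(kw.value) for kw in call.keywords if kw.arg is not None}
    # Keywords whose `arg` is None are `**mapping` unpackings.
    unpacked = [ast.unparse(kw.value) for kw in call.keywords if kw.arg is None]
    return args, kwargs, unpacked


plain = split_call_source("f(\n        1+1,\n        2,\n        max(5, 2),\n        a=2,\n    )")
starred = split_call_source("f(*xs, key=v, **opts)")
print(plain)
print(starred)
```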

Putting It All Together


get_call_args


def get_call_args(
    frames_up:int=1
)->tuple:

Get the source code expressions of the caller’s arguments.

Exported source
import ast
import inspect

def get_call_args(frames_up: int = 1) -> tuple[list[str], dict[str, str]]:
    "Get the source code expressions of the caller's arguments."
    fm_info = inspect.getouterframes(inspect.currentframe())[frames_up+1]
    src, _ = inspect.getsourcelines(fm_info.frame)
    srcl = src[fm_info.positions.lineno - 1:fm_info.positions.end_lineno]
    if len(srcl) == 0:
        raise RuntimeError(
            "`get_call_args` is designed to be called from within a function, " +
            "not at the top-level. If you want to call it at the top level, pass `frames_up=0`."
        )
    elif len(srcl) == 1:
        fn_src = srcl[0][fm_info.positions.col_offset:fm_info.positions.end_col_offset]
    else:
        fn_src = "".join(
            [srcl[0][fm_info.positions.col_offset:]] +
            srcl[1:-1] +
            [srcl[-1][:fm_info.positions.end_col_offset]]
        )
    call_node = ast.parse(fn_src).body[0].value
    args_src = [ast.unparse(arg) for arg in call_node.args]
    kwargs_src = {kw.arg: ast.unparse(kw.value) for kw in call_node.keywords}
    return args_src, kwargs_src

Because get_call_args() is expected to be used from within a function, it generally will not work properly by default if you call it from the top level.

try: get_call_args(); raise AssertionError("should have raised")
except RuntimeError as e: print(e)
`get_call_args` is designed to be called from within a function, not at the top-level. If you want to call it at the top level, pass `frames_up=0`.

However, you can have it work if you pass frames_up=0 (i.e. the current frame we’re calling get_call_args() from rather than the caller frame).

args, kwargs = get_call_args(frames_up=0)
assert args == [], f"Got {args}"
assert kwargs == {"frames_up": "0"}, f"Got {kwargs}"

When you’re in a function, it works as expected.

def f(a, b=1):
    return get_call_args()
args, kwargs = f(1+1,b=max(10, 20))
assert args == ["1 + 1"], f"Got {args}"
assert kwargs == {"b": "max(10, 20)"}, f"Got {kwargs}"

It also works with multi-line function calls.

args, kwargs = f(
    1,
    b=10,
    
    
)
assert args == ["1"], f"Got {args}"
assert kwargs == {"b": "10"}, f"Got {kwargs}"

If you use it with a function within a function, you’ll get the arguments for the inner function.

def nested_f(a, b=1):
    def inner_f(c, d, e=1):
        return get_call_args()
    return inner_f(5 * 5, 10 / 10)
args, kwargs = nested_f(1)
assert args == ["5 * 5", "10 / 10"], f"Got {args}"
assert kwargs == {}, f"Got {kwargs}"

However, you can still get the outer function arguments with the frames_up keyword argument.

def nested_f(a, b=1):
    def inner_f(c, d, e=1):
        return get_call_args(frames_up=2)
    return inner_f(5 * 5, 10 / 10)
args, kwargs = nested_f(1)
assert args == ["1"], f"Got {args}"
assert kwargs == {}, f"Got {kwargs}"