Returning generators from with statements
Recently, an interesting issue came up at work that involved a subtle interaction between context managers and generator functions. Here is some example code demonstrating the problem:
import contextlib

@contextlib.contextmanager
def resource():
    """Context manager for some resource"""
    print("Resource setup")
    yield
    print("Resource teardown")

def _load_values():
    """Load a list of values (requires resource to be held)"""
    for i in range(3):
        print("Generating value %d" % i)
        yield i

def load_values():
    """Load values while holding the required resource"""
    with resource():
        # Bug: this returns an unstarted generator, so the with block exits first
        return _load_values()
This is the output when run:
>>> for val in load_values(): pass
Resource setup
Resource teardown
Generating value 0
Generating value 1
Generating value 2
Whoops. The resource is destroyed before the values are actually generated. This is obviously a problem if the generator depends on the existence of the resource.
When you think about it, it's pretty clear what's going on. Calling _load_values() produces a generator object, whose code is only executed when values are requested. load_values() returns that generator, exiting the with statement and leading to the destruction of the resource. When the outer for loop (for val) comes around to iterating over the generator, the resource is long gone.
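To see the laziness in isolation, here is a minimal sketch. The numbers() function and the log list are made up for illustration and are not part of the code above. It records the order of events to show that calling a generator function runs none of its body until the first value is requested:

```python
log = []  # hypothetical helper: records when each step actually runs

def numbers():
    """Generator function whose body logs when it starts executing."""
    log.append("body started")
    yield 1
    yield 2

gen = numbers()                   # creates the generator object; the body has not run yet
log.append("generator created")
first = next(gen)                 # only now does the body run, up to the first yield
log.append("got %d" % first)

print(log)  # ['generator created', 'body started', 'got 1']
```

The same ordering is what bites load_values(): by the time anything asks the generator for a value, the with block has already finished.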
How do you solve this problem? In Python 3.3 and newer, you can use the yield from syntax to turn load_values() into a generator as well. The execution of load_values() is halted at the yield from point until the child generator is exhausted, at which point it is safe to dispose of the resource:
def load_values():
    """Load values while holding the required resource"""
    with resource():
        yield from _load_values()
In older Python versions, an explicit for loop over the child generator is required:
def load_values():
    """Load values while holding the required resource"""
    with resource():
        for val in _load_values():
            yield val
Still another method would be to turn the result of _load_values() into a list inside the with statement and return that instead. This incurs higher memory overhead since all values have to be held in memory at the same time, so it's only appropriate for relatively short lists.
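A rough sketch of the list approach follows. To keep it self-contained it redefines resource() and _load_values(), and instead of printing it appends to an events list, which is an assumption added here so the ordering can be inspected:

```python
import contextlib

events = []  # hypothetical log, used here in place of the print() calls

@contextlib.contextmanager
def resource():
    """Context manager for some resource"""
    events.append("setup")
    yield
    events.append("teardown")

def _load_values():
    """Load values (requires resource to be held)"""
    for i in range(3):
        events.append("generate %d" % i)
        yield i

def load_values():
    """Materialize all values while the resource is still held"""
    with resource():
        # list() drains the generator before the with block exits
        return list(_load_values())

values = load_values()
print(values)  # [0, 1, 2]
print(events)  # ['setup', 'generate 0', 'generate 1', 'generate 2', 'teardown']
```

The teardown now happens after the last value is generated, at the cost of holding every value in memory at once.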
To sum up, it's a bad idea to return generators from under with statements. While it's not terribly confusing what's going on, it's a wee bit subtle, and not many people think about this until they run into the issue. Hope this heads-up helps.