Writing Functions In Python
Crafting a docstring
You’ve decided to write the world’s greatest open-source natural language processing Python package. It will revolutionize working with free-form text, the way numpy did for arrays, pandas did for tabular data, and scikit-learn did for machine learning.

The first function you write is count_letter(). It takes a string and a single letter and returns the number of times the letter appears in the string. You want the users of your open-source package to be able to understand how this function works easily, so you will need to give it a docstring. Build up a Google Style docstring for this function by following these steps.
Instructions 1/4 – 4/4

- Copy the following string and add it as the docstring for the function: Count the number of times `letter` appears in `content`.
- Now add the arguments section, using the Google style for docstrings. Use str to indicate a string.
- Add a returns section that informs the user the return value is an int.
- Finally, add some information about the ValueError that gets raised when the arguments aren’t correct.
```python
def count_letter(content, letter):
    """Count the number of times `letter` appears in `content`.

    Args:
        content (str): The string to search.
        letter (str): The letter to search for.

    Returns:
        int

    # Add a section detailing what errors might be raised
    Raises:
        ValueError: If `letter` is not a one-character string.
    """
    if (not isinstance(letter, str)) or len(letter) != 1:
        raise ValueError('`letter` must be a single character string.')
    return len([char for char in content if char == letter])
```
While it does require a bit more typing, the information presented here will make it very easy for others to use this code in the future. Remember that even though computers execute it, code is actually written for humans to read (otherwise you’d just be writing the 1s and 0s that the computer operates on).
Retrieving docstrings
You and a group of friends are working on building an amazing new Python IDE (integrated development environment, like PyCharm, Spyder, Eclipse, Visual Studio, etc.). The team wants to add a feature that displays a tooltip with a function’s docstring whenever the user starts typing the function name. That way, the user doesn’t have to go elsewhere to look up the documentation for the function they are trying to use. You’ve been asked to complete the build_tooltip() function that retrieves a docstring from an arbitrary function.

You will be reusing the count_letter() function that you developed in the last exercise to show that we can properly extract its docstring.
Instructions 1/3

- Begin by getting the docstring for the function count_letter(). Use an attribute of the count_letter() function.
```python
# Get the "count_letter" docstring by using an attribute of the function
docstring = count_letter.__doc__

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))
```
```
<script.py> output:
    ############################
    Count the number of times `letter` appears in `content`.

        Args:
            content (str): The string to search.
            letter (str): The letter to search for.

        Returns:
            int

        Raises:
            ValueError: If `letter` is not a one-character string.

    ############################
```
- Now use a function from the inspect module to get a better-formatted version of count_letter()’s docstring.
```python
import inspect

# Inspect the count_letter() function to get its docstring
docstring = inspect.getdoc(count_letter)

border = '#' * 28
print('{}\n{}\n{}'.format(border, docstring, border))
```
```
<script.py> output:
    ############################
    Count the number of times `letter` appears in `content`.

    Args:
        content (str): The string to search.
        letter (str): The letter to search for.

    Returns:
        int

    Raises:
        ValueError: If `letter` is not a one-character string.
    ############################
```
- Now create a build_tooltip() function that can extract the docstring from any function that we pass to it.
```python
import inspect

def build_tooltip(function):
    """Create a tooltip for any function that shows the
    function's docstring.

    Args:
        function (callable): The function we want a tooltip for.

    Returns:
        str
    """
    # Get the docstring for the "function" argument by using inspect
    docstring = inspect.getdoc(function)
    border = '#' * 28
    return '{}\n{}\n{}'.format(border, docstring, border)

print(build_tooltip(count_letter))
print(build_tooltip(range))
print(build_tooltip(print))
```
```
<script.py> output:
    ############################
    Count the number of times `letter` appears in `content`.

    Args:
        content (str): The string to search.
        letter (str): The letter to search for.

    Returns:
        int

    Raises:
        ValueError: If `letter` is not a one-character string.
    ############################
    ############################
    range(stop) -> range object
    range(start, stop[, step]) -> range object

    Return an object that produces a sequence of integers from start (inclusive)
    to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
    start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
    These are exactly the valid indices for a list of 4 elements.
    When step is given, it specifies the increment (or decrement).
    ############################
    ############################
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
    ############################
```
This IDE is going to be an incredibly delightful experience for your users now! Notice how the count_letter.__doc__ version of the docstring had strange whitespace at the beginning of all but the first line. That’s because the docstring is indented to line up visually when reading the code. But when we want to print the docstring, removing those leading spaces with inspect.getdoc() will look much better.
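The difference is easy to see with a toy function (greet() below is just an illustration, not part of the exercise):

```python
import inspect

def greet():
    """Say hello.

    This second line is indented in the source file,
    so __doc__ keeps the leading spaces.
    """

raw = greet.__doc__
clean = inspect.getdoc(greet)
# __doc__ keeps the source indentation; getdoc() strips it off
print('    This second line' in raw)
print(clean.splitlines()[2].startswith('This second line'))
```

Both prints show True: the raw attribute still carries the four leading spaces from the source file, while inspect.getdoc() removes them.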
Docstrings to the rescue!
Some maniac has corrupted your installation of numpy! All of the functions still exist, but they’ve been given random names. You desperately need to call the numpy.histogram() function and you don’t have time to reinstall the package. Fortunately for you, the maniac didn’t think to alter the docstrings, and you know how to access them. numpy has a lot of functions in it, so we’ve narrowed it down to four possible functions that could be numpy.histogram() in disguise: numpy.leyud(), numpy.uqka(), numpy.fywdkxa() or numpy.jinzyxq().

Examine each of these functions’ docstrings in the IPython shell to determine which of them is actually numpy.histogram().
```
In [1]: inspect.getdoc(numpy.leyud)
Out[1]: 'Gives a new shape to an array without changing its data. ...'

In [2]: inspect.getdoc(numpy.uqka)
Out[2]: 'Returns the indices that would sort an array. ...'

In [3]: inspect.getdoc(numpy.fywdkxa)
Out[3]: 'Compute the histogram of a set of data. ...'

In [4]: inspect.getdoc(numpy.jinzyxq)
Out[4]: 'Return an array of zeros with the same shape and type as a given array. ...'
```
numpy.fywdkxa() is actually numpy.histogram() in disguise. If you’ve spent any time browsing numpy’s online documentation, you will notice that it is built directly from the docstrings. There are some wonderful tools like sphinx and pydoc that will automatically generate online documentation for you based on your docstrings.
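pydoc reads from the very same attribute; here is a minimal sketch, using the built-in len since our hypothetical package isn’t actually installed anywhere:

```python
import pydoc

# pydoc renders help text straight from an object's docstring,
# the same way help() does in the interactive shell
text = pydoc.render_doc(len, renderer=pydoc.plaintext)
print(text.splitlines()[0])
```

The first line of the rendered page names the documented object, and the rest is built from its docstring.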
Extract a function
While you were developing a model to predict the likelihood of a student graduating from college, you wrote this bit of code to get the z-scores of students’ yearly GPAs. Now you’re ready to turn it into a production-quality system, so you need to do something about the repetition. Writing a function to calculate the z-scores would improve this code.
```python
# Standardize the GPAs for each year
df['y1_z'] = (df.y1_gpa - df.y1_gpa.mean()) / df.y1_gpa.std()
df['y2_z'] = (df.y2_gpa - df.y2_gpa.mean()) / df.y2_gpa.std()
df['y3_z'] = (df.y3_gpa - df.y3_gpa.mean()) / df.y3_gpa.std()
df['y4_z'] = (df.y4_gpa - df.y4_gpa.mean()) / df.y4_gpa.std()
```
Note: df is a pandas DataFrame where each row is a student, with 4 columns of yearly student GPAs: y1_gpa, y2_gpa, y3_gpa, y4_gpa.
Instructions
- Finish the function so that it returns the z-scores of a column.
- Use the function to calculate the z-scores for each year (df['y1_z'], df['y2_z'], etc.) from the raw GPA scores (df.y1_gpa, df.y2_gpa, etc.).
```python
def standardize(column):
    """Standardize the values in a column.

    Args:
        column (pandas Series): The data to standardize.

    Returns:
        pandas Series: the values as z-scores
    """
    # Finish the function so that it returns the z-scores
    z_score = (column - column.mean()) / column.std()
    return z_score

# Use the standardize() function to calculate the z-scores
df['y1_z'] = standardize(df['y1_gpa'])
df['y2_z'] = standardize(df['y2_gpa'])
df['y3_z'] = standardize(df['y3_gpa'])
df['y4_z'] = standardize(df['y4_gpa'])
```
standardize() will probably be useful in other places in your code, and now it is easy to use, test, and update if you need to. It’s also easier to tell what the code is doing because of the docstring and the name of the function.
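The same z-score formula works outside pandas too; here is a quick self-contained check using the standard-library statistics module (the sample values are made up):

```python
from statistics import mean, stdev

def standardize(values):
    """Return the z-scores of a list of numbers."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Evenly spaced values give symmetric z-scores
print(standardize([3.0, 3.5, 4.0]))  # [-1.0, 0.0, 1.0]
```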
Split up a function
Another engineer on your team has written this function to calculate the mean and median of a sorted list. You want to show them how to split it into two simpler functions: mean() and median().
```python
def mean_and_median(values):
    """Get the mean and median of a sorted list of `values`

    Args:
        values (iterable of float): A list of numbers

    Returns:
        tuple (float, float): The mean and median
    """
    mean = sum(values) / len(values)
    midpoint = int(len(values) / 2)
    if len(values) % 2 == 0:
        median = (values[midpoint - 1] + values[midpoint]) / 2
    else:
        median = values[midpoint]
    return mean, median
```
Instructions 1/2

- Write the mean() function.
```python
def mean(values):
    """Get the mean of a sorted list of values

    Args:
        values (iterable of float): A list of numbers

    Returns:
        float
    """
    # Write the mean() function
    mean = sum(values) / len(values)
    return mean
```
- Write the median() function.
```python
def median(values):
    """Get the median of a sorted list of values

    Args:
        values (iterable of float): A list of numbers

    Returns:
        float
    """
    # Write the median() function
    midpoint = int(len(values) / 2)
    if len(values) % 2 == 0:
        median = (values[midpoint - 1] + values[midpoint]) / 2
    else:
        median = values[midpoint]
    return median
```
Each function does one thing and does it well. Using, testing, and maintaining these will be a breeze (although you’ll probably just use numpy.mean() and numpy.median() for this in real life).
Mutable or immutable?
The following function adds a mapping between a string and the lowercase version of that string to a dictionary. What do you expect the values of d and s to be after the function is called?
```python
def store_lower(_dict, _string):
    """Add a mapping between `_string` and a lowercased version of `_string` to `_dict`

    Args:
        _dict (dict): The dictionary to update.
        _string (str): The string to add.
    """
    orig_string = _string
    _string = _string.lower()
    _dict[orig_string] = _string

d = {}
s = 'Hello'
store_lower(d, s)
```
Answer: d = {'Hello': 'hello'}, s = 'Hello'.

Dictionaries are mutable objects in Python, so the function can change it directly in the _dict[orig_string] = _string statement. Strings, on the other hand, are immutable. When the function creates the lowercase version, it has to assign it to the _string variable. This disconnects what happens to _string from the external s variable.
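You can see the same mutable-versus-immutable behavior in a minimal sketch (the names here are illustrative):

```python
def update(a_list, a_string):
    a_list.append('x')         # mutates the caller's list in place
    a_string = a_string + 'x'  # rebinds the local name only

my_list, my_string = [], 'abc'
update(my_list, my_string)
print(my_list, my_string)  # ['x'] abc
```

The list the caller passed in was changed; the caller's string was not.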
Best practice for default arguments
One of your co-workers (who obviously didn’t take this course) has written this function for adding a column to a pandas DataFrame. Unfortunately, they used a mutable variable as a default argument value! Please show them a better way to do this so that they don’t get unexpected behavior.
```python
def add_column(values, df=pandas.DataFrame()):
    """Add a column of `values` to a DataFrame `df`.
    The column will be named "col_<n>" where "n" is
    the numerical index of the column.

    Args:
        values (iterable): The values of the new column
        df (DataFrame, optional): The DataFrame to update.
            If no DataFrame is passed, one is created by default.

    Returns:
        DataFrame
    """
    df['col_{}'.format(len(df.columns))] = values
    return df
```
Instructions
- Change the default value of df to an immutable value to follow best practices.
- Update the code of the function so that a new DataFrame is created if the caller didn’t pass one.
```python
# Use an immutable variable for the default argument
def better_add_column(values, df=None):
    """Add a column of `values` to a DataFrame `df`.
    The column will be named "col_<n>" where "n" is
    the numerical index of the column.

    Args:
        values (iterable): The values of the new column
        df (DataFrame, optional): The DataFrame to update.
            If no DataFrame is passed, one is created by default.

    Returns:
        DataFrame
    """
    # Update the function to create a default DataFrame
    if df is None:
        df = pandas.DataFrame()
    df['col_{}'.format(len(df.columns))] = values
    return df
```
When you need to set a mutable variable as a default argument, always use None and then set the value in the body of the function. This prevents unexpected behavior like adding multiple columns if you call the function more than once.
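A minimal sketch of why the None pattern matters (the function names are made up for illustration):

```python
def bad_append(value, target=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `target` shares the same list
    target.append(value)
    return target

def good_append(value, target=None):
    # A fresh list is created on every call that omits `target`
    if target is None:
        target = []
    target.append(value)
    return target

print(bad_append(1), bad_append(2))    # [1, 2] [1, 2] -- surprise!
print(good_append(1), good_append(2))  # [1] [2]
```

This is exactly the bug add_column() had: the shared default DataFrame kept accumulating columns across calls.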
The number of cats
You are working on a natural language processing project to determine what makes great writers so great. Your current hypothesis is that great writers talk about cats a lot. To prove it, you want to count the number of times the word “cat” appears in “Alice’s Adventures in Wonderland” by Lewis Carroll. You have already downloaded a text file, alice.txt, with the entire contents of this great book.
Instructions
- Use the open() context manager to open alice.txt and assign the file to the file variable.
```python
# Open "alice.txt" and assign the file to "file"
with open('alice.txt') as file:
    text = file.read()

n = 0
for word in text.split():
    if word.lower() in ['cat', 'cats']:
        n += 1

print('Lewis Carroll uses the word "cat" {} times'.format(n))
```
```
Lewis Carroll uses the word "cat" 24 times
```
By opening the file using the with open() statement, you were able to read in the text of the file. More importantly, when you were done reading the text, the context manager closed the file for you.
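You can verify the automatic close yourself with a small sketch (demo.txt is a throwaway file created just for this check):

```python
# Write to a throwaway file inside a with block
with open('demo.txt', 'w') as f:
    f.write('hello')

# Outside the block, the context manager has already closed the file
print(f.closed)  # True
```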
The speed of cats
You’re working on a new web service that processes Instagram feeds to identify which pictures contain cats (don’t ask why, it’s the internet). The code that processes the data is slower than you would like it to be, so you are working on tuning it up to run faster. Given an image, image, you have two functions that can process it: process_with_numpy(image) and process_with_pytorch(image).

Your colleague wrote a context manager, timer(), that will print out how long the code inside the context block takes to run. She is suggesting you use it to see which of the two options is faster. Time each function to determine which one to use in your web service.
Instructions
- Use the timer() context manager to time how long process_with_numpy(image) takes to run.
- Use the timer() context manager to time how long process_with_pytorch(image) takes to run.
```python
image = get_image_from_instagram()

# Time how long process_with_numpy(image) takes to run
with timer():
    print('Numpy version')
    process_with_numpy(image)

# Time how long process_with_pytorch(image) takes to run
with timer():
    print('Pytorch version')
    process_with_pytorch(image)
```
```
<script.py> output:
    Numpy version
    Processing..........done!
    Elapsed: 1.52 seconds
    Pytorch version
    Processing..........done!
    Elapsed: 0.33 seconds
```
Now that you know the pytorch version is faster, you can use it in your web service to ensure your users get the rapid response time they expect.

You may have noticed there was no as <variable name> at the end of the with statement in the timer() context manager. That is because timer() is a context manager that does not return a value, so the as <variable name> at the end of the with statement isn’t necessary. In the next lesson, you’ll learn how to write your own context managers like timer().
There are two ways to define a context manager: with a class that implements the __enter__() and __exit__() methods, or with a generator function. To define one with a function:

- Define a function.
- (Optional) Add any setup code your context needs.
- Use the yield keyword.
- (Optional) Add any teardown code your context needs.
- Add the @contextlib.contextmanager decorator.
What is the ‘yield’ keyword?

In Python, yield hands a value back to the caller much like return does, but with one crucial difference: instead of ending the function, yield pauses it. The function’s state is saved, and execution resumes right after the yield the next time a value is requested. This is exactly what lets a context manager run its teardown code after the with block finishes.
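The pausing behavior is easiest to see in a plain generator (a sketch, unrelated to the course’s exercises):

```python
def count_up_to(n):
    """Yield 0, 1, ..., n-1, pausing after each value."""
    i = 0
    while i < n:
        yield i  # hand a value back and pause right here
        i += 1   # execution resumes from this line on the next request

gen = count_up_to(3)
print(next(gen), next(gen), next(gen))  # 0 1 2
```

Each call to next() resumes the function where it left off, rather than starting it over.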
The timer() context manager
A colleague of yours is working on a web service that processes Instagram photos. Customers are complaining that the service takes too long to identify whether or not an image has a cat in it, so your colleague has come to you for help. You decide to write a context manager that they can use to time how long their functions take to run.
Instructions
- Add a decorator from the contextlib module to the timer() function that will make it act like a context manager.
- Send control from the timer() function to the context block.
```python
import contextlib
import time

# Add a decorator that will make timer() a context manager
@contextlib.contextmanager
def timer():
    """Time the execution of a context block.

    Yields:
        None
    """
    start = time.time()
    # Send control back to the context block
    yield
    end = time.time()
    print('Elapsed: {:.2f}s'.format(end - start))

with timer():
    print('This should take approximately 0.25 seconds')
    time.sleep(0.25)
```
```
<script.py> output:
    This should take approximately 0.25 seconds
    Elapsed: 0.25s
```
You’re managing context like a boss! And your colleague can now use your timer() context manager to figure out which of their functions is running too slow. Notice that the three elements of a context manager are all here: a function definition, a yield statement, and the @contextlib.contextmanager decorator. It’s also worth noticing that timer() is a context manager that does not return an explicit value, so yield is written by itself without specifying anything to return.
A read-only open() context manager
You have a bunch of data files for your next deep learning project that took you months to collect and clean. It would be terrible if you accidentally overwrote one of those files when trying to read it in for training, so you decide to create a read-only version of the open() context manager to use in your project.

The regular open() context manager:

- takes a filename and a mode ('r' for read, 'w' for write, or 'a' for append)
- opens the file for reading, writing, or appending
- yields control back to the context, along with a reference to the file
- waits for the context to finish
- and then closes the file before exiting
Your context manager will do the same thing, except it will only take the filename as an argument and it will only open the file for reading.
Instructions
- Yield control from open_read_only() to the context block, ensuring that the read_only_file object gets assigned to my_file.
- Use read_only_file’s .close() method to ensure that you don’t leave open files lying around.
```python
import contextlib

@contextlib.contextmanager
def open_read_only(filename):
    """Open a file in read-only mode.

    Args:
        filename (str): The location of the file to read

    Yields:
        file object
    """
    read_only_file = open(filename, mode='r')
    # Yield read_only_file so it can be assigned to my_file
    yield read_only_file
    # Close read_only_file
    read_only_file.close()

with open_read_only('my_file.txt') as my_file:
    print(my_file.read())
```
That is a radical read-only context manager! Now you can relax, knowing that every time you use with open_read_only() your files are safe from being accidentally overwritten. This function is an example of a context manager that does return a value, so we write yield read_only_file instead of just yield. Then the read_only_file object gets assigned to my_file in the with statement so that whoever is using your context can call its .read() method in the context block.
Nested contexts: one context manager can be used inside another, for example to copy the contents of one file to another:

```python
def copy(src, dst):
    """Copy the contents of one file to another.

    Args:
        src (str): File name of the file to be copied.
        dst (str): Where to write the new file.
    """
    # Open both files by nesting the two context managers
    with open(src) as f_src:
        with open(dst, 'w') as f_dst:
            # Read and write each line, one at a time
            for line in f_src:
                f_dst.write(line)
```
Scraping the NASDAQ
Training deep neural nets is expensive! You might as well invest in NVIDIA stock since you’re spending so much on GPUs. To pick the best time to invest, you are going to collect and analyze some data on how their stock is doing. The context manager stock('NVDA') will connect to the NASDAQ and return an object that you can use to get the latest price by calling its .price() method.

You want to connect to stock('NVDA') and record 10 timesteps of price data by writing it to the file NVDA.txt.
You will notice the use of an underscore when iterating over the for loop. If this is confusing to you, don’t worry. It could easily be replaced with i if we planned to do something with the value, like use it as an index. Naming the variable _ is simply a convention that signals the loop value won’t be used.
Instructions
- Use the stock('NVDA') context manager and assign the result to nvda.
- Open a file for writing with open('NVDA.txt', 'w') and assign the file object to f_out so you can record the price over time.
# Use the "stock('NVDA')" context manager
# and assign the result to the variable "nvda"
with stock('NVDA') as nvda:
    # Open "NVDA.txt" for writing as f_out
    with open('NVDA.txt', 'w') as f_out:
        for _ in range(10):
            value = nvda.price()
            print('Logging ${:.2f} for NVDA'.format(value))
            f_out.write('{:.2f}\n'.format(value))

<script.py> output:
    Opening stock ticker for NVDA
    Logging $139.50 for NVDA
    Logging $139.54 for NVDA
    Logging $139.61 for NVDA
    Logging $139.65 for NVDA
    Logging $139.72 for NVDA
    Logging $139.73 for NVDA
    Logging $139.80 for NVDA
    Logging $139.78 for NVDA
    Logging $139.73 for NVDA
    Logging $139.64 for NVDA
    Closing stock ticker
Super stock scraping! Now you can monitor the NVIDIA stock price and decide when is the exact right time to buy. Nesting context managers like this allows you to connect to the stock market (the CONNECT/DISCONNECT pattern) and write to a file (the OPEN/CLOSE pattern) at the same time.
Changing the working directory
You are using an open-source library that lets you train deep neural networks on your data. Unfortunately, during training, this library writes out checkpoint models (i.e., models that have been trained on a portion of the data) to the current working directory. You find that behavior frustrating because you don’t want to have to launch the script from the directory where the models will be saved.
You decide that one way to fix this is to write a context manager that changes the current working directory, lets you build your models, and then resets the working directory to its original location. You’ll want to be sure that any errors that occur during model training don’t prevent you from resetting the working directory to its original location.
Instructions
- Add a statement that lets you handle any errors that might occur inside the context.
- Add a statement that ensures os.chdir(current_dir) will be called, whether there was an error or not.
import contextlib
import os

# The @contextlib.contextmanager decorator (missing from the scraped
# version) is required for in_dir() to work in a with statement
@contextlib.contextmanager
def in_dir(directory):
    """Change current working directory to `directory`,
    allow the user to run some code, and change back.

    Args:
      directory (str): The path to a directory to work in.
    """
    current_dir = os.getcwd()
    os.chdir(directory)

    # Add code that lets you handle errors
    try:
        yield
    # Ensure the directory is reset,
    # whether there was an error or not
    finally:
        os.chdir(current_dir)
Excellent error handling! Now, even if someone writes buggy code when using your context manager, you will be sure to change the current working directory back to what it was when they called in_dir()
. This is important to do because your users might be relying on their working directory being what it was when they started the script. in_dir()
is a great example of the CHANGE/RESET pattern that indicates you should use a context manager.
Building a command line data app
You are building a command line tool that lets a user interactively explore a data set. We’ve defined four functions: mean(), std(), minimum(), and maximum() that users can call to analyze their data. Help finish this section of the code so that your users can call any of these functions by typing the function name at the input prompt.
Note: The function get_user_input() in this exercise is a mock version of asking the user to enter a command. It randomly returns one of the four function names. In real life, you would ask for input and wait until the user entered a value.
Instructions
- Add the functions std(), minimum(), and maximum() to the function_map dictionary, like we did with mean().
- The name of the function the user wants to call is stored in func_name. Use the dictionary of functions, function_map, to call the chosen function and pass data as an argument.
# Add the missing function references to the function map
function_map = {
    'mean': mean,
    'std': std,
    'minimum': minimum,
    'maximum': maximum
}

data = load_data()
print(data)

func_name = get_user_input()

# Call the chosen function and pass "data" as an argument
function_map[func_name](data)

<script.py> output:
       height  weight
    0    72.1     198
    1    69.8     204
    2    63.2     164
    3    64.7     238
    Type a command:
    > maximum
    height     72.1
    weight    238.0
    dtype: float64
By adding the functions to a dictionary, you can select the function based on the user’s input. You could have also used a series of if/else statements, but putting them in a dictionary like this is much easier to read and maintain.
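The dispatch-dictionary pattern above is easy to verify in isolation. In this minimal sketch, mean() and maximum() are simple stand-ins written here for illustration, not the course’s predefined versions:

```python
# Stand-in analysis functions (assumptions, not the course's versions)
def mean(data):
    return sum(data) / len(data)

def maximum(data):
    return max(data)

# Dispatch dictionary: map command names to function *references*
function_map = {'mean': mean, 'maximum': maximum}

def run_command(func_name, data):
    # One lookup replaces a chain of if/else statements,
    # and adding a new command is just one new dictionary entry
    return function_map[func_name](data)
```

Calling run_command('mean', [1, 2, 3]) looks up the mean function and applies it to the data, just as the exercise does with user input.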
Reviewing your co-worker’s code
Your co-worker is asking you to review some code that they’ve written and give them some tips on how to get it ready for production. You know that having a docstring is considered best practice for maintainable, reusable functions, so as a sanity check you decide to run this has_docstring()
function on all of their functions.
def has_docstring(func):
"""Check to see if the function
`func` has a docstring.
Args:
func (callable): A function.
Returns:
bool
"""
return func.__doc__ is not None
Instructions 1/3
- Call has_docstring() on your co-worker’s load_and_plot_data() function.
# Call has_docstring() on the load_and_plot_data() function
ok = has_docstring(load_and_plot_data)

if not ok:
    print("load_and_plot_data() doesn't have a docstring!")
else:
    print("load_and_plot_data() looks ok")

load_and_plot_data() looks ok
Check if the function as_2D()
has a docstring.
# Call has_docstring() on the as_2D() function
ok = has_docstring(as_2D)

if not ok:
    print("as_2D() doesn't have a docstring!")
else:
    print("as_2D() looks ok")

<script.py> output:
    as_2D() looks ok
Check if the function log_product()
has a docstring.
# Call has_docstring() on the log_product() function
ok = has_docstring(log_product)

if not ok:
    print("log_product() doesn't have a docstring!")
else:
    print("log_product() looks ok")

<script.py> output:
    log_product() doesn't have a docstring!
You have discovered that your co-worker forgot to write a docstring for log_product()
. You have learned enough about best practices to tell them how to fix it.
To pass a function as an argument to another function, you had to determine which one you were calling and which one you were referencing. Keeping those straight will be important as we dig deeper into this chapter. From the function names can you think of any other advice you might give your co-worker about their functions?
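The difference between calling and referencing comes down to the parentheses. A minimal sketch (shout() is a hypothetical function made up for this illustration):

```python
def shout():
    """Return an enthusiastic greeting."""
    return 'HELLO!'

# Referencing: no parentheses, so you get the function object itself
reference = shout

# Calling: parentheses execute the function and produce its return value
result = shout()

print(reference is shout)   # both names point at the same function object
print(result)               # the string the call produced
```

Passing shout (a reference) to has_docstring() lets it inspect the function; passing shout() would instead pass along the returned string.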
Returning functions for a math game
You are building an educational math game where the player enters a math term, and your program returns a function that matches that term. For instance, if the user types “add”, your program returns a function that adds two numbers. So far you’ve only implemented the “add” function. Now you want to include a “subtract” function.
Instructions
- Define the subtract() function. It should take two arguments and return the first argument minus the second argument.

def create_math_function(func_name):
    if func_name == 'add':
        def add(a, b):
            return a + b
        return add
    elif func_name == 'subtract':
        # Define the subtract() function
        def subtract(a, b):
            return a - b
        return subtract
    else:
        print("I don't know that one")

add = create_math_function('add')
print('5 + 2 = {}'.format(add(5, 2)))

subtract = create_math_function('subtract')
print('5 - 2 = {}'.format(subtract(5, 2)))

<script.py> output:
    5 + 2 = 7
    5 - 2 = 3
Now that you’ve implemented the subtract() function, you can keep going to include multiply() and divide(). I predict this game is going to be even bigger than Fortnite!
Notice how we assign the return value from create_math_function() to the add and subtract variables in the script. Since create_math_function() returns a function, we can then call those variables as functions.
Understanding scope
What four values does this script print?
x = 50
def one():
x = 10
def two():
global x
x = 30
def three():
x = 100
print(x)
for func in [one, two, three]:
func()
print(x)
Possible Answers
- 50, 30, 100, 50
- 10, 30, 30, 30
- 50, 30, 100, 30
- 10, 30, 100, 50
- 50, 50, 50, 50
one() doesn’t change the global x, so the first print() statement prints 50.
two() does change the global x, so the second print() statement prints 30.
The print() statement inside the function three() is referencing the x value that is local to three(), so it prints 100.
But three() does not change the global x value, so the last print() statement prints 30 again.
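The same scoping rules can be checked with a minimal sketch, stripped down from the quiz above:

```python
x = 50

def set_local():
    x = 10   # assignment creates a *local* x; the global x is untouched

def set_global():
    global x  # opt in to rebinding the module-level x
    x = 30

set_local()
print(x)   # still 50: the local assignment inside set_local() disappeared
set_global()
print(x)   # now 30: the global keyword let set_global() rebind it
```

Without the global keyword, an assignment inside a function always creates a new local name, no matter what exists outside.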
Modifying variables outside local scope
Sometimes your functions will need to modify a variable that is outside of the local scope of that function. While it’s generally not best practice to do so, it’s still good to know how in case you need to do it. Update these functions so they can modify variables that would usually be outside of their scope.
Instructions 1/3
- Add a keyword that lets us update call_count from inside the function.

call_count = 0

def my_function():
    # Use a keyword that lets us update call_count
    global call_count
    call_count += 1
    print("You've called my_function() {} times!".format(
        call_count
    ))

for _ in range(20):
    my_function()

<script.py> output:
    You've called my_function() 1 times!
    You've called my_function() 2 times!
    You've called my_function() 3 times!
    You've called my_function() 4 times!
    You've called my_function() 5 times!
    You've called my_function() 6 times!
    You've called my_function() 7 times!
    You've called my_function() 8 times!
    You've called my_function() 9 times!
    You've called my_function() 10 times!
    You've called my_function() 11 times!
    You've called my_function() 12 times!
    You've called my_function() 13 times!
    You've called my_function() 14 times!
    You've called my_function() 15 times!
    You've called my_function() 16 times!
    You've called my_function() 17 times!
    You've called my_function() 18 times!
    You've called my_function() 19 times!
    You've called my_function() 20 times!
Add a keyword that lets us modify file_contents from inside save_contents(). From the perspective of the code in save_contents(), variables defined in read_files() are not in the local scope or the global scope.

def read_files():
    file_contents = None

    def save_contents(filename):
        # Add a keyword that lets us modify file_contents
        nonlocal file_contents
        if file_contents is None:
            file_contents = []
        with open(filename) as fin:
            file_contents.append(fin.read())

    for filename in ['1984.txt', 'MobyDick.txt', 'CatsEye.txt']:
        save_contents(filename)

    return file_contents

print('\n'.join(read_files()))

<script.py> output:
    It was a bright day in April, and the clocks were striking thirteen.
    Call me Ishmael.
    Time is not a line but a dimension, like the dimensions of space.
Add a keyword to done in check_is_done() so that wait_until_done() eventually stops looping.

import random

def wait_until_done():
    def check_is_done():
        # Add a keyword so that wait_until_done()
        # doesn't run forever
        global done
        if random.random() < 0.1:
            done = True

    while not done:
        check_is_done()

done = False
wait_until_done()

print('Work done? {}'.format(done))

Work done? True
By adding global done in check_is_done(), you ensure that the done being referenced is the one that was set to False before wait_until_done() was called. Without this keyword, wait_until_done() would loop forever because the done = True in check_is_done() would only be changing a variable that is local to check_is_done(). Understanding what scope your variables are in will help you debug tricky situations like this one.
Checking for closure
You’re teaching your niece how to program in Python, and she is working on returning nested functions. She thinks she has written the code correctly, but she is worried that the returned function won’t have the necessary information when called. Show her that all of the nonlocal variables she needs are in the new function’s closure.
Instructions 1/3
- Use an attribute of the my_func() function to show that it has a closure that is not None.

def return_a_func(arg1, arg2):
    def new_func():
        print('arg1 was {}'.format(arg1))
        print('arg2 was {}'.format(arg2))
    return new_func

my_func = return_a_func(2, 17)

# Show that my_func()'s closure is not None
print(my_func.__closure__ is not None)

True
Show that there are two variables in the closure.

def return_a_func(arg1, arg2):
    def new_func():
        print('arg1 was {}'.format(arg1))
        print('arg2 was {}'.format(arg2))
    return new_func

my_func = return_a_func(2, 17)

print(my_func.__closure__ is not None)

# Show that there are two variables in the closure
print(len(my_func.__closure__) == 2)

True
True
- Get the values of the variables in the closure so you can show that they are equal to [2, 17], the arguments passed to return_a_func().

def return_a_func(arg1, arg2):
    def new_func():
        print('arg1 was {}'.format(arg1))
        print('arg2 was {}'.format(arg2))
    return new_func

my_func = return_a_func(2, 17)

print(my_func.__closure__ is not None)
print(len(my_func.__closure__) == 2)

# Get the values of the variables in the closure
closure_values = [
    my_func.__closure__[i].cell_contents for i in range(2)
]
print(closure_values == [2, 17])

<script.py> output:
    True
    True
    True
Your niece is relieved to see that the values she passed to return_a_func() are still accessible to the new function she returned, even after the program has left the scope of return_a_func().
Values get added to a function’s closure in the order they are defined in the enclosing function (in this case, arg1 and then arg2), but only if they are used in the nested function. That is, if return_a_func() took a third argument (e.g., arg3) that wasn’t used by new_func(), then it would not be captured in new_func()’s closure.
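You can check that unused arguments are left out of the closure with a small variation on the exercise (the third argument here is an illustration, not part of the exercise code):

```python
def return_a_func(arg1, arg2, arg3):
    def new_func():
        # arg3 is never referenced here, so it will not be captured
        print('arg1 was {}'.format(arg1))
        print('arg2 was {}'.format(arg2))
    return new_func

my_func = return_a_func(2, 17, 99)

# Only the two variables that new_func() actually uses are in the closure
print(len(my_func.__closure__))  # 2, not 3
```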
Closures keep your values safe
You are still helping your niece understand closures. You have written the function get_new_func() that returns a nested function. The nested function call_func() calls whatever function was passed to get_new_func(). You’ve also written my_special_function(), which simply prints a message stating that you are executing my_special_function().
You want to show your niece that no matter what you do to my_special_function() after passing it to get_new_func(), the new function still mimics the behavior of the original my_special_function() because it is in the new function’s closure.
Instructions 1/3
- Show that you still get the original message even if you redefine my_special_function() to only print “hello”.

def my_special_function():
    print('You are running my_special_function()')

def get_new_func(func):
    def call_func():
        func()
    return call_func

new_func = get_new_func(my_special_function)

# Redefine my_special_function() to just print "hello"
def my_special_function():
    print("hello")

new_func()

<script.py> output:
    You are running my_special_function()
Show that even if you delete my_special_function(), you can still call new_func() without any problems.

def my_special_function():
    print('You are running my_special_function()')

def get_new_func(func):
    def call_func():
        func()
    return call_func

new_func = get_new_func(my_special_function)

# Delete my_special_function()
del(my_special_function)

new_func()

You are running my_special_function()
Show that you still get the original message even if you overwrite my_special_function() with the new function.

def my_special_function():
    print('You are running my_special_function()')

def get_new_func(func):
    def call_func():
        func()
    return call_func

# Overwrite `my_special_function` with the new function
my_special_function = get_new_func(my_special_function)

my_special_function()

You are running my_special_function()
Your niece feels like she understands closures now. She has seen that you can modify, delete, or overwrite the values needed by the nested function, but the nested function can still access those values because they are stored safely in the function’s closure. She even realized that you could run into memory issues if you wound up adding a very large array or object to the closure, and has resolved to keep her eye out for that sort of problem.
Using decorator syntax
You have written a decorator called print_args
that prints out all of the arguments and their values any time a function that it is decorating gets called.
Instructions 1/2
- Decorate my_function() with the print_args() decorator by redefining the my_function variable.

def my_function(a, b, c):
    print(a + b + c)

# Decorate my_function() with the print_args() decorator
my_function = print_args(my_function)

my_function(1, 2, 3)

<script.py> output:
    my_function was called with a=1, b=2, c=3
    6
Decorate my_function() with the print_args() decorator using decorator syntax.
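The solution block for this step did not survive in this page; based on the surrounding exercise it presumably looks like the sketch below. The print_args() stand-in here is an assumption written to match the printed message (the exercise provides its own predefined version):

```python
# Stand-in for the predefined print_args() decorator (assumption:
# it prints each argument name and value before calling the function)
def print_args(func):
    def wrapper(a, b, c):
        print('my_function was called with a={}, b={}, c={}'.format(a, b, c))
        return func(a, b, c)
    return wrapper

# Decorate my_function() with print_args() using decorator syntax
@print_args
def my_function(a, b, c):
    print(a + b + c)

my_function(1, 2, 3)
```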
my_function was called with a=1, b=2, c=3 6
Note that @print_args before the definition of my_function is exactly equivalent to my_function = print_args(my_function). Remember, even though decorators are functions themselves, when you use decorator syntax with the @ symbol you do not include the parentheses after the decorator name.
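The equivalence is easy to verify with a trivial decorator (identity() is a made-up do-nothing decorator used only for this demonstration):

```python
def identity(func):
    # A do-nothing decorator, used only to demonstrate the equivalence
    return func

# Decorator syntax...
@identity
def with_syntax():
    return 42

# ...is exactly what this manual reassignment does
def without_syntax():
    return 42
without_syntax = identity(without_syntax)

print(with_syntax() == without_syntax())  # True
```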
Defining a decorator
Your buddy has been working on a decorator that prints a “before” message before the decorated function is called and prints an “after” message after the decorated function is called. They are having trouble remembering how wrapping the decorated function is supposed to work. Help them out by finishing their print_before_and_after()
decorator.
Instructions
- Call the function being decorated and pass it the positional arguments *args.
- Return the new decorated function.

def print_before_and_after(func):
    def wrapper(*args):
        print('Before {}'.format(func.__name__))
        # Call the function being decorated with *args
        func(*args)
        print('After {}'.format(func.__name__))
    # Return the nested function
    return wrapper

@print_before_and_after
def multiply(a, b):
    print(a * b)

multiply(5, 10)

Before multiply
50
After multiply
The decorator print_before_and_after() defines a nested function wrapper() that calls whatever function gets passed to print_before_and_after(). wrapper() adds a little something else to the function call by printing one message before the decorated function is called and another right afterwards. Since print_before_and_after() returns the new wrapper() function, we can use it as a decorator to decorate the multiply() function.
Print the return type
You are debugging a package that you’ve been working on with your friends. Something weird is happening with the data being returned from one of your functions, but you’re not even sure which function is causing the trouble. You know that sometimes bugs can sneak into your code when you are expecting a function to return one thing, and it returns something different. For instance, if you expect a function to return a numpy array, but it returns a list, you can get unexpected behavior. To ensure this is not what is causing the trouble, you decide to write a decorator, print_return_type()
, that will print out the type of the variable that gets returned from every call of any function it is decorating.
Instructions
- Create a nested function, wrapper(), that will become the new decorated function.
- Call the function being decorated.
- Return the new decorated function.

def print_return_type(func):
    # Define wrapper(), the decorated function
    def wrapper(*args, **kwargs):
        # Call the function being decorated
        result = func(*args, **kwargs)
        print('{}() returned type {}'.format(
            func.__name__, type(result)
        ))
        return result
    # Return the decorated function
    return wrapper

@print_return_type
def foo(value):
    return value

print(foo(42))
print(foo([1, 2, 3]))
print(foo({'a': 42}))

<script.py> output:
    foo() returned type <class 'int'>
    42
    foo() returned type <class 'list'>
    [1, 2, 3]
    foo() returned type <class 'dict'>
    {'a': 42}
Your new decorator helps you examine the results of your functions at runtime. Now you can apply this decorator to every function in the package you are developing and run your scripts. Being able to examine the types of your return values will help you understand what is happening and will hopefully help you find the bug.
Counter
You’re working on a new web app, and you are curious about how many times each of the functions in it gets called. So you decide to write a decorator that adds a counter to each function that you decorate. You could use this information in the future to determine whether there are sections of code that you could remove because they are no longer being used by the app.
Instructions
- Call the function being decorated and return the result.
- Return the new decorated function.
- Decorate foo() with the counter() decorator.
def counter(func):
    def wrapper(*args, **kwargs):
        wrapper.count += 1
        # Call the function being decorated and return the result
        return func(*args, **kwargs)
    wrapper.count = 0
    # Return the new decorated function
    return wrapper

# Decorate foo() with the counter() decorator
@counter
def foo():
    print('calling foo()')

foo()
foo()

print('foo() was called {} times.'.format(foo.count))

<script.py> output:
    calling foo()
    calling foo()
    foo() was called 2 times.
Now you can go decorate a bunch of functions with the counter() decorator, let your program run for a while, and then print out how many times each function was called.
It seems a little magical that you can reference the wrapper() function from inside the definition of wrapper(), as we do in the line wrapper.count += 1. That’s just one of the many neat things about functions in Python: any function can do this, not just decorators.
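The reason this works is that the name is looked up when the function is called, not when it is defined. A minimal sketch (greet() is a hypothetical function for illustration):

```python
def greet():
    # The name greet is resolved at *call* time, so the body can read
    # and update attributes on its own function object
    greet.calls = getattr(greet, 'calls', 0) + 1
    return 'hello'

greet()
greet()
print(greet.calls)  # 2
```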
Preserving docstrings when decorating functions
Your friend has come to you with a problem. They’ve written some nifty decorators and added them to the functions in the open-source library they’ve been working on. However, they were running some tests and discovered that all of the docstrings have mysteriously disappeared from their decorated functions. Show your friend how to preserve docstrings and other metadata when writing decorators.
Instructions 1/4
- Decorate print_sum() with the add_hello() decorator to replicate the issue that your friend saw: the docstring disappears.

def add_hello(func):
    def wrapper(*args, **kwargs):
        print('Hello')
        return func(*args, **kwargs)
    return wrapper

# Decorate print_sum() with the add_hello() decorator
@add_hello
def print_sum(a, b):
    """Adds two numbers and prints the sum"""
    print(a + b)

print_sum(10, 20)
print_sum_docstring = print_sum.__doc__
print(print_sum_docstring)

<script.py> output:
    Hello
    30
    None
- 2/4 To show your friend that they are printing the wrapper() function’s docstring, not the print_sum() docstring, add the following docstring to wrapper():

"""Print 'hello' and then call the decorated function."""

def add_hello(func):
    # Add a docstring to wrapper
    def wrapper(*args, **kwargs):
        """Print 'hello' and then call the decorated function."""
        print('Hello')
        return func(*args, **kwargs)
    return wrapper

@add_hello
def print_sum(a, b):
    """Adds two numbers and prints the sum"""
    print(a + b)

print_sum(10, 20)
print_sum_docstring = print_sum.__doc__
print(print_sum_docstring)

<script.py> output:
    Hello
    30
    Print 'hello' and then call the decorated function.
- 3/4 Import a function that will allow you to add the metadata from print_sum() to the decorated version of print_sum().

# Import the function you need to fix the problem
from functools import wraps

def add_hello(func):
    def wrapper(*args, **kwargs):
        """Print 'hello' and then call the decorated function."""
        print('Hello')
        return func(*args, **kwargs)
    return wrapper

@add_hello
def print_sum(a, b):
    """Adds two numbers and prints the sum"""
    print(a + b)

print_sum(10, 20)
print_sum_docstring = print_sum.__doc__
print(print_sum_docstring)

<script.py> output:
    Hello
    30
    Print 'hello' and then call the decorated function.
- 4/4 Finally, decorate wrapper() so that the metadata from func() is preserved in the new decorated function.

from functools import wraps

def add_hello(func):
    # Decorate wrapper() so that it keeps func()'s metadata
    @wraps(func)
    def wrapper(*args, **kwargs):
        """Print 'hello' and then call the decorated function."""
        print('Hello')
        return func(*args, **kwargs)
    return wrapper

@add_hello
def print_sum(a, b):
    """Adds two numbers and prints the sum"""
    print(a + b)

print_sum(10, 20)
print_sum_docstring = print_sum.__doc__
print(print_sum_docstring)

<script.py> output:
    Hello
    30
    Adds two numbers and prints the sum
Your friend was concerned that they couldn’t print the docstrings of their functions. They now realize that the strange behavior they were seeing was caused by the fact that they were accidentally printing the wrapper()
docstring instead of the docstring of the original function. After adding @wraps(func)
to all of their decorators, they see that the docstrings are back where they expect them to be.
Measuring decorator overhead
Your boss wrote a decorator called check_everything() that they think is amazing, and they are insisting you use it on your function. However, you’ve noticed that when you use it to decorate your functions, it makes them run much slower. You need to convince your boss that the decorator is adding too much processing time to your function. To do this, you are going to measure how long the decorated function takes to run and compare it to how long the undecorated function would have taken to run. This is the decorator in question:
def check_everything(func):
@wraps(func)
def wrapper(*args, **kwargs):
check_inputs(*args, **kwargs)
result = func(*args, **kwargs)
check_outputs(result)
return result
return wrapper
Instructions
- Call the original function instead of the decorated version by using an attribute of the function that the wraps() statement in your boss’s decorator added to the decorated function.
import time

t_start = time.time()
duplicated_list = duplicate(list(range(50)))
t_end = time.time()
decorated_time = t_end - t_start

t_start = time.time()
# Call the original function instead of the decorated one
duplicated_list = duplicate.__wrapped__(list(range(50)))
t_end = time.time()
undecorated_time = t_end - t_start

print('Decorated time: {:.5f}s'.format(decorated_time))
print('Undecorated time: {:.5f}s'.format(undecorated_time))

<script.py> output:
    Finished checking inputs
    Finished checking outputs
    Decorated time: 1.51256s
    Undecorated time: 0.00006s
Your function ran roughly 25,000 times faster without your boss’s decorator (1.51256 s versus 0.00006 s). At least they were smart enough to add @wraps(func) to the nested wrapper() function so that you were able to access the original function through its __wrapped__ attribute. You should show them the results of this test. Be sure to ask for a raise while you’re at it!
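The __wrapped__ attribute that functools.wraps() attaches can be demonstrated with a self-contained sketch. The decorator below is a simplified stand-in for the boss’s check_everything(): the expensive check_inputs()/check_outputs() calls are replaced by a sleep, which is an assumption made so the sketch runs on its own:

```python
from functools import wraps
import time

def check_everything(func):
    # Stand-in: a sleep simulates the cost of check_inputs()/check_outputs()
    @wraps(func)
    def wrapper(*args, **kwargs):
        time.sleep(0.01)
        return func(*args, **kwargs)
    return wrapper

@check_everything
def duplicate(lst):
    return lst * 2

# @wraps() stores the undecorated function on the wrapper as __wrapped__,
# so you can bypass the decorator's overhead entirely
print(duplicate.__wrapped__([1, 2]))  # [1, 2, 1, 2]
```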
run_n_times()
In the video exercise, I showed you an example of a decorator that takes an argument: run_n_times(). The code for that decorator is repeated below to remind you how it works. Practice different ways of applying the decorator to the function print_sum(). Then I’ll show you a funny prank you can play on your co-workers.
def run_n_times(n):
"""Define and return a decorator"""
def decorator(func):
def wrapper(*args, **kwargs):
for i in range(n):
func(*args, **kwargs)
return wrapper
return decorator
Instructions 1/3
- Add the run_n_times() decorator to print_sum() using decorator syntax so that print_sum() runs 10 times.

# Make print_sum() run 10 times with the run_n_times() decorator
@run_n_times(10)
def print_sum(a, b):
    print(a + b)

print_sum(15, 20)

<script.py> output:
    35
    35
    35
    35
    35
    35
    35
    35
    35
    35
- 2/3 Use run_n_times() to create a decorator run_five_times() that will run any function five times.

# Use run_n_times() to create the run_five_times() decorator
run_five_times = run_n_times(5)

@run_five_times
def print_sum(a, b):
    print(a + b)

print_sum(4, 100)

<script.py> output:
    104
    104
    104
    104
    104
- 3/3 Here’s the prank: use run_n_times() to modify the built-in print() function so that it always prints 20 times!

# Modify the print() function to always run 20 times
print = run_n_times(20)(print)

print('What is happening?!?!')

<script.py> output:
    What is happening?!?! (printed 20 times)
Good job!
You’ve become an expert at using decorators. Notice how, when you use decorator syntax for a decorator that takes arguments, you need to call the decorator by adding parentheses, but you don’t add parentheses for decorators that don’t take arguments.
Warning: overwriting commonly used functions is probably not a great idea, so think twice before using these powers for evil.
HTML Generator
You are writing a script that generates HTML for a webpage on the fly. So far, you have written two decorators that will add bold or italics tags to any function that returns a string. You notice, however, that these two decorators look very similar. Instead of writing a bunch of other similar looking decorators, you want to create one decorator, html()
, that can take any pair of opening and closing tags.
def bold(func):
@wraps(func)
def wrapper(*args, **kwargs):
msg = func(*args, **kwargs)
return '<b>{}</b>'.format(msg)
return wrapper
def italics(func):
@wraps(func)
def wrapper(*args, **kwargs):
msg = func(*args, **kwargs)
return '<i>{}</i>'.format(msg)
return wrapper
Instructions 1/4
- Return the decorator and the decorated function from the correct places in the new html() decorator.

def html(open_tag, close_tag):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            msg = func(*args, **kwargs)
            return '{}{}{}'.format(open_tag, msg, close_tag)
        # Return the decorated function
        return wrapper
    # Return the decorator
    return decorator
- 2/4 Use the html() decorator to wrap the return value of hello() in the strings <b> and </b> (the HTML tags that mean “bold”).

# Make hello() return bolded text
@html('<b>', '</b>')
def hello(name):
    return 'Hello {}!'.format(name)

print(hello('Alice'))

<script.py> output:
    <b>Hello Alice!</b>
- 3/4 Use html() to wrap the return value of goodbye() in the strings <i> and </i> (the HTML tags that mean “italics”).

# Make goodbye() return italicized text
@html('<i>', '</i>')
def goodbye(name):
    return 'Goodbye {}.'.format(name)

print(goodbye('Alice'))

<script.py> output:
    <i>Goodbye Alice.</i>
- 4/4 Use html() to wrap hello_goodbye() in a DIV by adding the strings <div> and </div> around its return value.

# Wrap the result of hello_goodbye() in <div> and </div>
@html('<div>', '</div>')
def hello_goodbye(name):
    return '\n{}\n{}\n'.format(hello(name), goodbye(name))

print(hello_goodbye('Alice'))

<script.py> output:
    <div>
    <b>Hello Alice!</b>
    <i>Goodbye Alice.</i>
    </div>
With the new html()
decorator you can focus on writing simple functions that return the information you want to display on the webpage and let the decorator take care of wrapping them in the appropriate HTML tags.
Tag your functions
Tagging something means that you have given that thing one or more strings that act as labels. For instance, we often tag emails or photos so that we can search for them later. You’ve decided to write a decorator that will let you tag your functions with an arbitrary list of tags. You could use these tags for many things:
- Adding information about who has worked on the function, so a user can look up who to ask if they run into trouble using it.
- Labeling functions as “experimental” so that users know that the inputs and outputs might change in the future.
- Marking any functions that you plan to remove in a future version of the code.
- Etc.
Instructions
- Define a new decorator, named
decorator()
, to return. - Ensure the decorated function keeps its metadata.
- Call the function being decorated and return the result.
- Return the new decorator.
def tag(*tags): # Define a new decorator, named "decorator", to return def decorator(func): # Ensure the decorated function keeps its metadata @wraps(func) def wrapper(*args, **kwargs): # Call the function being decorated and return the result return func(*args, **kwargs) wrapper.tags = tags return wrapper # Return the new decorator return decorator @tag('test', 'this is a tag') def foo(): pass print(foo.tags)
<script.py> output:
    ('test', 'this is a tag')
With this new decorator, you can do some really interesting things. For instance, you could tag a bunch of image transforming functions, and then write code that searches for all of the functions that transform images and apply them, one after the other, on a given input image. What other neat uses can you come up with for this decorator?
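One of those uses, searching a namespace for every function carrying a given tag, can be sketched with a small helper. The find_tagged() function below is hypothetical (not part of the exercise), and it assumes the tag() decorator defined above:

```python
from functools import wraps

def tag(*tags):
    """The tag() decorator from the exercise above."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        wrapper.tags = tags
        return wrapper
    return decorator

@tag('image')
def sharpen(img):
    return img

@tag('image')
def blur(img):
    return img

@tag('text')
def tokenize(text):
    return text.split()

def find_tagged(namespace, tag_name):
    """Hypothetical helper: return every callable in a namespace whose
    .tags attribute contains tag_name."""
    return [f for f in namespace.values()
            if callable(f) and tag_name in getattr(f, 'tags', ())]

image_funcs = find_tagged(globals(), 'image')
print(sorted(f.__name__ for f in image_funcs))  # ['blur', 'sharpen']
```

With a helper like this, applying every image-transforming function to an input image is just a loop over find_tagged(globals(), 'image').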
Check the return type
Python’s flexibility around data types is usually cited as one of the benefits of the language. It can sometimes cause problems though if incorrect data types go unnoticed. You’ve decided that in order to ensure your code is doing exactly what you want it to do, you will explicitly check the return types in all of your functions and make sure they’re returning what you expect. To do that, you are going to create a decorator that checks if the return type of the decorated function is correct.
Note: assert is a keyword that you can use to test whether something is true. If you write assert condition and condition is True, the statement does nothing. If condition is False, it raises an error. The type of error that it raises is called an AssertionError.
Instructions 1/2
- Start by completing the returns_dict() decorator so that it raises an AssertionError if the return type of the decorated function is not a dictionary.
def returns_dict(func):
    # Complete the returns_dict() decorator
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        assert type(result) == dict
        return result
    return wrapper

@returns_dict
def foo(value):
    return value

try:
    print(foo([1, 2, 3]))
except AssertionError:
    print('foo() did not return a dict!')
<script.py> output:
    foo() did not return a dict!
- 2/2 Now complete the returns() decorator, which takes the expected return type as an argument.
def returns(return_type):
    # Complete the returns() decorator
    def decorator(func):
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            assert type(result) == return_type
            return result
        return wrapper
    return decorator

@returns(dict)
def foo(value):
    return value

try:
    print(foo([1, 2, 3]))
except AssertionError:
    print('foo() did not return a dict!')
<script.py> output:
    foo() did not return a dict!
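One design note: comparing with type(result) == return_type rejects subclasses of the expected type. If you want subclasses to pass too, isinstance() is the usual choice. The variant below is a sketch of that alternative, not the exercise's solution:

```python
from collections import OrderedDict
from functools import wraps

def returns(return_type):
    """Variant of the returns() decorator that uses isinstance(), so
    subclasses of the expected type (e.g. OrderedDict for dict) pass."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            assert isinstance(result, return_type), \
                '{}() should return {}'.format(func.__name__,
                                               return_type.__name__)
            return result
        return wrapper
    return decorator

@returns(dict)
def make_config():
    # OrderedDict is a subclass of dict, so isinstance() accepts it
    return OrderedDict(debug=True)

print(type(make_config()).__name__)  # OrderedDict
```

With the original type(result) == return_type check, make_config() would raise an AssertionError here, because type(OrderedDict()) is not exactly dict.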
We took the training wheels off on this exercise, and you still did a great job. You know how to write your own decorators now, but even more importantly, you know why they work the way they do.