from time import time
NB: Python Timing
Programming for Data Science
Before we move on to NumPy, let’s pause and cover the topic of timing.
Timing in this context refers to measuring how long it takes your code to execute.
This is sometimes called the runtime of a program or code block.
Python provides some tools that let you measure how long a block of code takes to execute, so you can compare the speed of different approaches to the same problem.
For example, we might compare the speed of using a comprehension versus a for loop.
The time module
One way to measure how long it takes a block of code to run is to use the time module.
This module provides a number of functions to get and compute time.
The simplest function is time(), which returns the number of seconds elapsed since the Epoch.
The Epoch is 00:00:00 UTC on 1 January 1970, excluding leap seconds.
It corresponds roughly to when Unix was invented.
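To make the Epoch concrete, here is a minimal sketch that uses only the standard time module. The variable names (epoch, now, years_since_epoch) are illustrative, and the year conversion ignores leap years, so treat it as a rough sanity check rather than a precise calculation.

```python
import time

# The Epoch as a UTC struct_time: this is "time zero" for time.time().
epoch = time.gmtime(0)
print(time.strftime('%Y-%m-%d %H:%M:%S', epoch))

# Seconds elapsed since the Epoch, returned as a float.
now = time.time()
print(now)

# Rough conversion to years, ignoring leap years (illustration only).
years_since_epoch = now / (365 * 24 * 60 * 60)
print(round(years_since_epoch, 1))
```

The raw value of time() is rarely interesting on its own; what matters for timing is the difference between two calls.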
To get the runtime of a block, we record the time just before the code runs \(t_0\) and subtract it from the time just after the code finishes \(t_1\), giving \(t_1 - t_0\). Let’s try this by comparing a simple loop and comprehension.
from time import time

t0 = time() # START
for i in range(10):
    print(i, end=' ')
t1 = time() # END
0 1 2 3 4 5 6 7 8 9
t3 = time()
_ = [print(i, end=' ') for i in range(10)]
t4 = time()
0 1 2 3 4 5 6 7 8 9
delta_loop = t1 - t0
delta_comp = t4 - t3
print('loop:', delta_loop)
print('comp:', delta_comp)
print('loop/comp:', round(delta_loop/delta_comp, 2))
loop: 8.58306884765625e-05
comp: 9.298324584960938e-05
loop/comp: 0.92
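Interestingly, in this run the for loop is slightly faster than the comprehension. Keep in mind that both measurements are under a tenth of a millisecond and come from a single run, so the difference may be nothing more than noise.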
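The before-and-after pattern can be wrapped in a small helper so it does not have to be retyped for every comparison. The sketch below is one possible way to do this, not part of the original notebook: time_call, squares_loop, and squares_comp are hypothetical names, and it swaps time() for perf_counter(), a higher-resolution clock from the same module that is intended for measuring short intervals. It also times larger workloads without printing, which makes the measurements less sensitive to noise.

```python
from time import perf_counter

def time_call(func, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to func."""
    t0 = perf_counter()             # START
    result = func(*args, **kwargs)
    t1 = perf_counter()             # END
    return result, t1 - t0

def squares_loop(n):
    # Build the list with an explicit for loop.
    vals = []
    for i in range(n):
        vals.append(i * i)
    return vals

def squares_comp(n):
    # Build the same list with a comprehension.
    return [i * i for i in range(n)]

loop_vals, delta_loop = time_call(squares_loop, 100_000)
comp_vals, delta_comp = time_call(squares_comp, 100_000)

print('loop:', delta_loop)
print('comp:', delta_comp)
```

This is still a single measurement per function, so results will vary from run to run.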
Using timeit
To get a better measure of runtime, we can use the timeit module.
This module measures timing across many runs.
Since timeit() returns the total runtime across all runs, we divide by the number of runs to get the mean runtime.
timeit() works by evaluating code blocks written as strings.
Let’s compare two code snippets using timeit:
from timeit import timeit

num_runs = 100

loop_code = '''
vals = []
for i in range(1, 100001):
    if i % 2 == 1:
        i *= -1
    vals.append(i)
'''

comp_code = '''
vals = [i * -1 if i % 2 == 1 else i for i in range(1, 100001)]
'''

loop_mean_time = timeit(stmt = loop_code, number = num_runs) / num_runs
comp_mean_time = timeit(stmt = comp_code, number = num_runs) / num_runs
t_diff = loop_mean_time / comp_mean_time

print('loop =', loop_mean_time)
print('comp =', comp_mean_time)
print('loop/list =', t_diff)
print('list/loop =', 1/t_diff)
loop = 0.005688848439604044
comp = 0.0046883809473365545
loop/list = 1.2133929609188114
list/loop = 0.8241353231872839
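Code strings are not the only option: timeit() also accepts a callable, and the module's repeat() function runs the whole measurement several times and returns a list of totals. The sketch below shows both under a few assumptions: loop_version and comp_version are hypothetical wrappers around the two snippets above, the range is shortened to 1,000 so it runs quickly, and reporting the minimum of the repeats is a common convention rather than the only reasonable summary.

```python
from timeit import repeat, timeit

def loop_version():
    # Same logic as loop_code above, on a smaller range.
    vals = []
    for i in range(1, 1001):
        if i % 2 == 1:
            i *= -1
        vals.append(i)
    return vals

def comp_version():
    # Same logic as comp_code above, on a smaller range.
    return [i * -1 if i % 2 == 1 else i for i in range(1, 1001)]

num_runs = 100

# timeit() with a callable: total time for num_runs calls, then the mean.
loop_mean = timeit(loop_version, number=num_runs) / num_runs

# repeat() returns one total per repetition (here 5 totals of num_runs calls each).
comp_totals = repeat(comp_version, repeat=5, number=num_runs)
comp_best = min(comp_totals) / num_runs

print('loop mean =', loop_mean)
print('comp best =', comp_best)
```

Passing a callable adds a small function-call overhead to every run, which matters mainly when the code being timed is very short.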
Using Magic
Instead of calling time and timeit directly, we can use the so-called magic commands.
Magic commands are % or %% prefixed commands that work in Jupyter notebooks and other IPython environments.
% commands apply to single lines; they go at the beginning of the line.
%% commands apply to cell blocks; they go at the top of the cell.
Placing %%timeit or %%time at the top of a cell will time the whole cell block.
Placing %timeit or %time as the first item on a line of code will time that single line.
Note that magic commands can take arguments; for example, %timeit -n 100 -r 5 sets the number of loops per run and the number of runs.
For more on this topic, see Chapter 3 of Wes McKinney’s Python for Data Analysis and the official documentation.
Let’s look at an example, similar to those above, comparing a loop to a comprehension.
time
imax = 10000

%%time
vals = []
for i in range(1, imax+1):
    if i % 2 == 1:
        i *= -1
    vals.append(i)
CPU times: user 1.17 ms, sys: 0 ns, total: 1.17 ms
Wall time: 1.18 ms
%time vals = [i*-1 if i % 2 == 1 else i for i in range(1,imax+1)]
CPU times: user 528 µs, sys: 0 ns, total: 528 µs
Wall time: 538 µs
timeit
%%timeit
vals = []
imax = 10000
for i in range(1, imax+1):
    if i % 2 == 1:
        i *= -1
    vals.append(i)
507 µs ± 1.28 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit vals = [i*-1 if i % 2 == 1 else i for i in range(1,imax+1)]
469 µs ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Types of Time
Note that the output of %time and %%time gives a detailed breakdown of the results: CPU time (user, sys, and their total) and wall time.
Wall time measures how much time has passed, as if you were looking at the clock on your wall.
CPU time refers to how many seconds the CPU was actually busy.
In CPU time, user time is the amount of time a processor spends running application code.
System time is the amount of time it spends running operating system functions related to the application.
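The difference between wall time and CPU time is easiest to see when a program spends part of its time waiting. The sketch below is an illustration using two standard-library clocks that are not used elsewhere in this notebook: perf_counter() for wall time and process_time(), which reports the combined user and system CPU time of the current process. The 0.2 second sleep and the variable names are arbitrary choices.

```python
import time

wall_start = time.perf_counter()    # wall-clock timer
cpu_start = time.process_time()     # user + system CPU time of this process

time.sleep(0.2)                     # waiting: wall time passes, CPU stays idle
total = sum(i * i for i in range(200_000))  # computing: CPU is busy

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print('wall:', wall_elapsed)
print('cpu: ', cpu_elapsed)
```

The sleep counts toward wall time but not CPU time, so the wall figure should come out noticeably larger than the CPU figure.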