Python NumPy Exercises

Programming for Data Science Bootcamp

Time

Exercise 1

Write a for loop to build a list containing the integers from $1$ to $100,000$, with every odd number negated.

Time the code using the time() function from the time module.

Print out the resulting time delta.

In [3]:
import time 
In [2]:
t0 = time.time()
vals = []
for i in range(1, 100001):
    if i % 2 == 1:
        i *= -1
    vals.append(i)
print('runtime: ', time.time() - t0)
runtime:  0.013157844543457031

Exercise 2

Write a list comprehension to build the same list and time it using time()

Remember: In a list comprehension, a filtering if clause comes after the for clause, but an if ... else conditional expression comes before it, because it is part of the output expression rather than a filter.
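To make the distinction concrete, here is a small sketch on a short range. The variable names are illustrative and not part of the exercise.

```python
nums = range(1, 7)

# Filter: a trailing `if` keeps only the items that pass the test.
odds_only = [i for i in nums if i % 2 == 1]

# Conditional expression: `a if cond else b` comes before `for`
# and produces one output per input item.
odds_negated = [-i if i % 2 == 1 else i for i in nums]

print(odds_only)     # [1, 3, 5]
print(odds_negated)  # [-1, 2, -3, 4, -5, 6]
```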

In [4]:
t0 = time.time()
vals = [i * -1 if i % 2 == 1 else i for i in range(1,100001)]
print('runtime: ', time.time() - t0)
runtime:  0.0068225860595703125
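A single time.time() delta can be noisy, since it includes whatever else the machine happens to be doing. One possible refinement, sketched here with the standard library's timeit module, is to run each version several times and keep the best result. The helper names build_with_loop and build_with_comprehension are made up for this sketch, and the timings will differ from machine to machine.

```python
import timeit

def build_with_loop(n=100_000):
    vals = []
    for i in range(1, n + 1):
        vals.append(-i if i % 2 == 1 else i)
    return vals

def build_with_comprehension(n=100_000):
    return [-i if i % 2 == 1 else i for i in range(1, n + 1)]

# Both approaches should build the same list.
same_result = build_with_loop() == build_with_comprehension()
print(same_result)  # True

# Call each function 5 times per measurement, take 3 measurements, keep the best.
loop_best = min(timeit.repeat(build_with_loop, number=5, repeat=3))
comp_best = min(timeit.repeat(build_with_comprehension, number=5, repeat=3))
print(f"loop:          {loop_best / 5:.6f} s per call")
print(f"comprehension: {comp_best / 5:.6f} s per call")
```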

NumPy

In [9]:
import numpy as np

Exercise 3

Use NumPy to generate $10$ random integers ranging from $1$ to $6$ inclusive.

Hint: Use one of NumPy's random functions in the np.random package.

In [31]:
# np.random.randint?

randint(low, high=None, size=None, dtype=int)

In [38]:
randos = np.random.randint(1, 7, 10)
In [33]:
randos
Out[33]:
array([4, 4, 1, 4, 6, 1, 3, 5, 5, 5])
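As an alternative to np.random.randint, NumPy's documentation recommends the Generator interface for new code. The sketch below shows the same draw with np.random.default_rng. The seed value is arbitrary and only there to make runs repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for this sketch

# As with randint, the upper bound is exclusive by default.
rolls = rng.integers(1, 7, size=10)

# endpoint=True makes the upper bound inclusive instead.
rolls_inclusive = rng.integers(1, 6, size=10, endpoint=True)

print(rolls)
print(rolls_inclusive)
```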

Exercise 4

Use np.random.randint() to generate a single random integer and print the type of the result.

Then generate $5$ random integers between $1$ and $21$ and print the type of the result.

What difference do you see?

In [41]:
r1 = np.random.randint(10)
In [42]:
type(r1)
Out[42]:
int
In [43]:
r2 = np.random.randint(1, 21, 5)
In [44]:
type(r2)
Out[44]:
numpy.ndarray

Exercise 5

Plot a histogram of the array of random integers you just created.

Hint: Import Matplotlib's hist function as follows:

from matplotlib.pyplot import hist

Then pass the array to hist().

In [45]:
from matplotlib.pyplot import hist
In [46]:
hist(r2)
Out[46]:
(array([2., 0., 0., 1., 0., 0., 1., 0., 0., 1.]),
 array([ 7. ,  8.1,  9.2, 10.3, 11.4, 12.5, 13.6, 14.7, 15.8, 16.9, 18. ]),
 <BarContainer object of 10 artists>)
[Figure: histogram of r2]

Use a trailing semicolon to suppress the text output.

In [47]:
hist(r2);
[Figure: histogram of r2]
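The text that hist() printed before the semicolon was added is a tuple of bin counts, bin edges, and the drawn bars. np.histogram computes the same kind of counts and edges without drawing anything, which makes the structure easier to inspect. The sample values below are hypothetical and are not the actual contents of r2.

```python
import numpy as np

sample = np.array([7, 7, 11, 14, 18])  # hypothetical values, not the real r2

# By default the data range is split into 10 equal-width bins,
# so there are 10 counts and 11 bin edges.
counts, edges = np.histogram(sample)

print(counts)      # [2 0 0 1 0 0 1 0 0 1]
print(len(edges))  # 11
print(edges)
```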

Exercise 6

Generate and print a matrix (i.e. a $2$D array) of random normals of shape $2 \times 3$.

Hint: np.random.randn(m, n) samples from the standard normal distribution and generates an m by n matrix.

In [51]:
x = np.random.randn(2, 3)
In [52]:
x
Out[52]:
array([[-1.55688234, -0.23381469, -1.27364592],
       [-1.99888757, -1.83017366,  0.87450442]])

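Exercise 7

Multiply the matrix you just created by $2$.

In linear algebra, this is called "scaling" the matrix.

Note: the outputs shown for Exercises 7 to 9 come from earlier runs with different random draws of x (see the cell numbers), so they do not match the matrix printed in Exercise 6.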

In [50]:
x * 2
Out[50]:
array([[-0.00480114, -0.54747565, -1.151186  ],
       [ 0.45924853,  1.47640148, -0.02237077]])

Exercise 8

Now add the matrix to itself.

In [20]:
x + x
Out[20]:
array([[ 2.54430807, -0.18371682, -0.74557633],
       [ 0.30907492,  1.92516141, -1.70942241]])
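Scaling by $2$ and adding a matrix to itself are both element-wise operations and give the same result. A minimal check, using a freshly generated stand-in for x with an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(0)    # arbitrary seed, for repeatability
a = rng.standard_normal((2, 3))   # stand-in for the matrix x above

doubled = a * 2
summed = a + a

print(doubled.shape)                    # (2, 3)
print(np.array_equal(doubled, summed))  # True
```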

Exercise 9

Now get a matrix of reciprocals.

Note: this is the element-wise reciprocal, which is not the same as the inverse of a matrix.

In [21]:
1 / x
Out[21]:
array([[  0.78606833, -10.88631924,  -2.68248858],
       [  6.47092302,   1.03887393,  -1.16998583]])
In [22]:
# np.linalg.inv(x) # The matrix inverse is defined only for square matrices, so this call would raise LinAlgError for the 2 x 3 array x
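The difference between the element-wise reciprocal and the matrix inverse is easiest to see on a small square matrix. The matrix a below is an illustrative example, not part of the exercise.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])

elementwise = 1 / a          # reciprocal of each entry
inverse = np.linalg.inv(a)   # matrix inverse (a must be square and non-singular)

print(elementwise)
print(inverse)

# Only the true inverse multiplies with a to give the identity matrix.
print(np.allclose(a @ inverse, np.eye(2)))      # True
print(np.allclose(a @ elementwise, np.eye(2)))  # False
```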

Exercise 10

Create two two-dimensional arrays.

Make one of $0$s and one of $1$s.

Give both a shape of $2 \times 4$.

In [23]:
my_shape = (2, 4)
In [24]:
z1 = np.zeros(my_shape)
In [25]:
z1
Out[25]:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])
In [26]:
o1 = np.ones(my_shape)
In [27]:
o1
Out[27]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])

Exercise 11

Create an identity matrix with $4$ rows and columns.

In [28]:
np.identity(4)
Out[28]:
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

Note that identity matrices are square.
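What makes the identity matrix useful is that multiplying by it leaves a matrix unchanged. A short sketch, with an illustrative $4 \times 4$ matrix a:

```python
import numpy as np

a = np.arange(16).reshape(4, 4)
identity = np.identity(4)

# Multiplying by the identity on either side returns the original values.
print(np.array_equal(a @ identity, a))  # True
print(np.array_equal(identity @ a, a))  # True

# np.eye(4) builds the same matrix. Unlike np.identity, np.eye also accepts
# a column count and can return rectangular arrays with ones on the diagonal.
print(np.array_equal(np.eye(4), identity))  # True
print(np.eye(2, 3).shape)                   # (2, 3)
```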

Exercise 12

Generate a vector of random numbers of length $5$.

Then print a slice that consists of $3$ elements, beginning with the second element.

Then print a slice that excludes the first and last elements.

In [29]:
x12 = np.random.randn(5)
In [30]:
x12
Out[30]:
array([-0.20266496,  0.19059664, -0.41064497, -0.3492225 ,  0.14831958])
In [31]:
x12[1:4]
Out[31]:
array([ 0.19059664, -0.41064497, -0.3492225 ])
In [32]:
x12[1:-1]
Out[32]:
array([ 0.19059664, -0.41064497, -0.3492225 ])

Exercise 13

From the last array you created, select all elements $> 0.15$.

What is it called when you filter an array in this manner?

That is, using truth values in the indexer.

In [33]:
x12[x12 > 0.15]
Out[33]:
array([0.19059664])

This is called 'boolean indexing'.

Consider the boolean array:

In [34]:
bool_idx = x12 > .15
In [35]:
bool_idx
Out[35]:
array([False,  True, False, False, False])

And this is our original array:

In [36]:
x12
Out[36]:
array([-0.20266496,  0.19059664, -0.41064497, -0.3492225 ,  0.14831958])

So, boolean indexing is something like:

In [37]:
[a for a, b in zip(x12, bool_idx) if b]
Out[37]:
[0.19059663613877315]

Or, as long as no selected element is exactly zero:

In [38]:
[x for x in x12 * bool_idx if x]
Out[38]:
[0.19059663613877315]
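The two list comprehensions above are only rough analogies, and the multiplication version has a pitfall: it relies on the truthiness of the product, so it silently drops any selected element that equals zero. The sketch below uses a made-up array that contains a zero to show the difference.

```python
import numpy as np

arr = np.array([-1.0, 0.0, 2.0])
mask = arr >= 0   # False, True, True

by_mask = arr[mask]                                   # boolean indexing
by_zip = [a for a, keep in zip(arr, mask) if keep]    # pairs each value with its flag
by_product = [v for v in arr * mask if v]             # drops every zero product

print(by_mask)          # [0. 2.]
print(len(by_zip))      # 2
print(len(by_product))  # 1, the selected 0.0 was lost
```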

Exercise 14

Generate a $2$D array of random numbers with a shape of $3 \times 3$.

Then, select all the rows but the first, and all the columns but the last.

In [39]:
x14 = np.random.randn(3,3)
In [40]:
x14
Out[40]:
array([[-0.71092828, -0.45263172, -1.60650309],
       [-0.15522386, -0.75890669,  1.55023292],
       [-1.18031566,  0.70695363,  0.08966593]])
In [41]:
x14[1:, :-1]
Out[41]:
array([[-0.15522386, -0.75890669],
       [-1.18031566,  0.70695363]])
In [42]:
x14[1:, :2]
Out[42]:
array([[-0.15522386, -0.75890669],
       [-1.18031566,  0.70695363]])

Exercise 15

Write code to generate a new array from the previous one, with all negative values set to $0$.

The second array should be based on a copy of the first.

Then print both arrays.

In [43]:
x15 = x14.copy()
x15[x15 < 0] = 0
In [44]:
x14
Out[44]:
array([[-0.71092828, -0.45263172, -1.60650309],
       [-0.15522386, -0.75890669,  1.55023292],
       [-1.18031566,  0.70695363,  0.08966593]])
In [45]:
x15
Out[45]:
array([[0.        , 0.        , 0.        ],
       [0.        , 0.        , 1.55023292],
       [0.        , 0.70695363, 0.08966593]])
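The call to copy() matters because plain assignment only creates a second name for the same array. The sketch below, with made-up values, shows what happens without a copy, and one alternative (np.maximum) that returns a new array directly.

```python
import numpy as np

first = np.array([[-1.0, 2.0],
                  [3.0, -4.0]])

alias = first          # no copy: both names refer to the same data
alias[alias < 0] = 0   # this also changes `first`
print(first)           # the negative values are gone

second = np.array([[-1.0, 2.0],
                   [3.0, -4.0]])
clipped = np.maximum(second, 0)   # element-wise max with 0, returns a new array

print(np.shares_memory(second, clipped))  # False
print(second)                             # unchanged
print(clipped)
```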

Exercise 16

Write a function called roll_dice() that uses np.random.randint and returns a $1$D list of integers of length n for a die of m sides, optionally sorted.

  • Each integer is from $1$ to $m$ inclusive.
  • Make the default value of m $6$.
  • Give the user the option to return the results in reverse sort order. Set the default value to False.
  • Return the results as a list.

Then

  • Run it so that it rolls a $6$-sided die $10$ times. Print results with reverse sorting.
  • Run it so that it rolls a $12$-sided die $10$ times. Print results with no sorting.
In [53]:
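def roll_dice(n, m=6, sort=False, reverse=False):
    '''
    Roll an m-sided die n times and return the results as a list.
    Each integer is from 1 to m inclusive.
    The list is sorted only if sort=True; reverse=True sorts in descending order.
    '''
    # randint's upper bound is exclusive, so use m + 1 to include m
    x = np.random.randint(1, m + 1, n).tolist()

    if sort:
        return sorted(x, reverse=reverse)
    return x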
In [54]:
game1 = roll_dice(10, sort=True, reverse=True)
In [55]:
game1
Out[55]:
[6, 5, 5, 5, 4, 4, 3, 2, 1, 1]
In [56]:
game2 = roll_dice(10, 12)
In [57]:
game2
Out[57]:
[1, 3, 2, 9, 4, 4, 7, 1, 6, 2]

Exercise 17

Make a plot showing an example play where $m = 50$, $n = 8$, and sorting is turned off.

In [58]:
from matplotlib.pyplot import plot
In [59]:
plot(roll_dice(8, 50));
[Figure: line plot of the eight rolls]

Exercise 18

Write a NumPy program to compute the eigenvalues and eigenvectors of a given square array.

In [60]:
m = np.array([[3, -2], [1, 0]])  # np.mat was removed in NumPy 2.0
print(m)
[[ 3 -2]
 [ 1  0]]
In [61]:
w, v = np.linalg.eig(m)
print("Eigenvalues:", w)
print("Eigenvectors:", v)
Eigenvalues: [2. 1.]
Eigenvectors: [[0.89442719 0.70710678]
 [0.4472136  0.70710678]]
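One way to check the result is to confirm the defining property $A v = \lambda v$ for each pair. In the output of np.linalg.eig, the eigenvectors are the columns of v, so column v[:, i] belongs to eigenvalue w[i]. A minimal sketch, using the same matrix as above under the name a:

```python
import numpy as np

a = np.array([[3.0, -2.0],
              [1.0, 0.0]])
w, v = np.linalg.eig(a)

# Check each eigenpair: a @ v_i should equal w_i * v_i.
pair_checks = [np.allclose(a @ v[:, i], w[i] * v[:, i]) for i in range(len(w))]
print(pair_checks)  # [True, True]

# The same check for all pairs at once: A V = V diag(w).
# Broadcasting v * w scales column j of v by w[j].
print(np.allclose(a @ v, v * w))  # True
```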

Exercise 19

Create two $2$D matrices $p$ and $q$ by hand, each of shape $2 \times 2$.

Then multiply them, i.e. get their dot product.

In [64]:
p = [[1, 0], [0, 1]]
q = [[1, 2], [3, 4]]
In [65]:
print('p:', p)
print('q:', q)
p: [[1, 0], [0, 1]]
q: [[1, 2], [3, 4]]
In [66]:
result1 = np.dot(p, q)
print(result1)
[[1 2]
 [3 4]]
In [67]:
result2 = np.dot(q, p)
print(result2)
[[1 2]
 [3 4]]
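Both products above are equal only because p is the identity matrix. In general, matrix multiplication is not commutative. The sketch below uses two illustrative matrices and the @ operator, which performs the same matrix product as np.dot for $2$D arrays.

```python
import numpy as np

a = np.array([[0, 1],
              [0, 0]])
b = np.array([[1, 2],
              [3, 4]])

ab = a @ b
ba = b @ a

print(ab)   # [[3 4]
            #  [0 0]]
print(ba)   # [[0 1]
            #  [0 3]]
print(np.array_equal(ab, ba))  # False
```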

Exercise 20

Use NumPy to calculate the difference between the maximum and the minimum values of a given array along the second axis.

Expected Output:

Original array:

[
    [0, 1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10, 11]
]

Difference between the maximum and the minimum values of the said array:

[5, 5]
In [68]:
x = np.arange(12).reshape((2, 6))
print(x.shape)
print(x)
(2, 6)
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]]

Here are the max and min along the first axis (axis=0), which gives one value per column:

In [81]:
np.amax(x, axis=0), np.amin(x, axis=0)
Out[81]:
(array([ 6,  7,  8,  9, 10, 11]), array([0, 1, 2, 3, 4, 5]))

Here are the max and min along the second axis (axis=1), which gives one value per row:

In [82]:
np.amax(x, axis=1), np.amin(x, axis=1)
Out[82]:
(array([ 5, 11]), array([0, 6]))

Here is the difference:

In [83]:
r1 = np.amax(x, axis=1) - np.amin(x, axis=1)
print(r1)
[5 5]

You may also use np.ptp ("peak to peak"), which returns the difference between the maximum and minimum values, either along a specified axis or across the entire array.

In [84]:
r2 = np.ptp(x, axis=1)
print(r2)
[5 5]

And here we show that the two methods produce the same result.

In [85]:
np.allclose(r1, r2) 
Out[85]:
True

Exercise 21

Use NumPy to sort a given array by the 2nd column.

Original array:

[
    [1, 5, 0],
    [3, 2, 5],
    [8, 7, 6]
]

Sorted array:

[
    [3, 2, 5],
    [1, 5, 0],
    [8, 7, 6]
]
In [90]:
nums = np.random.randint(0,10,(3,3))
print(nums)
[[7 0 1]
 [6 9 0]
 [6 3 0]]
In [91]:
print(nums[nums[:,1].argsort()])
[[7 0 1]
 [6 3 0]
 [6 9 0]]
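The same indexing trick reproduces the expected output from the exercise statement. argsort returns the row order that would sort the second column, and indexing with that order rearranges whole rows. The variable names here are illustrative.

```python
import numpy as np

given = np.array([[1, 5, 0],
                  [3, 2, 5],
                  [8, 7, 6]])

order = given[:, 1].argsort()   # indices that sort the second column [5, 2, 7]
by_second_col = given[order]    # reorder the rows accordingly

print(order)          # [1 0 2]
print(by_second_col)  # [[3 2 5]
                      #  [1 5 0]
                      #  [8 7 6]]
```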

Exercise 22

Use NumPy to find the norm of a matrix or vector.

In [109]:
v = np.arange(7)
vnorm = np.linalg.norm(v)
print(v)
print("Vector norm:", vnorm)
[0 1 2 3 4 5 6]
Vector norm: 9.539392014169456
In [110]:
m = np.array([[1, 2], [3, 4]])  # plain arrays are recommended over np.matrix
mnorm = np.linalg.norm(m)
print(m)
print("Matrix norm:", mnorm)
[[1 2]
 [3 4]]
Matrix norm: 5.477225575051661
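With no extra arguments, np.linalg.norm returns the Euclidean norm of a vector and the Frobenius norm of a matrix. Both are the square root of the sum of squared entries, which the sketch below checks by hand. Other norms can be requested through the ord parameter.

```python
import numpy as np

v = np.arange(7)
m = np.array([[1, 2], [3, 4]])

# Default norm: square root of the sum of squares.
v_by_hand = np.sqrt(np.sum(v ** 2))   # sqrt(91)
m_by_hand = np.sqrt(np.sum(m ** 2))   # sqrt(30)

print(np.isclose(np.linalg.norm(v), v_by_hand))  # True
print(np.isclose(np.linalg.norm(m), m_by_hand))  # True

# Other vector norms are selected with `ord`.
print(np.linalg.norm(v, ord=1))        # 21.0, sum of absolute values
print(np.linalg.norm(v, ord=np.inf))   # 6.0, largest absolute value
```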

Exercise 23

Use NumPy to calculate the QR decomposition of a given matrix.

In [93]:
m = np.array([[1,2],[3,4]])
print(m)
[[1 2]
 [3 4]]
In [94]:
result = np.linalg.qr(m)
print(result)
(array([[-0.31622777, -0.9486833 ],
       [-0.9486833 ,  0.31622777]]), array([[-3.16227766, -4.42718872],
       [ 0.        , -0.63245553]]))
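The two arrays in the result are the factors Q and R. A quick way to sanity-check a QR decomposition is to confirm that the factors multiply back to the original matrix, that the columns of Q are orthonormal, and that R is upper triangular. The names q_factor and r_factor are chosen for this sketch.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
q_factor, r_factor = np.linalg.qr(m)

print(np.allclose(q_factor @ r_factor, m))            # True: the factors reproduce m
print(np.allclose(q_factor.T @ q_factor, np.eye(2)))  # True: columns of Q are orthonormal
print(np.allclose(r_factor, np.triu(r_factor)))       # True: R is upper triangular
```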