Python NumPy Exercises¶
Programming for Data Science Bootcamp
Time¶
Exercise 1¶
Write a for loop to build a list containing the integers from $1$ to $100,000$ where all odd numbers have a negative sign.
Time the code using the time() function from the time module.
Print out the resulting time delta.
import time
t0 = time.time()
vals = []
for i in range(1, 100001):
    # Flip the sign of odd numbers only
    if i % 2 == 1:
        i *= -1
    # Every number is appended, whether or not it was negated
    vals.append(i)
print('runtime: ', time.time() - t0)
runtime: 0.013157844543457031
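A single time.time() delta can vary noticeably from one run to the next. As a hedged alternative, the sketch below wraps the same loop in a function and uses the standard library's timeit.repeat to take the best of several measurements. The name build_with_loop and the repeat counts are assumptions chosen for illustration, not part of the exercise.

```python
import timeit

def build_with_loop(n=100_000):
    """Hypothetical helper: integers 1..n with the odd numbers negated."""
    vals = []
    for i in range(1, n + 1):
        if i % 2 == 1:
            i *= -1
        vals.append(i)
    return vals

# Call the function 5 times per measurement and take 3 measurements.
# The fastest measurement is usually the least disturbed by other processes.
timings = timeit.repeat(build_with_loop, number=5, repeat=3)
best = min(timings) / 5
print('best runtime per call:', best)
```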
Exercise 2¶
Write a list comprehension to build the same list and time it using time().
Remember: A filtering if statement comes after the for clause in a list comprehension, but an if else expression is different: it comes before the for clause.
t0 = time.time()
vals = [i * -1 if i % 2 == 1 else i for i in range(1,100001)]
print('runtime: ', time.time() - t0)
runtime: 0.0068225860595703125
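The two placements of if in a list comprehension can be compared side by side. This is a small sketch on the range 1 to 10; the variable names odds_only and signed are illustrative only.

```python
# Filtering: the `if` follows the `for` clause and drops items that fail the test.
odds_only = [i for i in range(1, 11) if i % 2 == 1]

# Conditional expression: `if ... else` precedes the `for` clause and
# transforms every item, so the result keeps the same length as the input.
signed = [-i if i % 2 == 1 else i for i in range(1, 11)]

print(odds_only)
print(signed)
```

The first form changes how many items are produced; the second changes what each item becomes.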
NumPy¶
import numpy as np
Exercise 3¶
Use NumPy to generate $10$ random integers ranging from $1$ to $6$ inclusive.
Hint: Use one of NumPy's random functions in the np.random module.
# np.random.randint?
randint(low, high=None, size=None, dtype=int)
randos = np.random.randint(1, 7, 10)
randos
array([4, 4, 1, 4, 6, 1, 3, 5, 5, 5])
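NumPy also offers a newer Generator interface through np.random.default_rng. The sketch below is one possible way to draw the same kind of sample with it; the seed value is arbitrary and is only there to make the example reproducible.

```python
import numpy as np

# Seeded only so the sketch is reproducible; the seed value is an assumption.
rng = np.random.default_rng(seed=42)

# The upper bound is exclusive by default, so 7 is needed to include 6 ...
rolls_a = rng.integers(1, 7, size=10)

# ... or pass endpoint=True to make the upper bound inclusive.
rolls_b = rng.integers(1, 6, size=10, endpoint=True)

print(rolls_a)
print(rolls_b)
```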
Exercise 4¶
Use np.random.randint() to generate a single random integer and print the type of the result.
Then generate $5$ random integers between $1$ and $21$ (the upper bound is exclusive) and print the type of the result.
What difference do you see?
r1 = np.random.randint(10)
type(r1)
int
r2 = np.random.randint(1, 21, 5)
type(r2)
numpy.ndarray
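The difference comes from the size argument: without it the call returns a scalar, and with it the call returns an array, even when the array holds one element. A minimal sketch that checks the number of dimensions instead of the exact type:

```python
import numpy as np

single = np.random.randint(10)            # no size: a scalar
many = np.random.randint(1, 21, 5)        # size=5: a 1D array
one_elem = np.random.randint(10, size=1)  # size=1: still an array

# np.ndim reports 0 for scalars and 1 for 1D arrays.
print(np.ndim(single), np.ndim(many), np.ndim(one_elem))
print(many.shape, one_elem.shape)
```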
Exercise 5¶
Plot a histogram of the array for random integers you just created.
Hint: Import Matplotlib's Histogram function as follows:
from matplotlib.pyplot import hist
Then pass the array to hist().
from matplotlib.pyplot import hist
hist(r2)
(array([2., 0., 0., 1., 0., 0., 1., 0., 0., 1.]), array([ 7. , 8.1, 9.2, 10.3, 11.4, 12.5, 13.6, 14.7, 15.8, 16.9, 18. ]), <BarContainer object of 10 artists>)
Use a trailing semicolon to suppress the text output.
hist(r2);
Exercise 6¶
Generate and print a matrix (i.e. a $2$D array) of random normals of shape $2 \times 3$.
Hint: np.random.randn(m, n) samples from the standard normal distribution and generates an m by n matrix.
x = np.random.randn(2, 3)
x
array([[-1.55688234, -0.23381469, -1.27364592], [-1.99888757, -1.83017366, 0.87450442]])
Exercise 7¶
Multiply the matrix you just created by $2$.
In linear algebra, this is called "scaling" the matrix.
x * 2
array([[-0.00480114, -0.54747565, -1.151186 ], [ 0.45924853, 1.47640148, -0.02237077]])
Exercise 8¶
Now add the matrix to itself.
x + x
array([[ 2.54430807, -0.18371682, -0.74557633], [ 0.30907492, 1.92516141, -1.70942241]])
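Scaling a matrix by $2$ and adding it to itself should give the same result, since both operate element by element. A quick sketch that checks this on a fresh random matrix:

```python
import numpy as np

x = np.random.randn(2, 3)

doubled = x * 2   # scalar multiplication, applied to every element
summed = x + x    # element-wise addition of two arrays of the same shape

print(np.allclose(doubled, summed))
```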
Exercise 9¶
Take the element-wise reciprocal of the matrix.
1 / x
array([[ 0.78606833, -10.88631924, -2.68248858], [ 6.47092302, 1.03887393, -1.16998583]])
# np.linalg.inv(x) computes the matrix inverse, which is not the element-wise reciprocal 1 / x.
# It also requires a square matrix, so it would raise an error on this 2 x 3 array.
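The element-wise reciprocal and the matrix inverse are different operations. The sketch below contrasts them on a small square matrix chosen for illustration (it is not the random matrix from the exercise).

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])   # illustrative square matrix

reciprocal = 1 / a           # element-wise: each entry becomes 1 / entry
inverse = np.linalg.inv(a)   # matrix inverse: a @ inverse is the identity

print(reciprocal)
print(inverse)
print(np.allclose(a @ inverse, np.identity(2)))
print(np.allclose(reciprocal, inverse))
```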
Exercise 10¶
Create two two-dimensional arrays.
Make one of $0$s and one of $1$s.
Give both a shape of $2 \times 4$.
my_shape = (2, 4)
z1 = np.zeros(my_shape)
z1
array([[0., 0., 0., 0.], [0., 0., 0., 0.]])
o1 = np.ones(my_shape)
o1
array([[1., 1., 1., 1.], [1., 1., 1., 1.]])
Exercise 11¶
Create an identity matrix with $4$ rows and columns.
np.identity(4)
array([[1., 0., 0., 0.], [0., 1., 0., 0.], [0., 0., 1., 0.], [0., 0., 0., 1.]])
Note that identity matrices are square.
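The identity matrix plays the role of $1$ in matrix multiplication. A brief sketch, using an arbitrary $4 \times 4$ matrix as an example:

```python
import numpy as np

a = np.arange(16).reshape(4, 4)   # illustrative 4 x 4 matrix
eye4 = np.identity(4)

# Multiplying by the identity on either side leaves the matrix unchanged.
print(np.array_equal(a @ eye4, a))
print(np.array_equal(eye4 @ a, a))

# np.eye(4) builds the same matrix.
print(np.array_equal(np.eye(4), eye4))
```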
Exercise 12¶
Generate a vector of random numbers of length $5$.
Then print a slice that consists of $3$ elements, beginning with the second element.
Then print a slice that excludes the first and last elements.
x12 = np.random.randn(5)
x12
array([-0.20266496, 0.19059664, -0.41064497, -0.3492225 , 0.14831958])
x12[1:4]
array([ 0.19059664, -0.41064497, -0.3492225 ])
x12[1:-1]
array([ 0.19059664, -0.41064497, -0.3492225 ])
Exercise 13¶
From the last array you created, select all elements $> 0.15$.
What is it called when you filter an array in this manner?
That is, using truth values in the indexer.
x12[x12 > 0.15]
array([0.19059664])
This is called 'boolean indexing'.
Consider the boolean array:
bool_idx = x12 > .15
bool_idx
array([False, True, False, False, False])
And this is our original array:
x12
array([-0.20266496, 0.19059664, -0.41064497, -0.3492225 , 0.14831958])
So, boolean indexing is something like:
[a for a, b in zip(x12, bool_idx) if b]
[0.19059663613877315]
Or:
[x for x in x12 * bool_idx if x]
[0.19059663613877315]
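Boolean masks can also be combined or converted to positions. The sketch below reuses the x12 values printed above; the thresholds in the combined condition are arbitrary examples.

```python
import numpy as np

# Values copied from the x12 output shown above
x12 = np.array([-0.20266496, 0.19059664, -0.41064497, -0.3492225, 0.14831958])

mask = x12 > 0.15
positions = np.flatnonzero(mask)   # indices where the mask is True
print(positions)

# Combine conditions with & (and) and | (or); the parentheses are required
# because & and | bind more tightly than the comparison operators.
in_band = x12[(x12 > -0.3) & (x12 < 0.16)]
print(in_band)
```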
Exercise 14¶
Generate a $2$D array of random numbers with a shape of $3 \times 3$.
Then, select all the rows but the first, and all the columns but the last.
x14 = np.random.randn(3,3)
x14
array([[-0.71092828, -0.45263172, -1.60650309], [-0.15522386, -0.75890669, 1.55023292], [-1.18031566, 0.70695363, 0.08966593]])
x14[1:, :-1]
array([[-0.15522386, -0.75890669], [-1.18031566, 0.70695363]])
x14[1:, :2]
array([[-0.15522386, -0.75890669], [-1.18031566, 0.70695363]])
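With random values it can be hard to see which cells a slice picked. The same slice on a small array of consecutive integers makes the selection easier to follow:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

# Rows from index 1 onward, columns up to but not including the last one
sub = a[1:, :-1]
print(sub)
```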
Exercise 15¶
Write code to generate a new array based on the previous array and which sets all negative values to $0$.
The second array should be based on a copy of the first.
Then print both arrays.
x15 = x14.copy()
x15[x15 < 0] = 0
x14
array([[-0.71092828, -0.45263172, -1.60650309], [-0.15522386, -0.75890669, 1.55023292], [-1.18031566, 0.70695363, 0.08966593]])
x15
array([[0. , 0. , 0. ], [0. , 0. , 1.55023292], [0. , 0.70695363, 0.08966593]])
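The call to copy() matters because plain assignment and slicing do not create independent data. A minimal sketch of the difference, using a short illustrative array:

```python
import numpy as np

original = np.array([-1.0, 2.0, -3.0])

alias = original           # same object, no new data
view = original[:]         # new array object that shares the same memory
copied = original.copy()   # independent data

copied[copied < 0] = 0     # does not touch original
print(original)

view[view < 0] = 0         # writes through to original
print(original)

print(np.shares_memory(original, view), np.shares_memory(original, copied))
```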
Exercise 16¶
Write a function called roll_dice() that uses np.random.randint and returns a sorted $1$D list of integers of length n for a die of m sides.
- Each integer is from $1$ to $m$ inclusive.
- Make the default value of m $6$.
- Give the user the option to return the results in reverse sort order. Set the default value to False.
- Return the results as a list.
Then
- Run it so that it rolls a $6$-sided die $10$ times. Print results with reverse sorting.
- Run it so that it rolls a $12$-sided die $10$ times. Print results with no sorting.
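def roll_dice(n, m=6, sort=False, reverse=False):
    '''
    Roll an m-sided die n times and return the results as a list.
    Each integer is from 1 to m inclusive.
    If sort is True the list is sorted, in descending order when reverse is True.
    '''
    # The upper bound of randint is exclusive, hence m + 1.
    # tolist() converts the NumPy integers to plain Python ints.
    x = np.random.randint(1, m + 1, n).tolist()
    if sort:
        return sorted(x, reverse=reverse)
    return x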
game1 = roll_dice(10, sort=True, reverse=True)
game1
[6, 5, 5, 5, 4, 4, 3, 2, 1, 1]
game2 = roll_dice(10, 12)
game2
[1, 3, 2, 9, 4, 4, 7, 1, 6, 2]
Exercise 17¶
Make a plot showing an example play where $m = 50$, $n = 8$, and sorting is turned off.
from matplotlib.pyplot import plot
plot(roll_dice(8, 50));
Exercise 18¶
Write a NumPy program to compute the eigenvalues and eigenvectors of a given square array.
m = np.array([[3, -2], [1, 0]])  # np.mat is not available in NumPy 2.0 and later
print(m)
[[ 3 -2] [ 1 0]]
w, v = np.linalg.eig(m)
print("Eigenvalues:", w)
print("Eigenvectors:", v)
Eigenvalues: [2. 1.] Eigenvectors: [[0.89442719 0.70710678] [0.4472136 0.70710678]]
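One way to sanity-check the result is to confirm the defining relation $A v = \lambda v$ for each pair. The sketch below rebuilds the same matrix as a plain array (the name a is illustrative) and checks each column of the eigenvector matrix.

```python
import numpy as np

a = np.array([[3, -2], [1, 0]])
w, v = np.linalg.eig(a)

# Column i of v is the eigenvector that belongs to eigenvalue w[i].
for i in range(len(w)):
    print(np.allclose(a @ v[:, i], w[i] * v[:, i]))

# The same check for all pairs at once: A V = V diag(w).
# Broadcasting v * w scales column j of v by w[j].
print(np.allclose(a @ v, v * w))
```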
Exercise 19¶
Create two $2$D matrices $p$ and $q$ by hand, each of shape $2 \times 2$.
Then multiply them, i.e. get their dot product.
p = [[1, 0], [0, 1]]
q = [[1, 2], [3, 4]]
print('p:', p)
print('q:', q)
p: [[1, 0], [0, 1]] q: [[1, 2], [3, 4]]
result1 = np.dot(p, q)
print(result1)
[[1 2] [3 4]]
result2 = np.dot(q, p)
print(result2)
[[1 2] [3 4]]
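Because p is the identity matrix, both orders give the same result here. In general the matrix product depends on the order, and it differs from the element-wise product. A sketch with two illustrative matrices, using the @ operator as an alternative to np.dot for 2D arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])   # a permutation matrix

ab = a @ b          # swaps the columns of a
ba = b @ a          # swaps the rows of a
elementwise = a * b # multiplies matching entries; not a matrix product

print(ab)
print(ba)
print(elementwise)
```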
Exercise 20¶
Use NumPy to calculate the difference between the maximum and the minimum values of a given array along the second axis.
Expected Output:
Original array:
[
[0, 1, 2, 3, 4, 5],
[6, 7, 8, 9, 10, 11]
]
Difference between the maximum and the minimum values of the said array:
[5, 5]
x = np.arange(12).reshape((2, 6))
print(x.shape)
print(x)
(2, 6) [[ 0 1 2 3 4 5] [ 6 7 8 9 10 11]]
Here are the max and min along the first axis (axis=0), which gives one value per column:
np.amax(x, axis=0), np.amin(x, axis=0)
(array([ 6, 7, 8, 9, 10, 11]), array([0, 1, 2, 3, 4, 5]))
Here are the max and min along the second axis (axis=1), which gives one value per row:
np.amax(x, axis=1), np.amin(x, axis=1)
(array([ 5, 11]), array([0, 6]))
Here is the difference:
r1 = np.amax(x, axis=1) - np.amin(x, axis=1)
print(r1)
[5 5]
You may also use np.ptp, which calculates the peak-to-peak range of the values in an array, i.e. the difference between the maximum and minimum values. It can be computed along a specified axis or across the entire array.
r2 = np.ptp(x, axis=1)
print(r2)
[5 5]
And here we show that the two methods produce the same result.
np.allclose(r1, r2)
True
Exercise 21¶
Use NumPy to sort a given array by the 2nd column.
Original array:
[
[1, 5, 0],
[3, 2, 5],
[8, 7, 6]
]
Sorted array:
[
[3, 2, 5],
[1, 5, 0],
[8, 7, 6]
]
nums = np.random.randint(0,10,(3,3))
print(nums)
[[7 0 1] [6 9 0] [6 3 0]]
print(nums[nums[:,1].argsort()])
[[7 0 1] [6 3 0] [6 9 0]]
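The indexing expression works in two steps: argsort returns the row order that would sort the chosen column, and that order is then used to rearrange the rows. A sketch on the array given in the exercise statement, with the steps separated (second_col and order are illustrative names):

```python
import numpy as np

nums = np.array([[1, 5, 0],
                 [3, 2, 5],
                 [8, 7, 6]])

second_col = nums[:, 1]        # the values 5, 2, 7
order = second_col.argsort()   # row indices that would sort the column
print(order)

sorted_rows = nums[order]      # rows rearranged in that order
print(sorted_rows)
```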
Exercise 22¶
Use NumPy to find the norm of a matrix or vector.
v = np.arange(7)
vnorm = np.linalg.norm(v)
print(v)
print("Vector norm:", vnorm)
[0 1 2 3 4 5 6] Vector norm: 9.539392014169456
m = np.array([[1, 2], [3, 4]])
mnorm = np.linalg.norm(m)
print(m)
print("Matrix norm:", mnorm)
[[1 2] [3 4]] Matrix norm: 5.477225575051661
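By default np.linalg.norm returns the Euclidean norm for a vector and the Frobenius norm for a matrix; both are the square root of the sum of squared entries. The sketch below checks this by hand and shows how the ord argument selects a different norm.

```python
import numpy as np

v = np.arange(7)
m = np.array([[1, 2], [3, 4]])

# Euclidean (2-) norm of a vector: square root of the sum of squares
v_manual = np.sqrt(np.sum(v ** 2))
print(v_manual, np.linalg.norm(v))

# Frobenius norm of a matrix: the same formula over all entries
m_manual = np.sqrt(np.sum(m ** 2))
print(m_manual, np.linalg.norm(m, 'fro'))

# Other norms are selected with ord, e.g. the 1-norm (sum of absolute values)
v_one = np.linalg.norm(v, 1)
print(v_one)
```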
Exercise 23¶
Use NumPy to calculate the QR decomposition of a given matrix.
m = np.array([[1,2],[3,4]])
print(m)
[[1 2] [3 4]]
result = np.linalg.qr(m)
print(result)
(array([[-0.31622777, -0.9486833 ], [-0.9486833 , 0.31622777]]), array([[-3.16227766, -4.42718872], [ 0. , -0.63245553]]))
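A QR decomposition can be verified from its defining properties: the product of the two factors reproduces the input, Q has orthonormal columns, and R is upper triangular. A short sketch of those checks on the same matrix:

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
q, r = np.linalg.qr(m)

reconstructs = np.allclose(q @ r, m)                 # Q R reproduces m
orthonormal = np.allclose(q.T @ q, np.identity(2))   # Q^T Q is the identity
upper = np.allclose(r, np.triu(r))                   # nothing below the diagonal

print(reconstructs, orthonormal, upper)
```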