NumPy is a well-known Python library for performing mathematical operations on arrays, and it is widely used in data analysis. By convention, it is imported as shown below.
import numpy as np
NumPy Array
# defining a list of different car companies or string elements
arr_str = ['Mercedes', 'BMW', 'Audi', 'Ferrari', 'Tesla']

# defining a list of number of cylinders in car or numerical elements
arr_num = [5, 4, 6, 7, 3]

# converting the list arr_str to a NumPy array
np_arr_str = np.array(arr_str)

# converting the list arr_num to a NumPy array
np_arr_num = np.array(arr_num)

# checking the output
print('Numpy Array (arr_str): ',np_arr_str)
print('Numpy Array (arr_num): ',np_arr_num)

Numpy Array (arr_str): ['Mercedes' 'BMW' 'Audi' 'Ferrari' 'Tesla']
Numpy Array (arr_num): [5 4 6 7 3]

The results look similar to lists, but np_arr_str and np_arr_num are NumPy arrays created from the lists arr_str and arr_num. Let’s check the data types to confirm this.

# printing the data type of lists
print('Data type of arr_str: ',type(arr_str))
print('Data type of arr_num: ',type(arr_num))

# printing the data type after conversion of lists to array
print('Data type of np_arr_str: ',type(np_arr_str))
print('Data type of np_arr_num: ',type(np_arr_num))

Data type of arr_str: <class 'list'>
Data type of arr_num: <class 'list'>
Data type of np_arr_str: <class 'numpy.ndarray'>
Data type of np_arr_num: <class 'numpy.ndarray'>
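type() only reports the container class. As a small sketch that goes slightly beyond the original walkthrough, the dtype and shape attributes show the element type and the dimensions of an array. The exact integer dtype printed depends on the platform, so no specific output is claimed here.

```python
import numpy as np

np_arr_str = np.array(['Mercedes', 'BMW', 'Audi', 'Ferrari', 'Tesla'])
np_arr_num = np.array([5, 4, 6, 7, 3])

# dtype describes the type of the elements stored in the array
print('dtype of np_arr_str:', np_arr_str.dtype)  # a fixed-width Unicode string type
print('dtype of np_arr_num:', np_arr_num.dtype)  # a platform-dependent integer type

# shape describes the size of the array along each dimension
print('shape of np_arr_num:', np_arr_num.shape)
```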
NumPy Matrix
# let's say we have information of different number of cylinders in a car and we want to display them in a matrix format
matrix = np.array([[1,2,1],[4,5,9],[1,8,9]])
print(matrix)

[[1 2 1]
 [4 5 9]
 [1 8 9]]
print('Data type of matrix: ',type(matrix))
Data type of matrix: <class 'numpy.ndarray'>
There are different ways to create NumPy arrays using the functions available in the NumPy library.
Using np.arange() function
arr2 = np.arange(start = 0, stop = 10) # 10 will be excluded from the output
print(arr2)

# or
arr2 = np.arange(0,10)
print(arr2)

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]

# adding a step size of 5 to create an array
arr3 = np.arange(start = 0, stop = 20, step = 5)
arr3
Output
array([ 0, 5, 10, 15])
Using np.linspace() function
# by default 50 evenly spaced values will be generated between 0 and 5
matrix2 = np.linspace(0,5)
matrix2
Output
array([0. , 0.10204082, 0.20408163, 0.30612245, 0.40816327, 0.51020408, 0.6122449 , 0.71428571, 0.81632653, 0.91836735, 1.02040816, 1.12244898, 1.2244898 , 1.32653061, 1.42857143, 1.53061224, 1.63265306, 1.73469388, 1.83673469, 1.93877551, 2.04081633, 2.14285714, 2.24489796, 2.34693878, 2.44897959, 2.55102041, 2.65306122, 2.75510204, 2.85714286, 2.95918367, 3.06122449, 3.16326531, 3.26530612, 3.36734694, 3.46938776, 3.57142857, 3.67346939, 3.7755102 , 3.87755102, 3.97959184, 4.08163265, 4.18367347, 4.28571429, 4.3877551 , 4.48979592, 4.59183673, 4.69387755, 4.79591837, 4.89795918, 5. ])
How are these values getting generated?
The step size or the difference between each element will be decided by the following formula:
(stop – start) / (total elements – 1)
So, in this case: (5 – 0) / 49 = 0.10204082
The first value is the start value, 0. The second value is 0 + 0.10204082, the third value is 0 + 0.10204082 + 0.10204082, and so on, until the last value reaches the stop value, 5.
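The step formula can be checked directly. The sketch below assumes the same start, stop and default count of 50 used above, and asks np.linspace to return the step it used through the retstep argument.

```python
import numpy as np

start, stop, num = 0, 5, 50

# retstep=True makes linspace return the spacing alongside the values
values, step = np.linspace(start, stop, num, retstep=True)

# the formula from the text: (stop - start) / (total elements - 1)
expected_step = (stop - start) / (num - 1)

print('Step reported by linspace:', step)
print('Step from the formula:', expected_step)
print('First three values:', values[:3])
```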
# generating 10 evenly spaced values between 10 and 20
matrix3 = np.linspace(10,20,10)
matrix3
Output
array([10. , 11.11111111, 12.22222222, 13.33333333, 14.44444444, 15.55555556, 16.66666667, 17.77777778, 18.88888889, 20. ])
Similarly, we can create matrices using the functions available in the NumPy library.
Using np.zeros()
By default, the data type of the elements is float.
matrix4 = np.zeros([3,5])
matrix4
Output
array([[0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.]])
Using np.ones()
By default, the data type of the elements is float.
matrix5 = np.ones([3,5])
matrix5
Output
array([[1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.], [1., 1., 1., 1., 1.]])
Using np.eye()
By default, the data type of the elements is float.
matrix6 = np.eye(5)
matrix6
Output
array([[1., 0., 0., 0., 0.], [0., 1., 0., 0., 0.], [0., 0., 1., 0., 0.], [0., 0., 0., 1., 0.], [0., 0., 0., 0., 1.]])
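The three outputs above all contain floats. As a minimal sketch, assuming integer entries are wanted instead, each of these functions accepts a dtype argument. The variable names below are illustrative and not part of the original walkthrough.

```python
import numpy as np

# passing dtype=int requests integer entries instead of the default float
zeros_int = np.zeros((3, 5), dtype=int)
ones_int = np.ones((3, 5), dtype=int)
eye_int = np.eye(5, dtype=int)

print(zeros_int)
print(ones_int)
print(eye_int)
```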
We can also convert a one-dimensional array to a matrix. This can be done using the reshape() method of the array (np.reshape() is the equivalent function).
# defining an array with values 0 to 9
arr4 = np.arange(0,10)
arr4
Output
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# reshaping the array arr4 to a 2 x 5 matrix
arr4_reshaped = arr4.reshape((2,5))
arr4_reshaped
Output
array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
arr4
Output
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# reshaping the array arr4 to a 2 x 6 matrix
arr4.reshape((2,6))

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-19-d52ee4fd36fa> in <module>()
1 # reshaping the array arr4 to a 2 x 6 matrix
----> 2 arr4.reshape((2,6))
ValueError: cannot reshape array of size 10 into shape (2,6)

The reshape fails because a 2 x 6 matrix needs 12 elements, while arr4 has only 10.
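A reshape only succeeds when the new shape holds exactly as many elements as the array. One way to avoid computing a dimension by hand, sketched here as an addition to the original example, is to pass -1 for that dimension and let NumPy infer it.

```python
import numpy as np

arr = np.arange(0, 10)

# -1 asks NumPy to infer the missing dimension from the array size
two_rows = arr.reshape(2, -1)   # inferred as 2 x 5
five_rows = arr.reshape(-1, 2)  # inferred as 5 x 2
print(two_rows.shape, five_rows.shape)

# an incompatible shape still raises ValueError, which can be caught
try:
    arr.reshape(2, 6)
except ValueError as err:
    print('Reshape failed:', err)
```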
NumPy also provides functions for a wide range of mathematical operations, including:
Trigonometric functions
# the input angle is in radians
print('Sine Function:',np.sin(4))
print('Cosine Function:',np.cos(4))
print('Tan Function',np.tan(4))

Sine Function: -0.7568024953079282
Cosine Function: -0.6536436208636119
Tan Function 1.1578212823495775
Exponents and Logarithmic functions
np.exp(2)
Output
7.38905609893065
arr5 = np.array([2,4,6])
np.exp(arr5)
Output
array([ 7.3890561 , 54.59815003, 403.42879349])
# by default NumPy takes the base of log as e
np.log(2)
Output
0.6931471805599453
np.log(arr5)
Output
array([0.69314718, 1.38629436, 1.79175947])
# log with base 10
np.log10(8)
Output
0.9030899869919435
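Besides base e and base 10, NumPy has np.log2 for base 2. For any other base, a small sketch using the change-of-base rule log_b(x) = ln(x) / ln(b) is shown below. The helper name log_base is hypothetical and not a NumPy function.

```python
import numpy as np

def log_base(x, base):
    """Logarithm of x in an arbitrary base, via the change-of-base rule."""
    return np.log(x) / np.log(base)

print('log2 of 8:', np.log2(8))
print('log base 3 of 81:', log_base(81, 3))

# the helper also works element-wise on arrays
print(log_base(np.array([3, 9, 27]), 3))
```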
Arithmetic Operations on arrays
# arithmetic on lists
l1 = [1,2,3]
l2 = [4,5,6]
print(l1+l2)
# this does not add the lists element-wise: + concatenates them!
[1, 2, 3, 4, 5, 6]
# we can +-*/ arrays together, and the operations are applied element-wise
# defining two arrays
arr7 = np.arange(1,6)
print('arr7:', arr7)

arr8 = np.arange(3,8)
print('arr8:', arr8)

arr7: [1 2 3 4 5]
arr8: [3 4 5 6 7]

print('Addition: ',arr7+arr8)
print('Subtraction: ',arr8-arr7)
print('Multiplication:' , arr7*arr8)
print('Division:', arr7/arr8)
print('Inverse:', 1/arr7)
# in python, powers are achieved using **, NOT ^!!! ^ is the bitwise XOR operator and does something completely different!
print('Powers:', arr7**arr8)

Addition: [ 4 6 8 10 12]
Subtraction: [2 2 2 2 2]
Multiplication: [ 3 8 15 24 35]
Division: [0.33333333 0.5 0.6 0.66666667 0.71428571]
Inverse: [1. 0.5 0.33333333 0.25 0.2 ]
Powers: [ 1 16 243 4096 78125]
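To make the warning about ^ concrete, the short sketch below compares ** with ^ on a small integer array. In Python and NumPy, ^ is the bitwise XOR operator, so the two results differ.

```python
import numpy as np

arr = np.array([1, 2, 3])

squares = arr ** 2   # element-wise power
xor_result = arr ^ 2  # element-wise bitwise XOR with 2 (binary 10)

print('arr ** 2:', squares)    # [1 4 9]
print('arr ^ 2: ', xor_result)  # [3 0 1]
```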
Operations on Matrices
matrix7 = np.arange(1,10).reshape(3,3)
print(matrix7)

matrix8 = np.eye(3)
print(matrix8)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

print('Addition: \n', matrix7+matrix8)
print('Subtraction: \n ', matrix7-matrix8)
print('Multiplication: \n', matrix7*matrix8)
print('Division: \n', matrix7/matrix8)

Addition:
 [[ 2. 2. 3.]
 [ 4. 6. 6.]
 [ 7. 8. 10.]]
Subtraction:
 [[0. 2. 3.]
 [4. 4. 6.]
 [7. 8. 8.]]
Multiplication:
 [[1. 0. 0.]
 [0. 5. 0.]
 [0. 0. 9.]]
Division:
 [[ 1. inf inf]
 [inf 5. inf]
 [inf inf 9.]]
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:4: RuntimeWarning: divide by zero encountered in true_divide
after removing the cwd from sys.path.

All four operations are applied element-wise. The inf entries and the RuntimeWarning appear because the off-diagonal entries of matrix8 are zero.
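If the inf values and the warning are unwanted, one possible approach (a sketch, not the only option) is np.divide with its where and out arguments, which divides only where the denominator is non-zero. Filling the skipped positions with 0 is an assumption made for this illustration.

```python
import numpy as np

numerator = np.arange(1, 10).reshape(3, 3).astype(float)
denominator = np.eye(3)

# divide only where the denominator is non-zero;
# positions that are skipped keep the value already in `out` (0 here)
safe_division = np.divide(
    numerator,
    denominator,
    out=np.zeros_like(numerator),
    where=denominator != 0,
)
print(safe_division)
```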
Linear algebra matrix multiplication
matrix9 = np.arange(1,10).reshape(3,3)
print('First Matrix: \n',matrix9)

matrix10 = np.arange(11,20).reshape(3,3)
print('Second Matrix: \n',matrix10)
print('')

# taking linear algebra matrix multiplication (some may have heard this called the dot product)
print('Multiplication: \n', matrix9 @ matrix10)

First Matrix:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Second Matrix:
 [[11 12 13]
 [14 15 16]
 [17 18 19]]

Multiplication:
 [[ 90 96 102]
 [216 231 246]
 [342 366 390]]
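The difference between * and @ is easy to miss, so here is a brief side-by-side sketch using the same two matrices. For 2-D arrays, np.dot gives the same result as @.

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
b = np.arange(11, 20).reshape(3, 3)

elementwise = a * b      # multiplies matching entries
matrix_product = a @ b   # rows of a combined with columns of b
dot_product = np.dot(a, b)

print('Element-wise:\n', elementwise)
print('Matrix product:\n', matrix_product)
print('Same as np.dot:', np.array_equal(matrix_product, dot_product))
```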
Transpose of a matrix
print(matrix9)
[[1 2 3]
 [4 5 6]
 [7 8 9]]

# taking transpose of matrix
np.transpose(matrix9)
Output
array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
# another way of taking a transpose
matrix9.T
Output
array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
Function to find minimum and maximum values
print(matrix9)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
print('Minimum value: ',np.min(matrix9))
Minimum value: 1
print('Maximum value: ',np.max(matrix9))
Maximum value: 9
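np.min and np.max above return a single value for the whole matrix. As an additional sketch, the axis argument restricts the search to columns (axis=0) or rows (axis=1).

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)

col_min = np.min(m, axis=0)  # minimum of each column
row_max = np.max(m, axis=1)  # maximum of each row

print('Minimum of each column:', col_min)  # [1 2 3]
print('Maximum of each row:', row_max)     # [3 6 9]
```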
Function to generate random samples
Using np.random.rand function
# Generating random values in an array
rand_mat = np.random.rand(5)
print(rand_mat)

[0.51224348 0.04347891 0.63676484 0.21565362 0.20840456]

# Generating random values in a matrix
rand_mat = np.random.rand(5,5) # uniform random values in the interval [0, 1)
print(rand_mat)

[[0.86306788 0.44597519 0.98724083 0.4589287 0.38586845]
 [0.50649135 0.95391258 0.85048883 0.36025143 0.31591653]
 [0.64923561 0.47141257 0.74644411 0.93158595 0.13779938]
 [0.31258084 0.14178965 0.13621574 0.73602144 0.82120803]
 [0.39707672 0.00899248 0.07510156 0.37820584 0.98857333]]
Using np.random.randn function
# Generating random values in an array (samples from the standard normal distribution)
rand_mat2 = np.random.randn(5)
print(rand_mat2)

[-1.0071855 0.28979584 -0.17943124 -0.68585197 1.6968806 ]

# Generating random values in a matrix
rand_mat2 = np.random.randn(5,5)
print(rand_mat2)

[[ 0.5249153 -1.21274874 1.16687886 -0.98031339 1.55776756]
 [-0.0277859 -0.1909139 0.34541482 -0.9121202 1.29663511]
 [-0.29940613 0.48190686 1.2603141 1.29896705 1.5181862 ]
 [-0.03066447 1.85920421 0.15275318 1.39136145 2.19976953]
 [-0.79183103 0.01329189 1.7503743 -0.07835476 0.67801025]]

# Let's check the mean and standard deviation of rand_mat2
print('Mean:',np.mean(rand_mat2))
print('Standard Deviation:',np.std(rand_mat2))

Mean: 0.5188644860128971
Standard Deviation: 0.9539368236787206
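With only 25 values, the sample mean and standard deviation can sit some distance from the 0 and 1 of the standard normal distribution. The sketch below, which assumes a much larger sample and an arbitrary seed chosen only for illustration, shows the estimates moving closer to those values and shows that re-seeding reproduces the same draws.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only to make this example repeatable
large_sample = np.random.randn(100_000)

sample_mean = np.mean(large_sample)
sample_std = np.std(large_sample)
print('Mean:', sample_mean)
print('Standard Deviation:', sample_std)

# re-seeding with the same value reproduces exactly the same values
np.random.seed(0)
repeat_sample = np.random.randn(100_000)
print('Same draws after re-seeding:', np.array_equal(large_sample, repeat_sample))
```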
Using np.random.randint function
# Generating random values in an array: 10 integers from 1 (inclusive) to 5 (exclusive)
rand_mat3 = np.random.randint(1,5,10)
print(rand_mat3)

[1 4 4 2 1 3 1 2 3 3]

# Generating random values in a matrix: a 5 x 5 matrix of integers from 1 to 9
rand_mat3 = np.random.randint(1,10,[5,5])
print(rand_mat3)

[[2 8 7 2 5]
 [2 2 9 6 8]
 [3 6 6 4 6]
 [3 7 6 7 6]
 [2 7 3 6 8]]
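The upper bound passed to np.random.randint is exclusive, which is why no 5 appears in the first output above. A quick check on a larger sample is sketched below; the seed and the sample size are arbitrary choices for this illustration.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed for repeatability
samples = np.random.randint(1, 5, 1000)

# every value lies in the range 1..4; the upper bound 5 is never produced
print('Smallest value:', samples.min())
print('Largest value:', samples.max())
print('Distinct values:', np.unique(samples))
```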
# let's generate an array with 10 random values
rand_arr = np.random.randn(10)
print(rand_arr)

[-2.04194609 0.45031706 -0.02019619 1.40439998 0.27655997 0.78027138 -0.95580464 0.05992533 0.16359666 0.19479355]

# accessing the entry at index 6 of rand_arr (the 7th entry, since indexing starts at 0)
print(rand_arr[6])

-0.9558046380845322

# we can access multiple entries at once using slicing (the entry at index 9 is excluded)
print(rand_arr[4:9])

[ 0.27655997 0.78027138 -0.95580464 0.05992533 0.16359666]

# we can also access multiple non-consecutive entries using np.arange
print('Index of values to access: ',np.arange(3,10,3))
print(rand_arr[np.arange(3,10,3)])

Index of values to access: [3 6 9]
[ 1.40439998 -0.95580464 0.19479355]
Accessing arrays using logical operations
print(rand_arr)
[-2.04194609 0.45031706 -0.02019619 1.40439998 0.27655997 0.78027138 -0.95580464 0.05992533 0.16359666 0.19479355]
rand_arr>0
Output
array([False, True, False, True, True, True, False, True, True, True])
# accessing all the values of rand_arr which are greater than 0
print('Values greater than 0: ',rand_arr[rand_arr>0])

# accessing all the values of rand_arr which are less than 0
print('Values less than 0: ',rand_arr[rand_arr<0])

Values greater than 0: [0.45031706 1.40439998 0.27655997 0.78027138 0.05992533 0.16359666 0.19479355]
Values less than 0: [-2.04194609 -0.02019619 -0.95580464]
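Conditions can also be combined. A minimal sketch is shown below on a small hand-written array (not the random one above); note that NumPy uses & and | with parentheses around each condition, rather than the Python keywords and / or.

```python
import numpy as np

arr = np.array([-2.0, 0.5, -0.1, 1.4, 0.3])

# values strictly between 0 and 1
between = arr[(arr > 0) & (arr < 1)]

# values below 0 or above 1
outside = arr[(arr < 0) | (arr > 1)]

print('Between 0 and 1:', between)  # [0.5 0.3]
print('Below 0 or above 1:', outside)  # [-2.  -0.1  1.4]
```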
Accessing the entries of a Matrix
# let's generate a 5 x 5 matrix of random values
rand_mat = np.random.randn(5,5)
print(rand_mat)

[[ 0.3357453 0.40461613 0.15462686 0.65122411 -0.97233248]
 [ 0.28322083 1.42640496 -0.78164392 0.0784663 0.54934505]
 [ 1.46738006 -0.79178691 -1.11833119 -0.14649231 1.24301 ]
 [ 0.45492561 0.96630705 -0.59473249 -1.08439922 -0.44078568]
 [-0.2954792 0.82300024 1.64248843 0.10615033 0.80220448]]

# accessing the second row of rand_mat
rand_mat[1]
Output
array([ 0.28322083, 1.42640496, -0.78164392, 0.0784663 , 0.54934505])
# accessing the third element of the second row
print(rand_mat[1][2])

# or
print(rand_mat[1,2])

-0.7816439206526197
-0.7816439206526197

# accessing the first two rows with the second and third columns
print(rand_mat[0:2,1:3])

[[ 0.40461613 0.15462686]
 [ 1.42640496 -0.78164392]]
Accessing matrices using logical operations
print(rand_mat)
[[ 0.3357453 0.40461613 0.15462686 0.65122411 -0.97233248]
 [ 0.28322083 1.42640496 -0.78164392 0.0784663 0.54934505]
 [ 1.46738006 -0.79178691 -1.11833119 -0.14649231 1.24301 ]
 [ 0.45492561 0.96630705 -0.59473249 -1.08439922 -0.44078568]
 [-0.2954792 0.82300024 1.64248843 0.10615033 0.80220448]]

# accessing all the values of rand_mat which are greater than 0
print('Values greater than 0: \n ',rand_mat[rand_mat>0])

# accessing all the values of rand_mat which are less than 0
print('Values less than 0: \n',rand_mat[rand_mat<0])

Values greater than 0:
 [0.3357453 0.40461613 0.15462686 0.65122411 0.28322083 1.42640496 0.0784663 0.54934505 1.46738006 1.24301 0.45492561 0.96630705 0.82300024 1.64248843 0.10615033 0.80220448]
Values less than 0:
 [-0.97233248 -0.78164392 -0.79178691 -1.11833119 -0.14649231 -0.59473249 -1.08439922 -0.44078568 -0.2954792 ]
Modifying the entries of an Array
print(rand_arr)
[-2.04194609 0.45031706 -0.02019619 1.40439998 0.27655997 0.78027138 -0.95580464 0.05992533 0.16359666 0.19479355]
# let's change some values in an array!
# changing the values of index value 3 and index value 4 to 5
rand_arr[3:5] = 5
print(rand_arr)

[-2.04194609 0.45031706 -0.02019619 5. 5. 0.78027138 -0.95580464 0.05992533 0.16359666 0.19479355]

# changing the values of index value 0 and index value 1 to 2 and 3 respectively
rand_arr[0:2] = [2,3]
print(rand_arr)

[ 2. 3. -0.02019619 5. 5. 0.78027138 -0.95580464 0.05992533 0.16359666 0.19479355]

# modify entries using logical references
rand_arr[rand_arr>0] = 65
rand_arr
array([ 6.50000000e+01, 6.50000000e+01, -2.01961913e-02, 6.50000000e+01, 6.50000000e+01, 6.50000000e+01, -9.55804638e-01, 6.50000000e+01, 6.50000000e+01, 6.50000000e+01])
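Assigning through a logical reference changes the array in place. When the original values should be kept, one alternative (sketched here on a small hand-written array) is np.where, which builds a new array and leaves the input untouched.

```python
import numpy as np

arr = np.array([-2.0, 0.5, -0.1, 1.4])

# where the condition holds take 65, otherwise keep the original value
replaced = np.where(arr > 0, 65, arr)

print('New array:', replaced)
print('Original array is unchanged:', arr)
```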
Modifying the entries of a Matrix
print(rand_mat3)
[[2 8 7 2 5]
 [2 2 9 6 8]
 [3 6 6 4 6]
 [3 7 6 7 6]
 [2 7 3 6 8]]

# changing the values of the 4th and 5th element of the second and third rows of the matrix to 0
print('Matrix before modification: \n',rand_mat3)
rand_mat3[1:3,3:5] = 0
print('Matrix after modification: \n',rand_mat3)

Matrix before modification:
 [[2 8 7 2 5]
 [2 2 9 6 8]
 [3 6 6 4 6]
 [3 7 6 7 6]
 [2 7 3 6 8]]
Matrix after modification:
 [[2 8 7 2 5]
 [2 2 9 0 0]
 [3 6 6 0 0]
 [3 7 6 7 6]
 [2 7 3 6 8]]
# extracting the first 2 rows and first 3 columns from the matrix
sub_mat = rand_mat[0:2,0:3]
print(sub_mat)

[[ 0.3357453 0.40461613 0.15462686]
 [ 0.28322083 1.42640496 -0.78164392]]

# changing all the values of the extracted matrix to 3
sub_mat[:] = 3
print(sub_mat)

[[3. 3. 3.]
 [3. 3. 3.]]

# what happened to rand_mat when we changed sub_mat?
rand_mat
array([[ 3. , 3. , 3. , 0.65122411, -0.97233248], [ 3. , 3. , 3. , 0.0784663 , 0.54934505], [ 1.46738006, -0.79178691, -1.11833119, -0.14649231, 1.24301 ], [ 0.45492561, 0.96630705, -0.59473249, -1.08439922, -0.44078568], [-0.2954792 , 0.82300024, 1.64248843, 0.10615033, 0.80220448]])
# a slice is a view onto the same data, so changing sub_mat also changed rand_mat
# to prevent this behavior we need to use the .copy() method when we assign sub_mat
# this behavior is the source of MANY errors for early python users!!!
rand_mat = np.random.randn(5,5)
print(rand_mat)
sub_mat = rand_mat[0:2,0:3].copy()
sub_mat[:] = 3
print(sub_mat)
print(rand_mat)

[[ 0.03938222 0.34715163 -0.09053038 -2.83384359 0.6186174 ]
 [-0.48579697 -0.92822472 -0.37826625 -2.09295923 0.98692792]
 [ 0.15768358 1.50889694 -2.19241275 -1.99195257 1.07264506]
 [ 1.05162077 0.07893225 -0.21402247 -2.02199934 0.42356743]
 [-0.08313828 -0.54115578 0.87615438 0.44919499 -0.21109573]]
[[3. 3. 3.]
 [3. 3. 3.]]
[[ 0.03938222 0.34715163 -0.09053038 -2.83384359 0.6186174 ]
 [-0.48579697 -0.92822472 -0.37826625 -2.09295923 0.98692792]
 [ 0.15768358 1.50889694 -2.19241275 -1.99195257 1.07264506]
 [ 1.05162077 0.07893225 -0.21402247 -2.02199934 0.42356743]
 [-0.08313828 -0.54115578 0.87615438 0.44919499 -0.21109573]]
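Whether two arrays share the same underlying data can be checked rather than guessed. The sketch below uses np.shares_memory on a small integer matrix; the variable names are illustrative only.

```python
import numpy as np

mat = np.arange(25).reshape(5, 5)

sub_view = mat[0:2, 0:3]         # a slice: a view onto mat's data
sub_copy = mat[0:2, 0:3].copy()  # an independent copy

shares_view = np.shares_memory(mat, sub_view)
shares_copy = np.shares_memory(mat, sub_copy)
print('Slice shares memory with mat:', shares_view)  # True
print('Copy shares memory with mat:', shares_copy)   # False

# modifying the copy leaves mat untouched
sub_copy[:] = 3
print(mat[0:2, 0:3])
```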
Let’s save some NumPy objects on the disk for use later!
# mounting Google Drive (this step applies only when running in Google Colab)
from google.colab import drive
drive.mount('/content/drive')

# creating two random matrices
randint_matrix1 = np.random.randint(1,10,10).reshape(2,5)
print(randint_matrix1)
print('')
randint_matrix2 = np.random.randint(10,20,10).reshape(2,5)
print(randint_matrix2)
Using np.save() function
np.save('/content/drive/MyDrive/Python Course/saved_file_name',randint_matrix1)
Using np.savez() function
np.savez('/content/drive/MyDrive/Python Course/multiple_files',randint_matrix1=randint_matrix1,randint_matrix2=randint_matrix2)
# now let's load it
loaded_arr = np.load('/content/drive/MyDrive/Python Course/saved_file_name.npy')
loaded_multi = np.load('/content/drive/MyDrive/Python Course/multiple_files.npz')
print(loaded_arr)
print('')
print(loaded_multi)

print('1st Matrix: \n',loaded_multi['randint_matrix1'])
print('2nd Matrix: \n',loaded_multi['randint_matrix2'])
new_matrix = loaded_multi['randint_matrix1']
print('New Matrix: \n',new_matrix)

# we can also save/load text files...but only one array per file
# note: np.loadtxt returns floats by default, even if integers were saved
np.savetxt('/content/drive/MyDrive/Python Course/text_file_name.txt',randint_matrix1,delimiter=',')
rand_mat_txt = np.loadtxt('/content/drive/MyDrive/Python Course/text_file_name.txt',delimiter=',')
print(randint_matrix1)
print('')
print(rand_mat_txt)
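The examples above write to a Google Drive folder. For readers who are not on Colab, the sketch below does the same round trip in a temporary directory and confirms that the loaded arrays match the saved ones. The file names and the keys first and second are made up for this illustration.

```python
import os
import tempfile

import numpy as np

matrix = np.arange(10).reshape(2, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    # single array: np.save writes a .npy file
    npy_path = os.path.join(tmp_dir, 'matrix.npy')
    np.save(npy_path, matrix)
    loaded = np.load(npy_path)

    # several arrays: np.savez writes a .npz archive keyed by name
    npz_path = os.path.join(tmp_dir, 'matrices.npz')
    np.savez(npz_path, first=matrix, second=matrix * 2)
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        second = archive['second']

print('Round trip matches:', np.array_equal(loaded, matrix))
print('Arrays in the archive:', names)
print(second)
```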