Python: NumPy Basics

NumPy stands for Numerical Python.

NumPy has many uses, including:

  • Efficiently working with many numbers at once
  • Generating random numbers
  • Computing many different numerical functions (e.g., sin, cos, tan, mean, and median)
Importing NumPy

The conventional way to import NumPy is under the alias np:

import numpy as np  
NumPy Arrays

A NumPy array is a list-like container in which every item shares a single data type (its dtype).
You can build one from items of mixed types (say, numbers and strings), but NumPy converts them all to one common type rather than keeping each item as it was.
It's best suited to numbers, where it gives you extra power for mathematical operations.
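To see how an array settles on one element type, here is a minimal sketch. The variable names are made up for illustration, and the exact integer width (for example int64) depends on your platform, so only the general kind is checked.

```python
import numpy as np

# All-integer input keeps an integer dtype.
ints = np.array([1, 2, 3])

# Mixing ints and floats upcasts every element to float.
mixed_numbers = np.array([1, 2.5, 3])

# Mixing numbers and strings converts every element to a string.
mixed_text = np.array([1, "two", 3.0])

print(ints.dtype.kind)        # 'i' (an integer type)
print(mixed_numbers.dtype)    # float64
print(mixed_text.dtype.kind)  # 'U' (a Unicode string type)
```

This is why arrays work best when the data is numeric from the start: once everything has been turned into strings, arithmetic is no longer available.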

import numpy as np

test_1 = np.array([92, 94, 88, 91, 87])  
print(test_1)  
print(type(test_1))

my_list = [1, 2, 3, 4, 5, 6]  
my_array = np.array(my_list)

print(my_list)  
print(type(my_list))  
print(my_array)  
print(type(my_array))  

The output is as follows:

[92 94 88 91 87]
<class 'numpy.ndarray'>
[1, 2, 3, 4, 5, 6]
<class 'list'>
[1 2 3 4 5 6]
<class 'numpy.ndarray'>
Reading from a file

You can also build an array directly from a delimited text file such as a CSV:
csv_array = np.genfromtxt('sample.csv', delimiter=',')  
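If you do not have a sample.csv at hand, the same call can be tried on an in-memory file. This is only a sketch: the CSV contents below are invented, and io.StringIO stands in for a real file on disk.

```python
import io
import numpy as np

# Stand-in for the contents of a file such as sample.csv (made-up data).
csv_text = "1,2,3\n4,5,6\n"

# genfromtxt accepts a file path or a file-like object.
csv_array = np.genfromtxt(io.StringIO(csv_text), delimiter=',')

print(csv_array.shape)  # (2, 3)
print(csv_array.dtype)  # float64
```

Note that genfromtxt reads values as floats by default, even when the file contains only whole numbers.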

NumPy arrays are more efficient than lists for numerical work. One reason is that they support element-wise operations: a single expression is applied to every element at once, with no explicit loop.

# With a list
l = [1, 2, 3, 4, 5]  
l_plus_3 = []  
for i in range(len(l)):  
    l_plus_3.append(l[i] + 3)
# With an array
a = np.array(l)  
a_plus_3 = a + 3  
# Functions such as np.sqrt are applied element-wise too
>>> np.sqrt(a)
array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798])
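Element-wise operations also work between two arrays of the same length, and with comparisons. A small sketch, using hypothetical price and quantity values that are not part of the examples above:

```python
import numpy as np

prices = np.array([2.0, 4.0, 6.0])   # hypothetical values
quantities = np.array([3, 1, 2])     # hypothetical values

totals = prices * quantities         # multiplies matching positions
discounted = prices - 0.5            # the scalar is applied to every element
is_expensive = prices > 3.0          # a comparison yields a boolean array

print(totals.tolist())        # [6.0, 4.0, 12.0]
print(discounted.tolist())    # [1.5, 3.5, 5.5]
print(is_expensive.tolist())  # [False, True, True]
```

The boolean array produced by a comparison is what makes expressions like np.mean(some_array >= 4) possible.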
Statistics with NumPy

Let's consider the following data set:

import numpy as np

water_height = np.array([4.01, 4.03, 4.27, 4.29, 4.19,  
                         4.15, 4.16, 4.23, 4.29, 4.19,
                         4.00, 4.22, 4.25, 4.19, 4.10,
                         4.14, 4.03, 4.23, 4.08, 14.20,
                         14.03, 11.20, 8.19, 6.18, 4.04,
                         4.08, 4.11, 4.23, 3.99, 4.23])
# Calculate mean
np.mean(water_height)  
# Sort np array
np.sort(water_height)  
# Find median
np.median(water_height)  
# Find percentile value
np.percentile(water_height, 75)  
# Find standard deviation
np.std(water_height)  
# Fraction of values greater than or equal to 4 (multiply by 100 for a percentage)
np.mean(water_height >= 4)  
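The water_height data contains a few unusually large readings, which is where the mean and the median start to disagree. The sketch below uses a tiny made-up sample (not the data above) so the numbers are easy to check by hand.

```python
import numpy as np

# Four typical readings and one outlier (hypothetical values).
heights = np.array([4.0, 4.0, 4.0, 4.0, 14.0])

mean_height = np.mean(heights)      # pulled upward by the outlier
median_height = np.median(heights)  # unaffected by the outlier

# A comparison gives True/False per element; True counts as 1 and
# False as 0, so the mean of the result is the fraction that is True.
share_above_5 = np.mean(heights > 5)

print(mean_height)    # 6.0
print(median_height)  # 4.0
print(share_above_5)  # 0.2
```

When a data set has outliers like this, the median is often the better summary of a "typical" value.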

This works well for a one-dimensional array. Let's see how it behaves for a two-dimensional array:

ring_toss = np.array([[1, 0, 0],
                      [0, 0, 1],
                      [1, 0, 1]])
# With no axis given, the mean is taken over all nine values
np.mean(ring_toss)  # 0.4444444444444444
# To find the mean of each interior array, we specify axis 1 (the "rows"):
np.mean(ring_toss, axis=1)  
# To find the mean at each index position, we specify axis 0 (the "columns"):
np.mean(ring_toss, axis=0)  
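Because ring_toss is square, it is hard to tell from the result's length which axis was used. The sketch below uses a hypothetical 2-by-3 array instead, so the two results have different lengths.

```python
import numpy as np

# Two rows, three columns (hypothetical scores).
scores = np.array([[1, 0, 0],
                   [1, 1, 0]])

row_means = np.mean(scores, axis=1)  # one value per row
col_means = np.mean(scores, axis=0)  # one value per column

print(row_means.shape)     # (2,)
print(col_means.shape)     # (3,)
print(col_means.tolist())  # [1.0, 0.5, 0.0]
```

One way to remember it: the axis you pass is the one that gets collapsed, so axis=0 collapses the rows and leaves one result per column.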
