Linear Regression Using Tensorflow

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. You can read more about what is regression and type of regression and about linear regression

Generally, you wont be using TensorFlow for problems like Linear regression, It can be best addressed by skitlearn, scipy libraries, however this is great starting point to understand TensorFlow.

Here is the code

################################################################################################
# name: TensorFlow_Linear_Regression_01.py
# desc: Linear Regression using TensorFlow
# date: 2019-02-03
################################################################################################
import tensorflow as tf
import numpy as np
from matplotlib import pyplot as plt

print('*** Program Started ***')
########## Input Data Creation
n = 20
x = np.arange(-n/2,n/2,1,dtype=np.float64)

m_real = np.random.uniform(0.8,0.9,(n,))
b_real = np.random.uniform(5,10,(n,))
print('m_real', type(m_real[0]))
y = x*m_real +b_real

########## Variables definition
m = tf.Variable(np.random.uniform(5,15,(1,)))
b = tf.Variable(np.random.uniform(5,15,(1,)))

########## display inout data and datatypes
print('x', x, type(x), type(x[0]))
print('y', y, type(y), type(y[0]))
print('m', m, type(m))
print('b', b, type(b))

########## Plot input to see the data
# plt.scatter(x,y,s=None, marker='o',color='g',edgecolors='g',alpha=0.9,label="Linear Relation")
# plt.grid(color='black', linestyle='--', linewidth=0.5,markevery=int)
# plt.legend(loc=2)
# plt.axis('scaled')
# plt.show()

########## Compute model and loss
loss = tf.reduce_mean(tf.pow(model - y, 2))

########## Use following model if you get TypeError
# model = tf.add(tf.multiply(x, tf.cast(m, tf.float64)), tf.cast(b, tf.float64))
# loss = tf.reduce_mean(tf.pow(model - tf.cast(y, tf.float64), 2))
###########################################################################################

# Create optimizer
learn_rate = 0.01 # you can use 0.1/0.01/0.001 to test the output
num_epochs = 500 # Test output accuracy for different epochs
num_batches = n

########## Initialize variables
init = tf.global_variables_initializer()

########## Launch session
with tf.Session() as sess:
sess.run(init)
print('*** Initialize')

########## This is where training happens
for epoch in range(num_epochs):
for batch in range(num_batches):
sess.run(optimizer)

########## Display and plot results
print('m = ', sess.run(m))
print('b = ', sess.run(b))

x1 = np.linspace(-10,10,50)
y1 = sess.run(m)*x1+sess.run(b)

plt.scatter(x,y,s=None, marker='o',color='g',edgecolors='g',alpha=0.9,label="Linear Relation")
plt.grid(color='black', linestyle='--', linewidth=0.5,markevery=int)
plt.legend(loc=2)
plt.axis('scaled')

plt.plot(x1, y1, 'r')
plt.savefig('TensorFlow_Linear_Regression_01.png')
plt.show()

print('*** Program ended ***')



You can change the input and see the output. If you get NaN value in TensorFlow output, please change 0.01 to 0.001 in following line

optimizer = tf.train.GradientDescentOptimizer(0.01)

Here is the output

*** Program Started ***
m_real <class 'numpy.float64'>
x [-10.  -9.  -8.  -7.  -6.  -5.  -4.  -3.  -2.  -1.   0.   1.   2.   3.
4.   5.   6.   7.   8.   9.] <class 'numpy.ndarray'> <class 'numpy.float64'>
y [-0.12267011  1.99923466 -1.82417449  3.70960816 -0.07838254  2.49865561
6.01521568  4.72467689  4.26350466  6.29306134  6.56424532  6.37343995
9.1530143   9.99292287 13.1932482   9.23547055 11.28963328 12.00597972
14.64760425 14.58158682] <class 'numpy.ndarray'> <class 'numpy.float64'>
m <tf.Variable 'Variable:0' shape=(1,) dtype=float64_ref> <class 'tensorflow.python.ops.variables.RefVariable'>
b <tf.Variable 'Variable_1:0' shape=(1,) dtype=float64_ref> <class 'tensorflow.python.ops.variables.RefVariable'>
2019-02-03 16:10:20.898092: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
*** Initialize
m =  [0.79374898]
b =  [7.12266825]
*** Program ended ***

You can ignore the line “2019-02-03 16:10:20.898092: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA ” We will discuss about it later.

TensorFlow Tutorial : Basics

In TensorFlow, the term tensor refers to the representation of data as multi-dimensional array whereas the term flow refers to the series of operations that one performs on tensors.

In TensorFlow, computation is described using a sort of flowchart of operations, called as data flow graphs. Each node of the graph represents an instance of a mathematical operation (like addition, division, or multiplication) and each edge is a multi-dimensional data set (tensor) on which the operations are performed. The input goes in at one end, and then it flows through this system of multiple operations and comes out the other end as output.

A tensor is a vector or matrix of n-dimensions that represents all types of data. All values in a tensor hold identical data type with a known (or partially known) shape. The shape of the data is the dimensionality of the matrix or array. tensors are just multidimensional arrays, that allows you to represent data having higher dimensions. In general, Deep Learning you deal with high dimensional data sets where dimensions refer to different features present in the data set.

• 0-d tensor: scalar (number)
• 1-d tensor: vector
• 2-d tensor: matrix

Constants

If you need constants with specific values inside your training model, then the constant object can be used

rate = tf.constant(15.2, name="rate", dtype=tf.float32)

Variables

Variables in TensorFlow are in-memory buffers containing tensors which have to be explicitly initialized and used in-graph to maintain state across session. By simply calling the constructor the variable is added in computational graph.

name = tf.Variable("techtrekking.com", name="name")

The graph is a set of computation that takes place successively. TensorFlow makes use of a graph framework. The graph gathers and describes all the series computations done during the training

Each operation is called an op node and are connected to each other.

A placeholder is TensorFlow’s way of allowing developers to inject data into the computation graph through placeholders which are bound inside some expressions. they allow developers to create operations, and the computational graph in general, without needing to provide the data in advance for that, and the data can be added in runtime from external sources.

distance = tf.placeholder(tf.float32, name="distance")

A Session object encapsulates the environment in which Operation objects are executed, and Tensor objects are evaluated.  In order to actually evaluate the nodes, we must run a computational graph within a session.

A session encapsulates the control and state of the TensorFlow runtime

Common functions

TensorFlow operator Description
tf.subtract x-y
tf.multiply x*y
tf.div x/y
tf.mod x % y
tf.abs |x|
tf.negative -x
tf.sign sign(x)
tf.square x*x
tf.round round(x)
tf.sqrt sqrt(x)
tf.pow x^y
tf.exp e^x
tf.log log(x)
tf.maximum max(x, y)
tf.minimum min(x, y)
tf.cos cos(x)
tf.sin sin(x)

The TensorBoard enables to monitor graphically and visually what TensorFlow is doing. This can be useful for gaining better understanding of machine learning models. We will look at TensorBoard in separate article

TensorFlow : Getting Started

TensorFlow is an open source machine learning framework for everyone. TensorFlow was developed by the Google Brain team for internal Google use. It was released under the Apache 2.0 open-source license. The reason this framework is critical is because its used by Google in production.

There is lots of fluff in data science and the one company which has actually used data science at a scale is google. Right from google search, to google photos to YouTube videos, Google has done amazing things with data science.

Please check how to install TensorFlow to get it installed. Once its installed, we will take a look at some basic and simplest program to get you started.

Here is most basic example of simple multiplication of two numbers

Here is output

python tensorflow_basics_01.py
*** Program Started ***
Tensor("Mul:0", shape=(), dtype=int32)
2019-01-19 21:23:01.021124: I tensorflow/core/platform/cpu_feature_guard.cc:141]
Your CPU supports instructions that this TensorFlow binary was not compiled to
use: AVX2
14
*** Program Ended ***

Please note that when we printed result its displayed Tensor(“Mul:0”, shape=(), dtype=int32). This is because tensorflow has not yet run. It has generated simply graphs. This is also called as lazy evaluation. We need to create session and then run session to get the output.

Let us have a look at another simple program doing multiplication of matrices