Brief History of Machine Learning

The term “Machine Learning” was coined by Arthur Samuel in 1959 while he was at IBM.

Brief History of ML

Date Details
1950 Alan Turing creates the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human.
1950 Arthur Samuel wrote the first computer learning program. The program was the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program.
1957 Frank Rosenblatt designed the first neural network for computers (the perceptron).
1967 The “nearest neighbor” algorithm was written, allowing computers to begin using very basic pattern recognition. This could be used to map a route for traveling salesmen, starting at a random city but ensuring they visit all cities during a short tour.
1979 Students at Stanford University invent the “Stanford Cart” which can navigate obstacles in a room on its own.
1981 Gerald Dejong introduces the concept of Explanation Based Learning (EBL), in which a computer analyses training data and creates a general rule it can follow by discarding unimportant data.
1985 Terry Sejnowski invents NetTalk, which learns to pronounce words the same way a baby does.
1997 IBM’s Deep Blue beats the world champion at chess.
2006 Geoffrey Hinton coins the term “deep learning” to explain new algorithms that let computers “see” and distinguish objects and text in images and videos.
2008 DJ Patil and Jeff Hammerbacher coined the term “Data Scientist”.
2011 IBM’s Watson beats its human competitors at Jeopardy.
2012 Google’s X Lab develops a machine learning algorithm that is able to autonomously browse YouTube videos to identify the videos that contain cats.
2014 Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals in photos as well as humans can.
2016 Google’s artificial intelligence algorithm beats a professional player at the Chinese board game Go, which is considered the world’s most complex board game and is many times harder than chess.

 

According to Michael I. Jordan, the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics. He also suggested the term data science as a placeholder to call the overall field. You can refer to one of the most famous Venn diagrams for Data Science below.

How is ML different from AI?

In the early days of AI, an increasing emphasis on the logical, knowledge-based approach caused a rift between AI and machine learning. By 1980, expert systems had come to dominate AI, and statistics was out of favor.

Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics and probability theory. It also benefited from the increasing availability of digitized information, and the ability to distribute it via the Internet.

Here is another famous Venn diagram.

Hal Varian, Google’s chief economist, predicted in 2008 that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them. Data are becoming the new raw material of business: an economic input almost on a par with capital and labour.

Machine Learning is a peer-reviewed scientific journal that has been published since 1986.

Further reading

  1. https://en.wikipedia.org/wiki/Machine_learning
  2. A Very Short History Of Data Science
  3. Data, data everywhere

Linear Regression Using Tensorflow

Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. You can read more elsewhere about what regression is and about the different types of regression, including linear regression.

Generally, you won't be using TensorFlow for problems like linear regression; they are best addressed by libraries such as scikit-learn and SciPy. However, linear regression is a great starting point for understanding TensorFlow.
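As a quick comparison, here is a minimal sketch of the same kind of fit using scikit-learn (this assumes scikit-learn is installed; the data-generation values simply mirror the TensorFlow script below):

import numpy as np
from sklearn.linear_model import LinearRegression

# Generate noisy linear data, mirroring the TensorFlow script below
n = 20
x = np.arange(-n/2, n/2, 1, dtype=np.float64)
y = x*np.random.uniform(0.8, 0.9, (n,)) + np.random.uniform(5, 10, (n,))

# scikit-learn expects a 2-D feature array, hence the reshape
reg = LinearRegression()
reg.fit(x.reshape(-1, 1), y)
print('m = ', reg.coef_[0])      # fitted slope
print('b = ', reg.intercept_)    # fitted intercept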

Here is the TensorFlow code

################################################################################################
# name: TensorFlow_Linear_Regression_01.py
# desc: Linear Regression using TensorFlow
# date: 2019-02-03
# Author: conquistadorjd
################################################################################################
import tensorflow as tf
import numpy as np
from matplotlib import pyplot as plt

print('*** Program Started ***')
########## Input Data Creation
n = 20
x = np.arange(-n/2,n/2,1,dtype=np.float64)

m_real = np.random.uniform(0.8,0.9,(n,))
b_real = np.random.uniform(5,10,(n,))
print('m_real', type(m_real[0]))
y = x*m_real +b_real

########## Variables definition
m = tf.Variable(np.random.uniform(5,15,(1,)))
b = tf.Variable(np.random.uniform(5,15,(1,)))

########## Display input data and datatypes
print('x', x, type(x), type(x[0]))
print('y', y, type(y), type(y[0]))
print('m', m, type(m))
print('b', b, type(b))

########## Plot input to see the data
# plt.scatter(x,y,s=None, marker='o',color='g',edgecolors='g',alpha=0.9,label="Linear Relation")
# plt.grid(color='black', linestyle='--', linewidth=0.5,markevery=int)
# plt.legend(loc=2)
# plt.axis('scaled')
# plt.show()

########## Compute model and loss
model = tf.add(tf.multiply(x,m), b)
loss = tf.reduce_mean(tf.pow(model - y, 2))

########## Use following model if you get TypeError
# model = tf.add(tf.multiply(x, tf.cast(m, tf.float64)), tf.cast(b, tf.float64))
# loss = tf.reduce_mean(tf.pow(model - tf.cast(y, tf.float64), 2))
###########################################################################################

# Create optimizer
learn_rate = 0.01 # you can use 0.1/0.01/0.001 to test the output
num_epochs = 500 # Test output accuracy for different epochs
num_batches = n
optimizer = tf.train.GradientDescentOptimizer(learn_rate).minimize(loss)

########## Initialize variables
init = tf.global_variables_initializer()

########## Launch session
with tf.Session() as sess:
    sess.run(init)
    print('*** Initialize')

    ########## This is where training happens
    for epoch in range(num_epochs):
        for batch in range(num_batches):
            sess.run(optimizer)

    ########## Display and plot results
    print('m = ', sess.run(m))
    print('b = ', sess.run(b))

    x1 = np.linspace(-10,10,50)
    y1 = sess.run(m)*x1+sess.run(b)

    plt.scatter(x,y,s=None, marker='o',color='g',edgecolors='g',alpha=0.9,label="Linear Relation")
    plt.grid(color='black', linestyle='--', linewidth=0.5,markevery=int)
    plt.legend(loc=2)
    plt.axis('scaled')

    plt.plot(x1, y1, 'r')
    plt.savefig('TensorFlow_Linear_Regression_01.png')
    plt.show()

print('*** Program ended ***')

 

You can change the input and observe the output. If you get NaN values in the TensorFlow output, reduce the learning rate from 0.01 to 0.001 in the following line of the script

learn_rate = 0.01 # you can use 0.1/0.01/0.001 to test the output

 

Here is the output

*** Program Started ***
m_real <class 'numpy.float64'>
x [-10.  -9.  -8.  -7.  -6.  -5.  -4.  -3.  -2.  -1.   0.   1.   2.   3.
   4.   5.   6.   7.   8.   9.] <class 'numpy.ndarray'> <class 'numpy.float64'>
y [-0.12267011  1.99923466 -1.82417449  3.70960816 -0.07838254  2.49865561
  6.01521568  4.72467689  4.26350466  6.29306134  6.56424532  6.37343995
  9.1530143   9.99292287 13.1932482   9.23547055 11.28963328 12.00597972
 14.64760425 14.58158682] <class 'numpy.ndarray'> <class 'numpy.float64'>
m <tf.Variable 'Variable:0' shape=(1,) dtype=float64_ref> <class 'tensorflow.python.ops.variables.RefVariable'>
b <tf.Variable 'Variable_1:0' shape=(1,) dtype=float64_ref> <class 'tensorflow.python.ops.variables.RefVariable'>
2019-02-03 16:10:20.898092: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
*** Initialize
m =  [0.79374898]
b =  [7.12266825]
*** Program ended ***

You can ignore the line “2019-02-03 16:10:20.898092: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA”. We will discuss it in a later article.

TensorFlow Tutorial: Basics

In TensorFlow, the term tensor refers to the representation of data as a multi-dimensional array, whereas the term flow refers to the series of operations that one performs on tensors.

In TensorFlow, computation is described using a sort of flowchart of operations, called a data flow graph. Each node of the graph represents an instance of a mathematical operation (like addition, division, or multiplication) and each edge is a multi-dimensional data set (tensor) on which the operations are performed. The input goes in at one end, flows through this system of multiple operations, and comes out the other end as output.

A tensor is an n-dimensional vector or matrix that can represent all types of data. All values in a tensor hold the same data type, and the tensor has a known (or partially known) shape. The shape of the data is the dimensionality of the matrix or array. In short, tensors are just multidimensional arrays that allow you to represent data with higher dimensions. In deep learning you generally deal with high-dimensional data sets, where the dimensions refer to the different features present in the data set. For example:

  • 0-d tensor: scalar (number)
  • 1-d tensor: vector
  • 2-d tensor: matrix
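As a small illustration (using the TensorFlow 1.x API that the rest of this post assumes), the sketch below creates one tensor of each rank and prints its shape:

import tensorflow as tf

scalar = tf.constant(7)                    # 0-d tensor, shape ()
vector = tf.constant([1.0, 2.0, 3.0])      # 1-d tensor, shape (3,)
matrix = tf.constant([[1, 2], [3, 4]])     # 2-d tensor, shape (2, 2)

print(scalar.shape, vector.shape, matrix.shape)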

Constants

If you need constants with specific values inside your training model, then the constant object can be used:

rate = tf.constant(15.2, name="rate", dtype=tf.float32)

Variables

Variables in TensorFlow are in-memory buffers containing tensors; they have to be explicitly initialized and are used in-graph to maintain state across sessions. Simply calling the constructor adds the variable to the computational graph.

name = tf.Variable("techtrekking.com", name="name")

The graph is a set of computations that take place successively. TensorFlow makes use of a graph framework: the graph gathers and describes all the series of computations done during training.

Each operation is called an op node, and op nodes are connected to each other.

A placeholder is TensorFlow’s way of allowing developers to inject data into the computation graph; placeholders are bound inside some expressions. They allow developers to create operations, and the computational graph in general, without needing to provide the data in advance; the data can be supplied at runtime from external sources.

distance = tf.placeholder(tf.float32, name="distance")

A Session object encapsulates the environment in which Operation objects are executed, and Tensor objects are evaluated.  In order to actually evaluate the nodes, we must run a computational graph within a session.

A session encapsulates the control and state of the TensorFlow runtime.
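The minimal sketch below ties these pieces together using the TensorFlow 1.x API: a constant, a variable and a placeholder are combined into a small graph, the variable is initialized, and the graph is evaluated inside a session with the placeholder fed at run time. The base_fare variable and the fare computation are purely illustrative.

import tensorflow as tf

rate = tf.constant(15.2, name="rate", dtype=tf.float32)   # constant
base_fare = tf.Variable(50.0, name="base_fare")           # illustrative variable (must be initialized)
distance = tf.placeholder(tf.float32, name="distance")    # placeholder (fed at run time)

fare = tf.add(tf.multiply(rate, distance), base_fare)     # fare = rate*distance + base_fare

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())           # initialize all variables
    print(sess.run(fare, feed_dict={distance: 12.0}))     # 15.2*12.0 + 50.0 = 232.4 (float32)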

Common functions

TensorFlow operator Description
tf.add x+y
tf.subtract x-y
tf.multiply x*y
tf.div x/y
tf.mod x % y
tf.abs |x|
tf.negative -x
tf.sign sign(x)
tf.square x*x
tf.round round(x)
tf.sqrt sqrt(x)
tf.pow x^y
tf.exp e^x
tf.log log(x)
tf.maximum max(x, y)
tf.minimum min(x, y)
tf.cos cos(x)
tf.sin sin(x)

TensorBoard enables you to monitor graphically and visually what TensorFlow is doing. This can be useful for gaining a better understanding of machine learning models. We will look at TensorBoard in a separate article.

TensorFlow: Getting Started

TensorFlow is an open source machine learning framework for everyone. TensorFlow was developed by the Google Brain team for internal Google use and was released under the Apache 2.0 open-source license. The reason this framework is critical is that it is used by Google in production.

There is a lot of fluff in data science, and the one company that has actually used data science at scale is Google. From Google Search to Google Photos to YouTube videos, Google has done amazing things with data science.

Please check the post on how to install TensorFlow to get it set up. Once it's installed, we will take a look at some of the most basic programs to get you started.

Here is the most basic example: simple multiplication of two numbers.
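The original code embed is not reproduced here, so the following is a minimal sketch consistent with the output shown below; it assumes the two inputs are 2 and 7 (any two integers will do) and the TensorFlow 1.x API:

import tensorflow as tf

print('*** Program Started ***')

a = tf.constant(2)             # assumed input values
b = tf.constant(7)
result = tf.multiply(a, b)

print(result)                  # prints the tensor description, not the value

with tf.Session() as sess:
    print(sess.run(result))    # 14

print('*** Program Ended ***')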

Here is the output

python tensorflow_basics_01.py
*** Program Started ***
Tensor("Mul:0", shape=(), dtype=int32)
2019-01-19 21:23:01.021124: I tensorflow/core/platform/cpu_feature_guard.cc:141]
Your CPU supports instructions that this TensorFlow binary was not compiled to
use: AVX2
14
*** Program Ended ***

Please note that when we printed the result, it displayed Tensor("Mul:0", shape=(), dtype=int32). This is because TensorFlow has not yet run the computation; it has only generated the graph. This is also called lazy evaluation. We need to create a session and then run it to get the output.

Let us have a look at another simple program that multiplies two matrices.
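The matrix example embed is likewise not reproduced here; a minimal sketch, again assuming the TensorFlow 1.x API and two small illustrative matrices, could look like this:

import tensorflow as tf

m1 = tf.constant([[1, 2], [3, 4]])    # 2x2 matrices with illustrative values
m2 = tf.constant([[5, 6], [7, 8]])

product = tf.matmul(m1, m2)           # matrix multiplication

with tf.Session() as sess:
    print(sess.run(product))          # [[19 22]
                                      #  [43 50]]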

 

Adding and fetching hyperlinks in Microsoft Excel

Microsoft Excel is the world's most widely used data analytics tool. It was widely used even before the term data analytics came into widespread usage.

Fetching a hyperlink from a hyperlinked cell

Many times we have to deal with hyperlinks in Excel. If you have text from which you want to extract the hyperlink, there is no ready-made formula available. However, you can do it in multiple ways; the simplest way is to create your own custom function:

  1. Press Alt+F11
  2. Insert -> Module
  3. Add the following code
  4. Function GetURL(cell As Range, Optional default_value As Variant)
    'Lists the Hyperlink Address for a Given Cell
    'If cell does not contain a hyperlink, return default_value
        If (cell.Range("A1").Hyperlinks.Count <> 1) Then
          GetURL = default_value
        Else
          GetURL = cell.Range("A1").Hyperlinks(1).Address
        End If
    End Function

     

  5. Type =GetURL( in any cell, select the cell containing the hyperlink, and close the bracket; the formula will return just the hyperlink address.

How to add a hyperlink to a cell

If you have a column of hyperlink addresses in your Excel sheet and you want to attach a hyperlink to some text, this can be done with Excel's built-in HYPERLINK formula, as sketched below.
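A minimal example, assuming the URL sits in cell A2 and the display text in cell B2 (both cell references are illustrative):

=HYPERLINK(A2, B2)

The first argument is the link location and the optional second argument is the friendly text displayed in the cell.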

How to install TensorFlow on Windows

TensorFlow is very easy to install. Let us look at how to get started with TensorFlow.

pip install tensorflow
C:\Users\ABCDEFG>pip install tensorflow
Collecting tensorflow
  Downloading https://files.pythonhosted.org/packages/05/cd/c171d2e33c0192b04560
ce864c26eba83fed888fe5cd9ded661b2702f2ae/tensorflow-1.12.0-cp36-cp36m-win_amd64.
whl (45.9MB)
    71% |██████████████████████          | 32.6MB 119kB/s eta 0:01:52
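Once the install finishes, a quick way to verify it (assuming Python is on your PATH) is to import TensorFlow and print its version; for the wheel shown above this should print 1.12.0:

python -c "import tensorflow as tf; print(tf.__version__)"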

 

How to restrict plugin access to multisite

WordPress multisite is a great tool; it makes creating a network of blogs much easier. However, you need to be careful when allowing users full control, as WordPress plugins can be misused by sub-sites.

If you want to disable plugin access, you can do so very easily. Log in as a super admin and go to

Network Admin –> Settings –> Network Settings

Scroll all the way down and un-check the Plugins checkbox.

How to clone a gist

Gist is one of the most efficient ways to share code snippets, single files and full applications with other people. One disadvantage of gist is that you can't share directories, but this is not a major issue considering gist is primarily used to share code snippets.

If you want to make local changes to a gist and push them up to the web, you can clone the gist, make changes, and then commit them. It is exactly the same process as with any Git repository.

Let us look at how to clone a gist repository using HTTPS.

Go to the gist page and copy the HTTPS clone link.

Use the following command to clone the repository:

$ git clone https://gist.github.com/820c117b75d52514b2e58008be07a6eb.git
Cloning into '820c117b75d52514b2e58008be07a6eb'...
remote: Enumerating objects: 44, done.
remote: Total 44 (delta 0), reused 0 (delta 0), pack-reused 44
Unpacking objects: 100% (44/44), done.
Checking connectivity... done.

That is it. You are done. You can cd into the folder and check the files.

How to publish code to a gist repo along with a GitHub repo

Previously I was keeping code in the post itself using code-formatting add-ons, but then I realised this was very inefficient. While searching for the best way to add code to a post, I discovered GitHub gist. It is very useful for adding code snippets to a blog.

However, while using gist, I stumbled across a problem: I had to maintain the same code in two places, once in the GitHub repository and again in the gist. I often update my old code, and this led to the problem of keeping both copies in sync. The GitHub repository was updated by git push, but I had to update the gist manually. I did not spend much time looking for a solution since it was not costing me much time; nonetheless it was awkward, and I made a mental note to find a workaround later.

Today I did a bit of research and found a couple of methods. The following is the most efficient method that I found.

Step#1 Create your Github repo as usual.

Please check the post Publish local repository to Github using https for instructions on creating a local repository for your GitHub repository.

Step#2 Create a Gist repository and add this as another remote

A gist is nothing but a repository. Create a gist from https://gist.github.com/.

The gist name will be the name of the first file that you create. Keep the first file name the same as that of the GitHub repo whose copy you want to maintain as a gist.

Now we need to add this repository as a remote named gistrepo in your local repository. I chose to add gistrepo using HTTPS as this is the easiest; you can get the link from the gist page.

Use the following command to add the remote gistrepo:

git remote add gistrepo https://gist.github.com/43XXXXXXXXXXXXXXXXXXX2.git

You can check the remotes using the following command:

$ git remote -v
gistrepo https://gist.github.com/43e731ef9297ecfe4d727831a5e75a22.git (fetch)
gistrepo https://gist.github.com/43e731ef9297ecfe4d727831a5e75a22.git (push)
origin https://github.com/conquistadorjd/python-07-ml.git (fetch)
origin https://github.com/conquistadorjd/python-07-ml.git (push)

Step#3 Push changes to gistrepo after pushing changes to origin, i.e. the GitHub repo

First push changes to your Github repository.

$ git add .
$ git commit -m "test changes"
$ git push -u origin master
Username for 'https://github.com': conquistadorjd
Password for 'https://conquistadorjd@github.com': 
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.49 KiB | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To https://github.com/conquistadorjd/python-07-ml.git
c813187..5c0472f master -> master
Branch master set up to track remote branch master from origin.

Once done, push these changes to the gistrepo remote. While pushing the changes, please make sure you use -f instead of -u; the gist was created with its own initial commit, so its history differs from the GitHub repo and the push has to be forced.

$ git push -f gistrepo master
Username for 'https://gist.github.com': conquistadorjd
Password for 'https://conquistadorjd@gist.github.com': 
Counting objects: 12, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (10/10), done.
Writing objects: 100% (12/12), 2.59 KiB | 0 bytes/s, done.
Total 12 (delta 4), reused 0 (delta 0)
remote: Resolving deltas: 100% (4/4), done.
To https://gist.github.com/43e731ef9297ecfe4d727831a5e75a22.git
+ cecff3f...5c0472f master -> master (forced update)

Now go and check your gist and your GitHub repo; both will be in sync, and wherever you have embedded the code, it will be updated.

Understanding Google Cloud Platform and Services

Google Cloud and the varied services provided on top of it can be confusing to comprehend. It's not just Google; visit AWS or Azure and it's just as confusing. Rather, I would say that since each cloud provider uses its own nomenclature for its services, it becomes difficult to remember what does what. Of course, once you start using them, it will become natural to you, but when you are getting started, this is frustrating. I have just completed the course How Google does Machine Learning, so I thought I would make a very simple note for myself and for others to keep track of it.

There are many services offered by Google, but beginners do not need all of them to get started, so I will split this post into two parts.

Part-1 : Getting started

Google Cloud Platform is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products, such as Google Search and YouTube.

Compute Engine – Infrastructure as a Service to run Microsoft Windows and Linux virtual machines.

App Engine – Platform as a Service to deploy Java, PHP, Node.js, Python, C#, .Net, Ruby and Go applications.

Cloud Storage – Object storage with integrated edge caching to store unstructured data.

Cloud Datalab – Tool for data exploration, analysis, visualization and machine learning. This is a fully managed Jupyter Notebook service.

BigQuery – Scalable, managed enterprise data warehouse for analytics.

Cloud Natural Language – Text analysis service based on Google Deep Learning models.
https://cloud.google.com/natural-language/

Cloud Speech-to-Text – Speech to text conversion service based on machine learning.
https://cloud.google.com/speech-to-text/

Cloud Text-to-Speech – Text to speech conversion service based on machine learning.
https://cloud.google.com/text-to-speech/

Cloud Translation API – Service to dynamically translate between thousands of available language pairs
https://cloud.google.com/translate/

Cloud Vision API – Image analysis service based on machine learning
https://cloud.google.com/vision/

Cloud Video Intelligence API – Search and discover your media content with Cloud Video Intelligence.
https://cloud.google.com/video-intelligence/