How to insert data from pandas to PostgreSQL

When we have finished our data analysis using pandas and need to store the results, we can use the to_csv option. However, if the data is too big, it makes sense to store it in a database.
Let us have a look at two simple methods to store data in a PostgreSQL database.

Using sqlalchemy

import pandas as pd
from sqlalchemy import create_engine

df = pd.read_csv('input_file.csv', parse_dates=['timestamp'], index_col=0)
#df.info()

engine = create_engine('postgresql://admin:admin@localhost:5432/ibnse')
df.to_sql('stk_1min1', engine, if_exists='append')

Using psycopg2

import pandas as pd
import psycopg2

conn = psycopg2.connect(database='dbname', user="user", password="password", host="127.0.0.1", port="5432")
cur = conn.cursor()

df = pd.read_csv('input_file.csv', parse_dates=['timestamp'], index_col=0)
for index, row in df.iterrows():
    # Pass the values as a tuple and let psycopg2 handle the quoting;
    # building the SQL by string concatenation is error-prone.
    insertdata = (str(index),) + tuple(str(v) for v in row)
    try:
        cur.execute("INSERT INTO stk_1min1 VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)", insertdata)
        conn.commit()
        print("row inserted:", insertdata)
    except psycopg2.IntegrityError:
        # A failed statement aborts the transaction; roll back
        # before inserting the next row.
        conn.rollback()
        print("Row already exists")
    except Exception as e:
        conn.rollback()
        print("some insert error:", e, "ins:", insertdata)
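
Row-by-row inserts are slow for large DataFrames. If speed matters, psycopg2's execute_values helper can batch the rows; a minimal sketch, assuming the same stk_1min1 table and column order as above:

from psycopg2.extras import execute_values

# Build a list of tuples, index first, matching the column order of stk_1min1.
rows = [(str(idx),) + tuple(str(v) for v in r) for idx, r in df.iterrows()]

# execute_values expands the single %s placeholder into a multi-row
# VALUES list, which is much faster than one INSERT per row.
execute_values(cur, "INSERT INTO stk_1min1 VALUES %s", rows)
conn.commit()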

A few points to note while using sqlalchemy

  • If the table does not exist, it will be created.
  • If the table exists and you want to append to it, you need to use if_exists='append'. This is the wise choice in most cases.
  • With to_sql, the whole DataFrame insert fails if even one record raises a unique index error, so nothing is inserted. One workaround is to insert in chunks, as in the sketch below.
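
A minimal sketch of chunk-wise appending, assuming the same stk_1min1 table and connection string as above; when a chunk hits a duplicate key, only that chunk is skipped rather than the whole frame:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://admin:admin@localhost:5432/ibnse')
df = pd.read_csv('input_file.csv', parse_dates=['timestamp'], index_col=0)

chunk = 1000
for start in range(0, len(df), chunk):
    part = df.iloc[start:start + chunk]
    try:
        part.to_sql('stk_1min1', engine, if_exists='append')
    except Exception as e:
        # Most likely a duplicate-key error; skip only this chunk.
        print("chunk starting at", start, "failed:", e)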

How to read data from PostgreSQL to Pandas DataFrame

Usually for training and testing we create a pandas DataFrame from a csv file, but when we are working with a large dataset or with data stored in a database, we need a way to fetch it into a pandas DataFrame directly from the database. In this article we will look at two methods for doing this.

Using sqlalchemy

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('postgresql://user:password@localhost:5432/dbName')

df = pd.read_sql_query("SELECT * FROM public.stk_1min1 where symbol = 'TCS'", con=engine)
df.head()

 

You can also build the query before calling the pd.read_sql_query function. It is safer to pass the symbol as a bind parameter via params than to concatenate it into the SQL string:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:password@localhost:5432/dbName')

stk = 'TCS'
query = "SELECT * FROM public.stk_1min1 WHERE symbol = %(symbol)s"

df = pd.read_sql_query(query, con=engine, params={'symbol': stk})
df.head()

Using psycopg2

import pandas as pd
import psycopg2
conn = psycopg2.connect(database='dbname', user="user", password="password", host="127.0.0.1", port="5432")

df = pd.read_sql_query("SELECT * FROM public.stk_1min1 where symbol = 'TCS'", con=conn, index_col=['index'])
df.head()

 

Similar to the sqlalchemy example, you can build the query before calling the read_sql_query function.
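
For tables that are too large to load in one go, read_sql_query also accepts a chunksize argument and then returns an iterator of DataFrames instead of a single frame. A minimal sketch, assuming the same table as above:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:password@localhost:5432/dbName')

# With chunksize set, read_sql_query yields DataFrames of up to
# 10000 rows each, so only one chunk is in memory at a time.
for chunk in pd.read_sql_query("SELECT * FROM public.stk_1min1", con=engine, chunksize=10000):
    print(chunk.shape)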

How to install Jupyter Notebook on Ubuntu

As with any other package, run the following command before installing anything on Ubuntu:

$sudo apt-get update

Once the system is updated to the latest packages, you can install Jupyter using the following command:

$sudo python3.6 -m pip install jupyter
WARNING: The directory '/home/username/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting jupyter
Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Collecting notebook
Downloading notebook-6.0.3-py3-none-any.whl (9.7 MB)
|████████████████████████████████| 9.7 MB 76 kB/s 
Collecting ipywidgets
Downloading ipywidgets-7.5.1-py2.py3-none-any.whl (121 kB)
|████████████████████████████████| 121 kB 294 kB/s 
Collecting nbconvert
Downloading nbconvert-5.6.1-py2.py3-none-any.whl (455 kB)
|████████████████████████████████| 455 kB 192 kB/s 
Collecting ipykernel
Downloading ipykernel-5.2.1-py3-none-any.whl (118 kB)
|████████████████████████████████| 118 kB 75 kB/s 
Collecting qtconsole
Downloading qtconsole-4.7.3-py2.py3-none-any.whl (117 kB)
|████████████████████████████████| 117 kB 126 kB/s 
Collecting jupyter-console
Downloading jupyter_console-6.1.0-py2.py3-none-any.whl (21 kB)
Collecting terminado>=0.8.1
Downloading terminado-0.8.3-py2.py3-none-any.whl (33 kB)
Collecting jupyter-client>=5.3.4
Downloading jupyter_client-6.1.3-py3-none-any.whl (106 kB)
|████████████████████████████████| 106 kB 139 kB/s 
Collecting Send2Trash
Downloading Send2Trash-1.5.0-py3-none-any.whl (12 kB)
Collecting ipython-genutils
Downloading ipython_genutils-0.2.0-py2.py3-none-any.whl (26 kB)
Collecting jinja2
Downloading Jinja2-2.11.2-py2.py3-none-any.whl (125 kB)
|████████████████████████████████| 125 kB 263 kB/s 
Collecting nbformat
Downloading nbformat-5.0.6-py3-none-any.whl (170 kB)
|████████████████████████████████| 170 kB 382 kB/s 
Collecting prometheus-client
Downloading prometheus_client-0.7.1.tar.gz (38 kB)
Collecting tornado>=5.0
Downloading tornado-6.0.4.tar.gz (496 kB)
|████████████████████████████████| 496 kB 99 kB/s 
Collecting traitlets>=4.2.1
Downloading traitlets-4.3.3-py2.py3-none-any.whl (75 kB)
|████████████████████████████████| 75 kB 82 kB/s 
Collecting pyzmq>=17
Downloading pyzmq-19.0.0-cp36-cp36m-manylinux1_x86_64.whl (1.1 MB)
|████████████████████████████████| 1.1 MB 71 kB/s 
Collecting jupyter-core>=4.6.1
Downloading jupyter_core-4.6.3-py2.py3-none-any.whl (83 kB)
|████████████████████████████████| 83 kB 212 kB/s 
Collecting widgetsnbextension~=3.5.0
Downloading widgetsnbextension-3.5.1-py2.py3-none-any.whl (2.2 MB)
|████████████████████████████████| 2.2 MB 178 kB/s 
Collecting ipython>=4.0.0; python_version >= "3.3"
Downloading ipython-7.13.0-py3-none-any.whl (780 kB)
|████████████████████████████████| 780 kB 203 kB/s 
Collecting bleach
Downloading bleach-3.1.5-py2.py3-none-any.whl (151 kB)
|████████████████████████████████| 151 kB 290 kB/s 
Collecting pygments
Downloading Pygments-2.6.1-py3-none-any.whl (914 kB)
|████████████████████████████████| 914 kB 130 kB/s 
Collecting mistune<2,>=0.8.1
Downloading mistune-0.8.4-py2.py3-none-any.whl (16 kB)
Collecting pandocfilters>=1.4.1
Downloading pandocfilters-1.4.2.tar.gz (14 kB)
Collecting testpath
Downloading testpath-0.4.4-py2.py3-none-any.whl (163 kB)
|████████████████████████████████| 163 kB 239 kB/s 
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)",)': /simple/defusedxml/
Collecting defusedxml
Downloading defusedxml-0.6.0-py2.py3-none-any.whl (23 kB)
Collecting entrypoints>=0.2.2
Downloading entrypoints-0.3-py2.py3-none-any.whl (11 kB)
Collecting qtpy
Downloading QtPy-1.9.0-py2.py3-none-any.whl (54 kB)
|████████████████████████████████| 54 kB 36 kB/s 
Collecting prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0
Downloading prompt_toolkit-3.0.5-py3-none-any.whl (351 kB)
|████████████████████████████████| 351 kB 62 kB/s 
Collecting ptyprocess; os_name != "nt"
Downloading ptyprocess-0.6.0-py2.py3-none-any.whl (39 kB)
Requirement already satisfied: python-dateutil>=2.1 in /home/username/.local/lib/python3.6/site-packages (from jupyter-client>=5.3.4->notebook->jupyter) (2.8.1)
Requirement already satisfied: MarkupSafe>=0.23 in /usr/lib/python3/dist-packages (from jinja2->notebook->jupyter) (1.0)
Collecting jsonschema!=2.5.0,>=2.4
Downloading jsonschema-3.2.0-py2.py3-none-any.whl (56 kB)
|████████████████████████████████| 56 kB 106 kB/s 
Requirement already satisfied: six in /home/username/.local/lib/python3.6/site-packages (from traitlets>=4.2.1->notebook->jupyter) (1.14.0)
Collecting decorator
Downloading decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Requirement already satisfied: pexpect; sys_platform != "win32" in /usr/lib/python3/dist-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets->jupyter) (4.2.1)
Collecting pickleshare
Downloading pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)
Requirement already satisfied: setuptools>=18.5 in /home/username/.local/lib/python3.6/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets->jupyter) (45.1.0)
Collecting jedi>=0.10
Downloading jedi-0.17.0-py2.py3-none-any.whl (1.1 MB)
|████████████████████████████████| 1.1 MB 98 kB/s 
Collecting backcall
Downloading backcall-0.1.0.tar.gz (9.7 kB)
Collecting packaging
Downloading packaging-20.3-py2.py3-none-any.whl (37 kB)
Collecting webencodings
Downloading webencodings-0.5.1-py2.py3-none-any.whl (11 kB)
Collecting wcwidth
Downloading wcwidth-0.1.9-py2.py3-none-any.whl (19 kB)
Collecting attrs>=17.4.0
Downloading attrs-19.3.0-py2.py3-none-any.whl (39 kB)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in /usr/local/lib/python3.6/dist-packages (from jsonschema!=2.5.0,>=2.4->nbformat->notebook->jupyter) (1.6.0)
Collecting pyrsistent>=0.14.0
Downloading pyrsistent-0.16.0.tar.gz (108 kB)
|████████████████████████████████| 108 kB 202 kB/s 
Collecting parso>=0.7.0
Downloading parso-0.7.0-py2.py3-none-any.whl (100 kB)
|████████████████████████████████| 100 kB 243 kB/s 
Collecting pyparsing>=2.0.2
Downloading pyparsing-2.4.7-py2.py3-none-any.whl (67 kB)
|████████████████████████████████| 67 kB 75 kB/s 
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.6/dist-packages (from importlib-metadata; python_version < "3.8"->jsonschema!=2.5.0,>=2.4->nbformat->notebook->jupyter) (3.1.0)
Building wheels for collected packages: prometheus-client, tornado, pandocfilters, backcall, pyrsistent
Building wheel for prometheus-client (setup.py) ... done
Created wheel for prometheus-client: filename=prometheus_client-0.7.1-py3-none-any.whl size=41402 sha256=1877caa0c6edd08a07b1fa309bfbdda584bd76e844bdd201b86be6d7d22cafbe
Stored in directory: /tmp/pip-ephem-wheel-cache-8bi_3gi1/wheels/1d/4a/79/a3ad3f74b3495b4555359375ca33ad7b64e77f8b7a53c8894f
Building wheel for tornado (setup.py) ... done
Created wheel for tornado: filename=tornado-6.0.4-cp36-cp36m-linux_x86_64.whl size=427632 sha256=d6aafe6d2604804cf85b680683a70cef28b97f1eb209117421a48c1aa9ab7c68
Stored in directory: /tmp/pip-ephem-wheel-cache-8bi_3gi1/wheels/37/a7/db/2d592e44029ef817f3ef63ea991db34191cebaef087a96f505
Building wheel for pandocfilters (setup.py) ... done
Created wheel for pandocfilters: filename=pandocfilters-1.4.2-py3-none-any.whl size=7855 sha256=422d885d227893b61571ea6ff8f118801dc5949559cdb74af149b10804d8ebbc
Stored in directory: /tmp/pip-ephem-wheel-cache-8bi_3gi1/wheels/46/c4/40/718c6fd14c2129ccaee10e0cf03ef6c4d01d98cad5dbbfda38
Building wheel for backcall (setup.py) ... done
Created wheel for backcall: filename=backcall-0.1.0-py3-none-any.whl size=10412 sha256=dc8f81fbfca6f8b8b1321046de42f718a1f7b1cc1ce997bbc18228a2c9d3c0ff
Stored in directory: /tmp/pip-ephem-wheel-cache-8bi_3gi1/wheels/b4/cb/f1/d142b3bb45d488612cf3943d8a1db090eb95e6687045ba61d1
Building wheel for pyrsistent (setup.py) ... done
Created wheel for pyrsistent: filename=pyrsistent-0.16.0-cp36-cp36m-linux_x86_64.whl size=97738 sha256=f6ffcf09823e6aadbd67a5b06ba106699614a0b3add6f7a0ba92a1e6fa2ea49f
Stored in directory: /tmp/pip-ephem-wheel-cache-8bi_3gi1/wheels/d1/8a/1c/32ab9017418a2c64e4fbaf503c08648bed2f8eb311b869a464
Successfully built prometheus-client tornado pandocfilters backcall pyrsistent
Installing collected packages: ptyprocess, tornado, terminado, ipython-genutils, decorator, traitlets, jupyter-core, pyzmq, jupyter-client, Send2Trash, pyparsing, packaging, webencodings, bleach, pygments, mistune, jinja2, pandocfilters, attrs, pyrsistent, jsonschema, nbformat, testpath, defusedxml, entrypoints, nbconvert, wcwidth, prompt-toolkit, pickleshare, parso, jedi, backcall, ipython, ipykernel, prometheus-client, notebook, widgetsnbextension, ipywidgets, qtpy, qtconsole, jupyter-console, jupyter
Successfully installed Send2Trash-1.5.0 attrs-19.3.0 backcall-0.1.0 bleach-3.1.5 decorator-4.4.2 defusedxml-0.6.0 entrypoints-0.3 ipykernel-5.2.1 ipython-7.13.0 ipython-genutils-0.2.0 ipywidgets-7.5.1 jedi-0.17.0 jinja2-2.11.2 jsonschema-3.2.0 jupyter-1.0.0 jupyter-client-6.1.3 jupyter-console-6.1.0 jupyter-core-4.6.3 mistune-0.8.4 nbconvert-5.6.1 nbformat-5.0.6 notebook-6.0.3 packaging-20.3 pandocfilters-1.4.2 parso-0.7.0 pickleshare-0.7.5 prometheus-client-0.7.1 prompt-toolkit-3.0.5 ptyprocess-0.6.0 pygments-2.6.1 pyparsing-2.4.7 pyrsistent-0.16.0 pyzmq-19.0.0 qtconsole-4.7.3 qtpy-1.9.0 terminado-0.8.3 testpath-0.4.4 tornado-6.0.4 traitlets-4.3.3 wcwidth-0.1.9 webencodings-0.5.1 widgetsnbextension-3.5.1

 

Yes, that is a pretty long list of packages required for Jupyter!

Once the installation is complete, you can start Jupyter using the following command:

$ jupyter notebook --allow-root
[I 20:44:06.979 NotebookApp] The port 8888 is already in use, trying another port.
[I 20:44:06.982 NotebookApp] Serving notebooks from local directory: /home/username/code/quant
[I 20:44:06.982 NotebookApp] The Jupyter Notebook is running at:
[I 20:44:06.982 NotebookApp] http://localhost:8889/?token=94cc0863965bf1d2751ad8a7a4b26d08f1b9f0be4f380c9a
[I 20:44:06.982 NotebookApp] or http://127.0.0.1:8889/?token=94cc0863965bf1d2751ad8a7a4b26d08f1b9f0be4f380c9a
[I 20:44:06.982 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 20:44:06.987 NotebookApp]

To access the notebook, open this file in a browser:
file:///home/username/.local/share/jupyter/runtime/nbserver-8021-open.html
Or copy and paste one of these URLs:
http://localhost:8889/?token=94cc0863965bf1d2751ad8a7a4b26d08f1b9f0be4f380c9a
or http://127.0.0.1:8889/?token=94cc0863965bf1d2751ad8a7a4b26d08f1b9f0be4f380c9a
[W 20:44:09.233 NotebookApp] 404 GET /api/kernels/7f717081-a79b-4170-b5be-c7e548ea28c7/channels?session_id=a53b179c3d2c4deda89418767c9bb6b9 (127.0.0.1): Kernel does not exist: 7f717081-a79b-4170-b5be-c7e548ea28c7
[W 20:44:09.249 NotebookApp] 404 GET /api/kernels/7f717081-a79b-4170-b5be-c7e548ea28c7/channels?session_id=a53b179c3d2c4deda89418767c9bb6b9 (127.0.0.1) 19.55ms referer=None

You can then access your Jupyter notebooks at http://localhost:8889 (the server picked port 8889 because 8888 was already in use).

Hope this helps.

How to update or insert field in Elasticsearch using Python

Updates and inserts are done through the same API. The most useful fact is that it behaves like an upsert: if a field exists it is updated, and if it does not exist it is inserted.
My first document is shown below.

import requests
import json

uri='http://localhost:9200/book/_doc/1'
headers1 ={'Content-Type': 'application/json'}
response = requests.get(uri,headers=headers1)
print("*************************************")
print("Accessing document before update : ", response.text)

Terminal output

Accessing document before update :  {
      "isbn": "9781593275846",
      "title": "Eloquent JavaScript, Second Edition",
      "subtitle": "A Modern Introduction to Programming",
      "author": "Marijn Haverbeke",
      "published": "2014-12-14T00:00:00.000Z",
      "publisher": "No Starch Press",
      "pages": 472,
      "description": "JavaScript lies at the heart of almost every modern web application, from social apps to the newest browser-based games. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications.",
      "website": "http://eloquentjavascript.net/"
    }

Now let us look at the code to update and insert fields in the document. Please note that there are multiple ways of doing an update / insert.

import requests
import json
# Insert one field in the document
uri='http://localhost:9200/book/_update/1'
headers1 ={'Content-Type': 'application/json'}
query = json.dumps(
    {
    "script" : "ctx._source.price = 365.65"
    }
)
response = requests.post(uri,headers=headers1,data=query)
print("*************************************")
print(" Output", response.text)

# Insert two fields in the document
uri='http://localhost:9200/book/_update/1'
headers1 ={'Content-Type': 'application/json'}
query = json.dumps(
{
  "doc": {
    "summary":"summary3",
    "coauth":"co author name"
  }
}
)
response = requests.post(uri,headers=headers1,data=query)
print("*************************************")
print(" Output", response.text)

# Insert a nested field in the document
uri='http://localhost:9200/book/_update/1'
headers1 ={'Content-Type': 'application/json'}
query = json.dumps(
    {
    "doc": {
        "country":{
        "continent":"Asia",
        "code" : 91
        },
        "coauthor_two":"Another co-author"
    }
    }
)
response = requests.post(uri,headers=headers1,data=query)
print("*************************************")
print(" Output", response.text)

# Update an existing field
uri='http://localhost:9200/book/_update/1'
headers1 ={'Content-Type': 'application/json'}
query = json.dumps(
{
  "doc": {
    "title":"Eloquent JavaScript, Second Edition - Updated using API from Python Script"
  }
}
)
response = requests.post(uri,headers=headers1,data=query)
print("*************************************")
print(" Output", response.text)

Once we have run the above script, let us look at the document again by running the program below or by checking Kibana.

import requests
import json
uri='http://localhost:9200/book/_source/1'
headers1 ={'Content-Type': 'application/json'}
response = requests.get(uri,headers=headers1)
print("*************************************")
print("Accessing document After update : ", response.text)

Output

Accessing document After update : {
    "isbn" : "9781593275846",
    "title" : "Eloquent JavaScript, Second Edition - Updated using API from Python Script",
    "subtitle" : "A Modern Introduction to Programming",
    "author" : "Marijn Haverbeke",
    "published" : "2014-12-14T00:00:00.000Z",
    "publisher" : "No Starch Press",
    "pages" : 472,
    "description" : "JavaScript lies at the heart of almost every modern web application, from social apps to the newest browser-based games. Though simple for beginners to pick up and play with, JavaScript is a flexible, complex language that you can use to build full-scale applications.",
    "website" : "http://eloquentjavascript.net/",
    "price" : 365.65,
    "summary" : "summary3",
    "coauth" : "co author name",
    "coauthor_two" : "Another co-author",
    "country" : {
      "continent" : "Asia",
      "code" : 91
    }
}

Elasticsearch API Search document queries

Elasticsearch has a very detailed search API, but it is a bit different for someone with an RDBMS and SQL query background. Here are some sample queries.

Search all

GET /ecomm/_search
{
    "query": {
        "match_all": {}
    }
}

Search for specific key

GET /twitter/_search
{
    "query": {
        "match": { "user":"ssss"}
    }
}

Search using URL

GET /ecomm/_search?q=Apple

Detailed search for nested value

GET /ecomm/_search
{
    "query": {
        "match" : {
            "productBaseInfoV1.productId": "MOBFKCTSYAPWYFJ5"
        }
    }
}

Search with more parameters

POST /ecomm/_search
{
    "query": {
        "match" : {
            "productBaseInfoV1.productId": "MOBFKCTSYAPWYFJ5"
        }
    },
    "size": 1,
    "from": 0,
    "_source": [ "productBaseInfoV1.productId", "productBaseInfoV1.title", "imageUrls","productDescription" ]
}
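
These queries can also be sent from a Python script with requests, in the same style as the examples elsewhere in this post; a minimal sketch, assuming the ecomm index above exists on localhost:9200:

import requests
import json

uri = 'http://localhost:9200/ecomm/_search'
headers1 = {'Content-Type': 'application/json'}
query = json.dumps({
    "query": {
        "match": {"productBaseInfoV1.productId": "MOBFKCTSYAPWYFJ5"}
    },
    "size": 1
})
# _search accepts the query JSON as a POST body.
response = requests.post(uri, headers=headers1, data=query)
print(response.json())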

Accessing Elasticsearch API from Python Script

Elasticsearch provides an easy-to-use API that can be accessed from Kibana, Postman, a browser, and curl. You can read here how to access the Elasticsearch API from these options.

In this post we will look at a very simple example of accessing the Elasticsearch API from Python. Here is the example along with its results.

import requests
import json

uri='http://localhost:9200/_cat/indices'
headers1 ={'Content-Type': 'application/json'}
response = requests.get(uri,headers=headers1)
print("this is : ", response.text)

uri='http://localhost:9200/twitter/_doc/7'
headers1 ={'Content-Type': 'application/json'}
response = requests.get(uri,headers=headers1)
print("*************************************")
print("Accessing document before creating : ", response.text)

uri='http://localhost:9200/twitter/_doc/7'
headers1 ={'Content-Type': 'application/json'}
query = json.dumps({
    "query": {
            "user" : "NewUser",
            "post_date" : "2009-11-15T14:12:12",
            "message" : "trying out Elasticsearch from python"
            }
        })
response = requests.put(uri,headers=headers1,data=query)
print("*************************************")
print("Document is created ", response.text)

uri='http://localhost:9200/twitter/_doc/7'
headers1 ={'Content-Type': 'application/json'}
response = requests.get(uri,headers=headers1)
print("*************************************")
print("Accessing newly created document: ", response.text)

Here is the output


$ python3 second.py 
/usr/lib/python3/dist-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.25.7) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
this is :  yellow open ecommpediatest               R6WypBc2RtCS_ITA40_rFw 1 1    0 0    283b    283b
yellow open test-index                   OeILxjmFRhODhztu4GgE_w 1 1    1 0   4.5kb   4.5kb
yellow open twitter                      BUmjGbLpTNCbnDIPxht9vA 1 1   13 4  24.8kb  24.8kb
yellow open bookindex                    RKq8oJKvRb2HQHxuPZdEbw 1 1    1 0   5.4kb   5.4kb
green  open .kibana_task_manager_1       pc_LXegKQWu9690vT1Z-pA 1 0    2 1  16.9kb  16.9kb
green  open kibana_sample_data_ecommerce FkD1obNSSK6mfLDsy9ILPQ 1 0 4675 0   4.9mb   4.9mb
yellow open facebook                     4Wzax6UhThm5rXi03PiG7w 1 1    2 0     7kb     7kb
green  open .apm-agent-configuration     sxoXMmsoS4mRPyjcaPByzw 1 0    0 0    283b    283b
green  open .kibana_1                    pK7pO-SCSBORonZLDi8Vew 1 0   60 2 945.9kb 945.9kb
yellow open twitter1                     3iMzcYoJR3qY9JiM2sY8_g 1 1    1 0   4.5kb   4.5kb

*************************************
Accessing document before creating :  {"_index":"twitter","_type":"_doc","_id":"7","found":false}
*************************************
Document is created  {"_index":"twitter","_type":"_doc","_id":"7","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":28,"_primary_term":2}
*************************************
Accessing newly created document:  {"_index":"twitter","_type":"_doc","_id":"7","_version":1,"_seq_no":28,"_primary_term":2,"found":true,"_source":{"query": {"user": "NewUser", "post_date": "2009-11-15T14:12:12", "message": "trying out Elasticsearch from python"}}}

Note that the new document's _source contains the "query" wrapper exactly as it was sent, because Elasticsearch stores the PUT body verbatim. We will have a look at complex queries from Python at a later stage.

How to use Elasticsearch API

Elasticsearch exposes REST APIs that are used by the UI components and can be called directly to configure and access Elasticsearch features.

Show current indices

http://localhost:9200/_cat/indices

Output

yellow open test-index                   OeILxjmFRhODhztu4GgE_w 1 1    1 0   4.5kb   4.5kb
yellow open twitter                      BUmjGbLpTNCbnDIPxht9vA 1 1    1 0   3.7kb   3.7kb
yellow open bookindex                    RKq8oJKvRb2HQHxuPZdEbw 1 1    1 0   5.4kb   5.4kb
green  open kibana_sample_data_ecommerce FkD1obNSSK6mfLDsy9ILPQ 1 0 4675 0   4.9mb   4.9mb
green  open .kibana_task_manager_1       pc_LXegKQWu9690vT1Z-pA 1 0    2 1  16.9kb  16.9kb
yellow open facebook                     4Wzax6UhThm5rXi03PiG7w 1 1    1 0   3.5kb   3.5kb
green  open .apm-agent-configuration     sxoXMmsoS4mRPyjcaPByzw 1 0    0 0    283b    283b
green  open .kibana_1                    pK7pO-SCSBORonZLDi8Vew 1 0   60 2 949.5kb 949.5kb

This and the commands mentioned below can be run from Kibana, Postman, a browser, or cURL.

Curl

$ curl "localhost:9200/_cat/indices"
yellow open ecommpediatest               R6WypBc2RtCS_ITA40_rFw 1 1    0 0    283b    283b
yellow open test-index                   OeILxjmFRhODhztu4GgE_w 1 1    1 0   4.5kb   4.5kb
yellow open twitter                      BUmjGbLpTNCbnDIPxht9vA 1 1   11 5  26.9kb  26.9kb
yellow open bookindex                    RKq8oJKvRb2HQHxuPZdEbw 1 1    1 0   5.4kb   5.4kb
green  open .kibana_task_manager_1       pc_LXegKQWu9690vT1Z-pA 1 0    2 1  16.9kb  16.9kb
green  open kibana_sample_data_ecommerce FkD1obNSSK6mfLDsy9ILPQ 1 0 4675 0   4.9mb   4.9mb
green  open .apm-agent-configuration     sxoXMmsoS4mRPyjcaPByzw 1 0    0 0    283b    283b
yellow open facebook                     4Wzax6UhThm5rXi03PiG7w 1 1    2 0     7kb     7kb
green  open .kibana_1                    pK7pO-SCSBORonZLDi8Vew 1 0   60 2 945.9kb 945.9kb
yellow open twitter1                     3iMzcYoJR3qY9JiM2sY8_g 1 1    1 0   4.5kb   4.5kb


How to create a new index

PUT /ecommpediatest

The same can be achieved with the create-new-document command:

PUT /twitter1/_doc/1
{
  "user" : "kimchy updated",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

Creating or updating records

------------------- Get records
GET twitter/_doc/1
HEAD twitter/_doc/1
GET twitter/_source/10
HEAD twitter/_source/1

------------------- Update API
PUT /twitter/_doc/1
{
  "user" : "kimchy",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

POST /twitter/_doc/2
{
  "user" : "tom",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

PUT /twitter/_create/2
{
  "user" : "Jerry",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

POST /twitter/_create/
{
  "user" : "cartoon",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

------------------- Delete
DELETE /twitter/_doc/1

------------------- Create
PUT /twitter/_doc/1
{
  "user" : "one",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

POST /twitter/_doc/
{
  "user" : "one",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

PUT /twitter/_create/aa
{
  "user" : "xxx",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

POST /twitter/_create/ss
{
  "user" : "ssss",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch"
}

------------------- Search
GET /_search
{
  "query": {
    "ids" : {
      "values" : ["1", "ss", "100"]
    }
  }
}

GET /twitter/_search
{
  "query": {
    "ids" : {
      "values" : [11, "ss"]
    }
  }
}

GET /twitter/_search
{
  "query": {
    "match_all": {}
  }
}

GET /twitter/_search
{
  "query": {
    "match": { "user":"ssss" }
  }
}

How to Install and Configure Kibana on Ubuntu

A picture is worth a thousand log lines. Kibana gives you the freedom to select the way you give shape to your data. Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack so you can do anything from tracking query load to understanding the way requests flow through your apps.
Install Kibana
The packages are signed with the Elasticsearch Signing Key. Download and install the public signing key:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

You may need to install the apt-transport-https package on Ubuntu before proceeding:

$sudo apt-get install apt-transport-https
Reading package lists... Done
Building dependency tree
Reading state information... Done
apt-transport-https is already the newest version (1.6.12).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Save the repository definition to /etc/apt/sources.list.d/elastic-7.x.list:

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

You can install the Kibana Debian package with:

sudo apt-get update
sudo apt-get install kibana

You can start and stop the Kibana service with the following commands:

sudo systemctl start kibana.service
sudo systemctl stop kibana.service

Once you have started the Kibana service, let us check its status:

$ sudo systemctl status kibana.service
● kibana.service - Kibana
Loaded: loaded (/etc/systemd/system/kibana.service; disabled; vendor preset: enabled)
Active: active (running) since Tue 2020-01-07 21:25:22 IST; 9s ago
Main PID: 6226 (node)
Tasks: 11 (limit: 4915)
CGroup: /system.slice/kibana.service
└─6226 /usr/share/kibana/bin/../node/bin/node /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml

Let us try to access the Kibana server at http://localhost:5601.

Now we need to make the following changes to the file located at /etc/kibana/kibana.yml:

server.port: 5601
server.host: "localhost"
elasticsearch.hosts: ["http://localhost:9200"]

Now let us try to access Kibana again at http://localhost:5601.
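
If you prefer checking from a script instead of the browser, Kibana also exposes a status API; a minimal sketch using requests, assuming the default port and the /api/status endpoint (the exact JSON layout varies by Kibana version):

import requests

# Kibana's status endpoint reports overall health as JSON.
response = requests.get('http://localhost:5601/api/status')
print(response.status_code)
print(response.json().get('status', {}).get('overall'))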

Here is a sample dashboard.

How to Install and Configure Elasticsearch on Ubuntu

Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. It is the central component of the Elastic Stack, a set of open source tools for data ingestion, enrichment, storage, analysis, and visualization, commonly referred to as the ELK Stack (after Elasticsearch, Logstash, and Kibana).

Elasticsearch is known for its simple REST APIs, distributed nature, speed, and scalability. You can use HTTP methods like GET, POST, PUT, and DELETE in combination with a URL to manipulate your data.

Prerequisites

Elasticsearch is developed in Java, so you need Java installed on your Ubuntu machine. Use the following command to check whether Java is installed, and which version:

$ java --version
openjdk 11.0.5 2019-10-15
OpenJDK Runtime Environment (build 11.0.5+10-post-Ubuntu-0ubuntu1.118.04)
OpenJDK 64-Bit Server VM (build 11.0.5+10-post-Ubuntu-0ubuntu1.118.04, mixed mode, sharing)

Now check whether the $JAVA_HOME variable is configured by printing its content with the following command. If it returns blank, you need to set it up. See the post How to set Java environment path $JAVA_HOME in Ubuntu for setting up $JAVA_HOME.

$ echo $JAVA_HOME
/usr/java/java-1.11.0-openjdk-amd6

Setting up Elasticsearch

The packages are signed with the Elasticsearch Signing Key. Download and install the public signing key:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

You may need to install the apt-transport-https package on Ubuntu before proceeding:

$sudo apt-get install apt-transport-https
Reading package lists... Done
Building dependency tree 
Reading state information... Done
apt-transport-https is already the newest version (1.6.12).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Now save the repository definition to /etc/apt/sources.list.d/elastic-7.x.list:

$ echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
deb https://artifacts.elastic.co/packages/7.x/apt stable main

You can install the Elasticsearch Debian package with:

sudo apt-get update 
sudo apt-get install elasticsearch

Configure Elasticsearch

Elasticsearch configuration files are located in the /etc/elasticsearch directory. There are two files:

  1. elasticsearch.yml configures the Elasticsearch server settings.
  2. logging.yml provides the configuration for logging.

Let us update elasticsearch.yml

sudo nano /etc/elasticsearch/elasticsearch.yml

There are multiple variables, but you need to update the following ones:

  • cluster.name: specifies the name of the cluster.
  • node.name: specifies the name of the server (node). Please note the double quotation marks around the value.

Please do not change network.host. You might see posts where people set network.host to 0.0.0.0 or 127.0.0.1. The cluster coordination algorithm changed in Elasticsearch 7.0 and, to be safe, it requires some specific configuration. This requirement is relaxed when you bind to localhost only, but if and when you change network.host, Elasticsearch enforces that you configure the cluster safely.

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: mycluster1
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: "node-1"
#
# Add custom attributes to the node:
#
#node.attr.rack: r1


...

# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
#network.host: 192.168.0.1
#
# Set a custom port for HTTP:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#

Launch Elasticsearch

You can start and stop Elasticsearch using the following commands:

sudo systemctl start elasticsearch.service
sudo systemctl stop elasticsearch.service

Testing Elasticsearch

You can check connectivity using curl. Install curl if you don't have it on your machine, then run the following command:

$ curl -X GET "http://localhost:9200/?pretty"
{
  "name" : "node-1",
  "cluster_name" : "mycluster1",
  "cluster_uuid" : "Q6GqVVJfRZO6KSHQ-pFbcQ",
  "version" : {
    "number" : "7.5.1",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "3ae9ac9a93c95bd0cdc054951cf95d88e1e18d96",
    "build_date" : "2019-12-16T22:57:37.835892Z",
    "build_snapshot" : false,
    "lucene_version" : "8.3.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
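
Beyond the version banner, the _cluster/health endpoint gives a quick summary of cluster state; a minimal sketch of checking it from Python with requests:

import requests

# _cluster/health summarizes status (green/yellow/red),
# node count, and shard allocation.
health = requests.get('http://localhost:9200/_cluster/health').json()
print(health['status'], health['number_of_nodes'])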

How to install TensorFlow on Windows

TensorFlow is very easy to install. Let us look at how to get started with TensorFlow.

pip install tensorflow
C:\Users\ABCDEFG>pip install tensorflow
Collecting tensorflow
  Downloading https://files.pythonhosted.org/packages/05/cd/c171d2e33c0192b04560
ce864c26eba83fed888fe5cd9ded661b2702f2ae/tensorflow-1.12.0-cp36-cp36m-win_amd64.
whl (45.9MB)
    71% |██████████████████████          | 32.6MB 119kB/s eta 0:01:52
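
Once the install finishes, a quick sanity check; a minimal sketch in the TensorFlow 1.x style, matching the 1.12.0 wheel being downloaded above:

import tensorflow as tf

print(tf.__version__)  # should print 1.12.0 for the wheel above

# Classic TF 1.x hello world: build a graph node, then evaluate it in a session.
hello = tf.constant('Hello, TensorFlow!')
with tf.Session() as sess:
    print(sess.run(hello))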