How to insert data from pandas to PostgreSQL

When we have done our data analysis using pandas and now need to store this analysis, we can use to_csv option. However if data is too big, it make sense to store this in database.
Let us have a look at two simple methods to store data into PostgreSQL database.

Using sqlalchemy

 

import pandas as pd
from sqlalchemy import create_engine

df =pd.read_csv('input_file.csv',parse_dates=['timestamp'],index_col=0)
#df.info()

engine = create_engine('postgresql://admin:admin@localhost:5432/ibnse')
df.to_sql('stk_1min1', engine,if_exists='append')

Using psycopg2

 

pandas as pd
import psycopg2
conn = psycopg2.connect(database='dbname', user="user", password="password", host="127.0.0.1", port="5432")
cur = conn.cursor()

df =pd.read_csv('input_file.csv',parse_dates=['timestamp'],index_col=0)
for index, row in df.iterrows():
    insertdata =  "('"+str(index)+ "','"+row[0]+"','"+str(row[1])+"','"+str(row[2])+"','"+str(row[3])+"','"+str(row[4])+"','"+str(row[5])+"','"+str(row[6])+"','"+str(row[7])+"','"+str(row[8])+"')"
    print("insertdata :",insertdata)

    try:
        cur.execute("INSERT INTO stk_1min1 values "+insertdata)
        print( "row inserted:", insertdata)
    except psycopg2.IntegrityError:
        print( "Row already exist ")
        pass
    except Exception as e:
        print( "some insert error:", e, "ins: ", insertdata)
    conn.commit()

Few points while using sqlalchemy

  • If table does not exists, it will get created.
  • If table exists and you want to append, you need to use if_exists='append',  Its wise choice to use this option in most of the cases.
  • When you use sqlalchemy, whole pandas dataframe will not be inserted even if you get unique index error for one single record.

How to read data from PostgreSQL to Pandas DataFrame

Usually for training and testing, we create pandas DataFrame from csv file but when we are working while large dataset or working with database stored in database, we need a way to fetch data into pandas DataFrame directly from database. In this article we will have a look at two methods for doing it.

Using sqlalchemy

 

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('postgresql://user:password@localhost:5432/dbName')

df = pd.read_sql_query("SELECT * FROM public.stk_1min1 where symbol = 'TCS'",con=engine)
df.head()

 

You can write query before using pd.read_sql_query function

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('postgresql://user:password@localhost:5432/dbName')

stk = 'TCS'
query = "SELECT * FROM public.stk_1min1 where symbol = "+"'"+stk+"'"

df = pd.read_sql_query(query,con=engine)
df.head()

Using psycopg2

 

import pandas as pd
import psycopg2
conn = psycopg2.connect(database='dbname', user="user", password="password", host="127.0.0.1", port="5432")

df = pd.read_sql_query("SELECT * FROM public.stk_1min1 where symbol = 'TCS'",con=conn,index_col=['index'])
df.head()

 

Similar to sqlalchemy example, you can write query before using read_sql_query function.