How to crop image using Python

Cropping an image is easy with the Python library Pillow. The following simple program crops an image into the required parts.

The most important part of cropping is knowing the pixel dimensions of your image. Once you know them, you can crop the required portion using the .crop method of an image object from Pillow's Image module.

Sample program

Please note that crop takes a tuple of four values as input. These four values represent the left, upper, right, and lower pixel coordinates of the crop box.
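The crop box described above can be sketched as follows. This is a minimal example, not the original program: it assumes a blank stand-in image created in memory (with the same dimensions as the example output further down) instead of a photo read from disk, and the variable names are only illustrative.

```python
from PIL import Image

# Stand-in for a real photo (assumption): a blank image of the same size
# as the one used in this article.
im = Image.new("RGB", (1920, 1318), "white")
width, height = im.size

# crop takes a (left, upper, right, lower) box in pixel coordinates.
left_half = im.crop((0, 0, width // 2, height))
right_half = im.crop((width // 2, 0, width, height))
top_right = im.crop((width // 2, 0, width, height // 2))

for part in (im, left_half, right_half, top_right):
    print("im.size", part.size)
```

To work on a real file, the Image.new call would be replaced with Image.open on your own image path.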

Original image

Left Half of the image

Right Half of the image

Top Right corner of the image

Here is the output of the program. Please note the pixel size of each cropped image.

$ python3.6 
*** Program Started ***
im.size (1920, 1318)
im.size (960, 1318)
im.size (960, 1318)
im.size (960, 659)
*** Program Ended ***

How to compress images using Python

The Python library Pillow can be used very effectively to compress images. While researching this, I found that .JPG files can be compressed very effectively, but the same approach does not work well with .PNG files. Here is a sample program to reduce the file size of an image.
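One way to reduce JPEG file size with Pillow is to save with a lower quality setting. The sketch below is an assumption about how such a program might look, not the original code: it builds a synthetic test image in memory, and the helper name compress_jpeg and the quality values are hypothetical choices.

```python
import io
from PIL import Image

def compress_jpeg(im, quality=30):
    """Return the image encoded as JPEG bytes at the given quality."""
    buf = io.BytesIO()
    im.convert("RGB").save(buf, format="JPEG", quality=quality, optimize=True)
    return buf.getvalue()

# Synthetic test image (assumption): a deterministic pixel pattern stands in
# for a real photo so the example runs without any input file.
width, height = 256, 256
pixels = bytes(
    (x * y + c * 50) % 256
    for y in range(height)
    for x in range(width)
    for c in range(3)
)
im = Image.frombytes("RGB", (width, height), pixels)

high = compress_jpeg(im, quality=95)
low = compress_jpeg(im, quality=30)
print("High quality size :", len(high))
print("Low quality size  :", len(low))

# The pixel dimensions are unchanged; only the encoded size shrinks.
print("Output dimensions :", Image.open(io.BytesIO(low)).size)
```

PNG is a lossless format, so the quality setting used here does not apply to it, which is consistent with the small change seen for .PNG files.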

Output of the program when using a .JPG file as input and output

$ python3.6 
*** Program Started ***
Input file size : (5456, 3632)
Input file name : 05_compress_image_01_input.jpg
Input Image Size : 1611664
Output file size : (5456, 3632)
Output file name : 05_compress_image_01_output.jpg
Output Image Size : 443479
*** Program Ended ***

Output of the program when using a .PNG file as input and output

$ python3.6 
*** Program Started ***
Input file size : (1920, 1282)
Input file name : 05_compress_image_01_input.png
Input Image Size : 3683320
Output file size : (1920, 1282)
Output file name : 05_compress_image_01_output.png
Output Image Size : 3619363
*** Program Ended ***

As you can see, when using .PNG files for input and output there is hardly any change in file size (the output is about 98% of the original). With .JPG files, however, the output file is roughly 27.5% of the original file size. Your percentage reduction may differ depending on the file you use.

Please note that the input and output image dimensions are identical: the number of pixels stays the same.

How to resize image using Python

The Python library Pillow can be used to resize images. Please note that resizing is not the same as compressing an image. Reducing the number of pixels can reduce the file size in KBs; however, the reduction may not be significant. If you need to compress an image, please check How to compress images using Python.

While resizing an image, you need to maintain the aspect ratio; otherwise the image may get distorted. We will see an example of this.

Here is a sample program to resize an image.
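A resize that keeps the aspect ratio can be sketched by scaling both dimensions by the same factor. This is a minimal example under assumptions: it uses a blank stand-in image rather than a file from disk, and the scale factor of 0.5 is chosen to match the sizes in the output shown in this article.

```python
from PIL import Image

# Stand-in for a real photo (assumption).
im = Image.new("RGB", (1920, 1318), "white")
print("im.size", im.size)

# Scale width and height by the same factor so the ratio is preserved.
scale = 0.5
new_size = (int(im.width * scale), int(im.height * scale))

# resize returns a new image; the original is left untouched.
resized = im.resize(new_size)
print("im.size", resized.size)
```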

Here is the output

$ python3.6 
*** Program Started ***
im.size (1920, 1318)
im.size (960, 659)
*** Program Ended ***

Here we have maintained the aspect ratio, so the images will look similar.


Now let's look at resizing without maintaining the aspect ratio.
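Resizing to an arbitrary target size can be sketched like this. Again, this is only an illustration with a blank stand-in image; the target of (500, 480) is taken from the output shown in this article, and the ratio variables are hypothetical names added to make the change visible.

```python
from PIL import Image

# Stand-in for a real photo (assumption).
im = Image.new("RGB", (1920, 1318), "white")

# Force an arbitrary size without regard to the original proportions.
stretched = im.resize((500, 480))

original_ratio = im.width / im.height
new_ratio = stretched.width / stretched.height
print("im.size", im.size, "ratio", round(original_ratio, 2))
print("im.size", stretched.size, "ratio", round(new_ratio, 2))
```

The two ratios differ, which is what shows up visually as a squeezed or stretched picture.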

Output of the program

$ python3.6 
*** Program Started ***
im.size (1920, 1318)
im.size (500, 480)
*** Program Ended ***

You can see the distortion caused by changing the aspect ratio.



Size comparison 

How to add text on image using Python

As part of image processing, we sometimes need to write text on an image file. Pillow, which is a fork of PIL, is a Python library that makes this kind of image processing activity very easy.

Let us have a look at adding simple text to an image file.

Please note that the text location is determined by the parameter position. It is a tuple with two values that represent the x and y coordinates. In the example below I have used position = (50, 50).

If you are wondering what these values represent, they are pixel coordinates with (0,0) being the top left corner. If your image size is (640, 480) and you want the text to start exactly at the center, you need to provide the position as (320, 240). Hope this clarifies the parameter.
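The position parameter can be tried out with a short sketch like the one below. It is not the original program: it assumes a blank 640 x 480 image created in memory, uses Pillow's built-in default font, and the text string is just a placeholder.

```python
from PIL import Image, ImageDraw

# Blank canvas standing in for a real image (assumption).
im = Image.new("RGB", (640, 480), "white")
draw = ImageDraw.Draw(im)

# (x, y) in pixels, measured from the top left corner (0, 0).
position = (50, 50)
draw.text(position, "Hello, Pillow!", fill="black")

# im.save("output.png")  # uncomment to write the result to disk
```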

Here is the output

As you can see, the added text does not look good since we have not applied any formatting to it. Now let us look at formatted text on an image. Here is the program with additional parameters for the font.

Here we have added two additional parameters, font and color. You can place your desired font in the font directory and use it for adding text to the image file.
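Formatted text with a font and a color might look like the sketch below. The font path is a hypothetical placeholder (point it at any .ttf file you have placed in your font directory), and the load_font helper with its fallback to the built-in font is an addition for this example so that it still runs when the file is missing.

```python
from PIL import Image, ImageDraw, ImageFont

# Hypothetical font location; replace with a real .ttf file of your choice.
FONT_PATH = "font/DejaVuSans.ttf"

def load_font(path, size):
    """Load a TrueType font, falling back to Pillow's built-in font."""
    try:
        return ImageFont.truetype(path, size)
    except (OSError, ImportError):
        # Font file not found, or FreeType support is unavailable.
        return ImageFont.load_default()

im = Image.new("RGB", (640, 480), "white")
draw = ImageDraw.Draw(im)

font = load_font(FONT_PATH, 40)
color = (0, 0, 255)  # RGB tuple: blue

draw.text((50, 50), "Hello, Pillow!", font=font, fill=color)
```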

Note: I tried using this for non-English text, e.g. Devanagari text, but it did not work.

Here is the output.

How to create blank image using Pillow, Python

Sometimes we need to create a blank image file during program execution, so that the required text or anything else can be added to it during processing. The following sample program creates a blank image using PIL in Python.

You can choose the image size and color as per your requirement.
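Creating a blank image comes down to a single call to Image.new. The size and color below are arbitrary values picked for this sketch, not the ones from the original program.

```python
from PIL import Image

size = (640, 480)      # width, height in pixels
color = (255, 255, 0)  # RGB tuple: yellow

# Every pixel of the new image is filled with the given color.
blank = Image.new("RGB", size, color)
print(blank.size, blank.mode, blank.getpixel((0, 0)))

# blank.save("blank.png")  # uncomment to write the image to disk
```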

Here is the output image

How to read image using Pillow, Python and get image attributes

Pillow is the friendly PIL fork. PIL is the Python Imaging Library. This is the first article in a series of image processing articles using Python.

In this article we will see how to read an image file using Pillow and get its basic attributes.

Here is the simplest program to read an image file using Pillow and get its basic attributes.
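A sketch of such a program is shown below. To stay self-contained it first writes a small JPEG to a temporary directory and then reads it back; that setup step and the file name are assumptions made for this example, and with your own file you would only need the Image.open part.

```python
import os
import tempfile
from PIL import Image

# Setup (assumption): create a small JPEG so there is something to read.
path = os.path.join(tempfile.mkdtemp(), "01_read_image.jpg")
Image.new("RGB", (192, 128), "gray").save(path)

# Open the file and collect its basic attributes.
with Image.open(path) as im:
    attributes = {
        "format": im.format,
        "size": im.size,
        "mode": im.mode,
        "filename": im.filename,
        "width": im.width,
        "height": im.height,
        "info": im.info,
    }

for name, value in attributes.items():
    print(name, ":", value)
```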

Output of this program

*** Program Started ***
im object: <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1920x1285 at 0x7F7FB79F6518>
format : JPEG
size : (1920, 1285)
mode : RGB
filename : /home/conquistador/code/github/python-01-utilities/image/input/01_read_image.jpg
width : 1920
height : 1285
info : {'jfif': 257, 'jfif_version': (1, 1), 'jfif_unit': 0, 'jfif_density': (1, 1), 'progressive': 1, 'progression': 1}
*** Program Ended ***


How to check file size in Python

Many times while processing files in Python, we need to know the file size in bytes/KBs/MBs. You can get the file size using multiple methods; the following are two simple methods that use the os module.
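The two os-based approaches are likely os.path.getsize and os.stat, and they can be sketched as follows. The helper names and the temporary test files are assumptions made for this example; the original program used an image and an empty file as inputs.

```python
import os
import tempfile

def size_with_getsize(path):
    """File size in bytes via os.path.getsize."""
    return os.path.getsize(path)

def size_with_stat(path):
    """File size in bytes via os.stat."""
    return os.stat(path).st_size

# Setup (assumption): one non-empty file and one empty file.
tmp_dir = tempfile.mkdtemp()
non_empty = os.path.join(tmp_dir, "data.bin")
empty = os.path.join(tmp_dir, "empty.bin")
with open(non_empty, "wb") as f:
    f.write(b"x" * 1024)
open(empty, "wb").close()

for path in (non_empty, empty):
    size = size_with_getsize(path)
    print("Input file is empty" if size == 0 else "Input file is not empty")
    print("File size (in Bytes) :", size)
    print("File size (in Bytes) :", size_with_stat(path))
```

Dividing the byte count by 1024 gives the size in KBs, and dividing by 1024 again gives MBs.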

I have run this program for two inputs: one is an image and the other is an empty file.

Here is the output of the program

$ python3.6 
*** Program Started ***
Input file is not empty
File size (in Bytes) : 147162
File size (in Bytes) : 147162
Input file is empty
File size (in Bytes) : 0
File size (in Bytes) : 0
*** Program Ended ***

Actual file size

I have added a check to see whether the file is empty; you might need to check the file size before doing any processing on files.

Generating data for Linear Regression using NumPy

We have already seen how to generate random numbers in a previous article; here we will look at how to generate data in a specific format for linear regression.

To test linear regression, we need one data set that has a somewhat linear relationship and one set of random data. The code below generates data with a linear relation as well as random data using Python and NumPy. I have provided graphs which will help you understand the data created by these programs.
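One possible way to generate both kinds of data is sketched below. The slope, intercept, noise level, sample count, and seed are all arbitrary assumptions for illustration and are not taken from the original program.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed so runs are repeatable
n = 100
x = np.arange(n, dtype=float)

# Data with a linear trend: a straight line plus Gaussian noise.
slope, intercept = 2.0, 5.0
y_trend = slope * x + intercept + rng.normal(0.0, 10.0, size=n)

# Data without any trend: uniform random values unrelated to x.
y_random = rng.uniform(0.0, 200.0, size=n)

# The correlation with x should be close to 1 for the trend data
# and close to 0 for the random data.
corr_trend = np.corrcoef(x, y_trend)[0, 1]
corr_random = np.corrcoef(x, y_random)[0, 1]
print("Correlation (trend)  :", round(corr_trend, 3))
print("Correlation (random) :", round(corr_random, 3))
```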

Data with Linear Trend for Linear Regression

Data without any Trend for Linear Regression

You can use this as input data while training your model.

Generating Random Numbers With NumPy

Many times we need some data for testing, or we simply need some random numbers. NumPy is very effective at generating random integers, floats, or random values between 0 and 1. You can draw uniformly distributed (pseudo-random) values as well as values from a normal distribution.

The following program shows multiple methods of creating random numbers for use in a program.
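A few of these methods are sketched below using NumPy's Generator interface. The ranges, sample sizes, and seed are arbitrary assumptions, and the original program may have used different functions (for example the older np.random.randint style).

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so runs are repeatable

# Random integers in the half-open range [1, 100).
integers = rng.integers(low=1, high=100, size=5)

# Random floats in [0, 1).
unit_floats = rng.random(size=5)

# Random floats in [10, 20), uniformly distributed.
floats = rng.uniform(low=10.0, high=20.0, size=5)

# Values drawn from a normal distribution with mean 0 and std dev 1.
normal = rng.normal(loc=0.0, scale=1.0, size=5)

print("Integers    :", integers)
print("Unit floats :", unit_floats)
print("Floats      :", floats)
print("Normal      :", normal)
```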

Brief History of Machine Learning

The term “Machine Learning” was coined by Arthur Samuel in 1959 while at IBM.

Brief History of ML

Date Details
1950 Alan Turing creates the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human.
1952 Arthur Samuel wrote the first computer learning program. The program was the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program.
1957 Frank Rosenblatt designed the first neural network for computers (the perceptron)
1967 The “nearest neighbor” algorithm was written, allowing computers to begin using very basic pattern recognition. This could be used to map a route for traveling salesmen, starting at a random city but ensuring they visit all cities during a short tour.
1979 Students at Stanford University invent the “Stanford Cart” which can navigate obstacles in a room on its own.
1981 Gerald Dejong introduces the concept of Explanation Based Learning (EBL), in which a computer analyses training data and creates a general rule it can follow by discarding unimportant data.
1985 Terry Sejnowski invents NetTalk, which learns to pronounce words the same way a baby does.
1997 IBM’s Deep Blue beats the world champion at chess.
2006 Geoffrey Hinton coins the term “deep learning” to explain new algorithms that let computers “see” and distinguish objects and text in images and videos.
2008 DJ Patil and Jeff Hammerbacher coined the term “Data Scientist”
2011 IBM’s Watson beats its human competitors at Jeopardy.
2012 Google’s X Lab develops a machine learning algorithm that is able to autonomously browse YouTube videos to identify the videos that contain cats.
2014 Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals in photos to the same level as humans can.
2016 Google’s artificial intelligence algorithm beats a professional player at the Chinese board game Go, which is considered the world’s most complex board game and is many times harder than chess.


According to Michael I. Jordan, the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics. He also suggested the term data science as a placeholder for the overall field. Below is one of the most famous Venn diagrams for data science.

How is ML different from AI?

In the early days of AI, an increasing emphasis on the logical, knowledge-based approach caused a rift between AI and machine learning. By 1980, expert systems had come to dominate AI, and statistics was out of favor.

Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics and probability theory.[11] It also benefited from the increasing availability of digitized information, and the ability to distribute it via the Internet.

Here is another famous Venn diagram.

Hal Varian, Google’s chief economist, predicted in 2008 that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them. Data are becoming the new raw material of business: an economic input almost on a par with capital and labour.

Machine Learning is also the name of a peer-reviewed scientific journal, published since 1986.

Further reading

  2. A Very Short History Of Data Science
  3. Data, data everywhere