How to do image processing in Python: Step-by-Step Guide

published on 17 February 2024

Working with images is an integral part of many technology solutions today. Most developers would agree that processing images in Python can be challenging initially.

This article will provide a step-by-step guide to mastering image processing in Python. You'll learn the fundamentals, essential techniques, and even advanced methods to build real-world image processing applications.

We'll cover everything from setting up the Python environment to manipulating, enhancing, and segmenting images. You'll also see project ideas for classification, detection, and building an image search engine. By the end, you'll have a solid foundation for tackling common image processing tasks in Python.

Introduction to Image Processing with Python

Image processing refers to various techniques that allow computers to understand and modify digital images. It involves analyzing pixel information to perform operations like identifying objects, detecting edges, adjusting brightness/contrast, applying filters, recognizing text, etc.

Python is a popular language for image processing due to its extensive libraries, simple syntax, and active developer community. Key libraries like OpenCV, PIL/Pillow, scikit-image, and more enable you to work with images in Python.

Understanding the Basics of Image Processing

Image processing relies on analyzing pixel data from digital images to identify and modify elements within them. Key concepts include:

  • Image acquisition: Capturing or importing images via cameras, scanners etc.
  • Preprocessing: Transforming images before analysis (resizing, rotation, noise removal etc.).
  • Feature detection: Identifying pixels/regions of interest like edges, corners or objects.
  • Analysis: Extracting meaningful information from images using the detected features.
  • Manipulation: Transforming images based on the extracted information (filtering, morphing etc.).
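To make the idea of working on pixel data concrete, here is a minimal sketch that treats a grayscale image as a NumPy array and adjusts its brightness and contrast. The tiny array and the adjust() helper are illustrative assumptions, not part of any library.

```python
import numpy as np

# A tiny 2x3 grayscale "image": one 8-bit intensity (0-255) per pixel.
img = np.array([[0, 64, 128],
                [192, 255, 32]], dtype=np.uint8)

def adjust(image, contrast=1.0, brightness=0):
    """Scale pixel values by `contrast`, add `brightness`, clip to 0-255."""
    out = image.astype(np.float32) * contrast + brightness
    return np.clip(out, 0, 255).astype(np.uint8)

brighter = adjust(img, brightness=40)        # shift every pixel up by 40
higher_contrast = adjust(img, contrast=1.5)  # stretch values away from 0
```

Working in float and clipping before casting back avoids the silent wrap-around that uint8 arithmetic would otherwise produce for values above 255.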

The Advantages of Python in Image Processing

Python is a preferred language for image processing due to:

  • Extensive libraries like OpenCV, PIL/Pillow, scikit-image etc. offering specialized functionality.
  • Simple and readable code thanks to its clean syntax. Easy for beginners to adopt.
  • Vibrant developer community providing abundant code examples and troubleshooting support.
  • Interoperability with languages like C++ for performance-critical operations.
  • Rapid prototyping enabled by Python's interpreted nature.

Overview of Python Image Processing Libraries

Some key image processing libraries in Python include:

  • OpenCV: Comprehensive library with over 2500 algorithms ranging from facial recognition to shape analysis.
  • PIL/Pillow: Offers basic image handling and processing functionality.
  • scikit-image: Implements algorithms for segmentation, filtering, feature detection etc.
  • Mahotas: Specialized library for computer vision operations.
  • SimpleCV: Provides an easy interface to OpenCV for rapid prototyping.

With these mature libraries, Python makes an excellent choice for developing image processing and computer vision applications.

How do I start image processing in Python?

To get started with image processing in Python, follow these key steps:

Import Required Libraries

The main library used for image processing in Python is OpenCV (Open Source Computer Vision Library). Other useful libraries include scikit-image, Pillow, matplotlib, etc.

import cv2
import numpy as np
from skimage.io import imread
import matplotlib.pyplot as plt

Load the Image

Use imread() from scikit-image or cv2.imread() from OpenCV to load images into Python.

img = imread('image.jpg')

Perform Image Processing Techniques

There are many image processing techniques like blurring, sharpening, thresholding, filtering, edge detection etc. that can be applied.

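For example, to convert an image to grayscale. Because scikit-image's imread() returns channels in RGB order, use COLOR_RGB2GRAY here; images loaded with cv2.imread() are in BGR order and need COLOR_BGR2GRAY instead:

gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)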

Save/Display Result

Use matplotlib to display images. To save processed images, use cv2.imwrite().

plt.imshow(gray_img, cmap='gray') 
plt.show()

cv2.imwrite('gray_image.jpg', gray_img) 

This covers the basic workflow to load, process and visualize images in Python. Check out OpenCV and scikit-image documentation for more image processing operations.

Is image processing with Python easy?

Python makes image processing very accessible due to its extensive libraries and ready-made functions. For example, the OpenCV library provides ready-made functions for common image processing tasks like:

  • Image resizing and rotation
  • Blurring and sharpening
  • Edge detection
  • Object detection

You don't need to code these from scratch - just call the function and pass in your image. This makes development much faster compared to lower-level languages like C++.

Here's a simple example to resize an image with OpenCV in four lines of Python:

import cv2
img = cv2.imread('image.jpg')
resized = cv2.resize(img, (100, 100)) 
cv2.imwrite('resized.jpg', resized)

So while you still need some programming knowledge, Python and libraries like OpenCV, scikit-image and Pillow make image processing tasks straightforward for developers at any level.

The key benefits are:

  • Simple syntax and readability
  • Extensive libraries for common tasks
  • High-level functions instead of coding from scratch
  • Rapid prototyping and development

This makes Python a popular choice for computer vision and image processing.

What is the Python tool for image processing?

Pillow, the actively maintained fork of PIL, is one of the most widely used Python libraries for image processing. Here are some key things to know about Pillow:

  • Open-source library that builds on the now-discontinued PIL (Python Imaging Library)
  • Provides extensive support for different image formats like JPEG, PNG, GIF, BMP and TIFF
  • Useful for basic image manipulation tasks like resizing, cropping, rotating, blurring etc.
  • Has image enhancement capabilities like contrast adjustment, sharpening, color space conversions etc.
  • Supports creating thumbnails, applying filters, drawing shapes and text onto images
  • Integrates well with popular Python data analysis libraries like NumPy and SciPy

In summary, Pillow offers a versatile toolkit to load, manipulate and save images for various applications using Python. Its simple API, maturity as a library and integration with NumPy make it a convenient choice for developers looking to integrate image processing capabilities into their Python programs.
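As a rough illustration of the Pillow features listed above, the following sketch creates an image in memory (so no file is needed) and applies a thumbnail, a grayscale conversion, a blur and a contrast adjustment. The sizes and colors are arbitrary assumptions.

```python
from PIL import Image, ImageEnhance, ImageFilter

# Create a 400x300 solid-color RGB image instead of opening a file.
img = Image.new("RGB", (400, 300), color=(200, 120, 40))

# thumbnail() resizes in place and preserves the aspect ratio.
thumb = img.copy()
thumb.thumbnail((128, 128))

gray = img.convert("L")                                  # 8-bit grayscale
blurred = img.filter(ImageFilter.GaussianBlur(radius=2))  # Gaussian blur
punchy = ImageEnhance.Contrast(img).enhance(1.5)          # +50% contrast
```

Note that thumbnail() modifies the image it is called on, which is why the sketch works on a copy.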

Which algorithm is used for image processing in Python?

No single algorithm covers image processing; the right one depends on the task. In Python, algorithms are usually accessed through libraries, and some of the most popular options include:

  • SciPy - This scientific computing library contains modules for image processing like binary morphology, filtering, interpolation, etc. It is useful for tasks like image enhancement, restoration, and segmentation.

  • OpenCV - The OpenCV library is widely used for computer vision and image processing. It provides algorithms for tasks ranging from facial recognition to image stitching. Useful for object detection, classification, and tracking.

  • scikit-image - Also known as skimage, this library focuses specifically on image processing. It has tools for segmentation, denoising, feature extraction, registration and more. Easy to use and integrate into machine learning workflows.

  • Pillow - Pillow is a popular Python imaging library used for basic image manipulation like resizing, cropping, filtering, color space conversions etc. Handy for preparing images for input/output.

So in summary, SciPy and scikit-image are good for scientific image analysis while OpenCV focuses on computer vision. Pillow provides general utility functions for image handling. The choice depends on the specific task - classification, object recognition, enhancement etc. But all these libraries complement each other.

Preparing the Python Image Processing Environment

Installing Python and PIP for Image Processing

To get started with image processing in Python, you'll need to have Python and PIP (Python package manager) installed on your system. Here are step-by-step instructions for installation:

  1. Download the latest Python 3 release from python.org. Current releases of the libraries used here no longer support older interpreters such as 3.6, so prefer a recent version.
  2. Follow the installation wizard, customizing any options as desired. Make sure Python is added to your system's PATH.
  3. Open a new command prompt window and run pip --version to confirm PIP is installed with Python. If not, run python -m ensurepip --upgrade to install it.

Once Python and PIP are installed, you have the base environment ready for image processing libraries.

Installing Essential Python Image Processing Libraries

The main libraries we'll use are:

  • OpenCV - for core image processing operations
  • NumPy - provides multidimensional array data structures
  • SciPy - scientific computing routines, including the scipy.ndimage image processing module
  • Pillow - adds support for image file reading/writing

To install them:

  1. At the command prompt, run: pip install opencv-python
  2. Run: pip install numpy scipy
  3. Run: pip install pillow

This will download and install the latest versions of these important libraries.

Other useful optional libraries like scikit-image, Mahotas, SimpleITK can also be installed via PIP.
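To confirm that the installation worked, one option is to query the installed package versions from Python. This is a minimal sketch using only the standard library; the installed_versions() helper is a hypothetical name, and the package names are the pip names used above.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each pip package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

report = installed_versions(["opencv-python", "numpy", "scipy", "pillow"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```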

Importing Libraries for Image Processing in Python

Once the libraries are installed, we can import them into our Python scripts.

For example:

import cv2
import numpy as np
from PIL import Image
import scipy.ndimage

Note that cv2 is the actual module name of the OpenCV Python bindings, while np is a conventional alias for NumPy that keeps later code short.

The environment is now ready for loading images, applying filters, transformations and running analysis algorithms!

Fundamentals of Working with Images in Python

Python provides various libraries for working with images, such as OpenCV, PIL/Pillow, scikit-image, etc. This section will introduce some core concepts and techniques for handling images in Python.

Loading and Handling Images with OpenCV and PIL

To load an image in Python using OpenCV, we use the cv2.imread() function. For example:

import cv2
img = cv2.imread('image.jpg')

Similarly, with the Python Imaging Library (PIL), we use the Image.open() method:

from PIL import Image
img = Image.open('image.jpg')

cv2.imread() returns a NumPy array, while Image.open() returns a PIL Image object. Both give access to the image's properties and pixel data.

Some key attributes when working with loaded image data:

  • shape (NumPy array): Tuple of (height, width, channels)
  • size: (width, height) for a PIL Image; total number of elements for a NumPy array
  • dtype (NumPy array): Data type of the pixel values, typically uint8
  • getpixel() (PIL) / item() (NumPy): Get the value of a single pixel
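The difference in axis order between the two representations is easy to trip over, so here is a small sketch comparing them on the same in-memory image. The image dimensions and the marked pixel are arbitrary assumptions.

```python
import numpy as np
from PIL import Image

# Pillow sizes and coordinates are (x, y): 320 wide, 240 high.
pil_img = Image.new("RGB", (320, 240))
pil_img.putpixel((10, 20), (255, 0, 0))  # red pixel at x=10, y=20

# NumPy arrays are indexed [row, column], i.e. [y, x].
arr = np.asarray(pil_img)

pil_size = pil_img.size                  # (width, height)
arr_shape = arr.shape                    # (height, width, channels)
pil_value = pil_img.getpixel((10, 20))   # (x, y) order
arr_value = arr[20, 10]                  # [y, x] order
```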

Efficient Image Storing Techniques with OpenCV and PIL

To save an image to disk after processing, OpenCV provides cv2.imwrite():

cv2.imwrite('new_image.jpg', img) 

And with PIL:

img.save('new_image.jpg')

Some best practices for efficient image saving:

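  • Choose the format by image type: JPEG (lossy) suits photographs, PNG (lossless) suits graphics, masks and images that will be processed further
  • Adjust the quality parameter for the best compression/quality trade-off
  • Convert float arrays to 8-bit (for example, scale [0, 1] values to 0-255 and cast to uint8) before saving, since common formats such as JPEG store integer pixel values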

Visualizing Images with Matplotlib in Python

The Matplotlib library provides simple visualization of images using plt.imshow():

import matplotlib.pyplot as plt

plt.imshow(img)
plt.show() 

Some parameters that help enhance visualization:

  • cmap: Colormap for intensity values
  • interpolation: Algorithm for pixel interpolation

This allows inspection of images at various stages of processing pipelines.
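One pitfall when inspecting OpenCV images with Matplotlib is channel order: cv2.imread() returns BGR, while plt.imshow() interprets a three-channel array as RGB, so reds and blues appear swapped unless the channels are reversed first. A minimal NumPy sketch of the conversion, on a made-up 2x2 image:

```python
import numpy as np

# A 2x2 image that is pure blue in OpenCV's BGR channel order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

# Reversing the last axis converts BGR to RGB (and vice versa).
rgb = bgr[..., ::-1]

# Some libraries require contiguous memory rather than a reversed view.
rgb_contiguous = np.ascontiguousarray(rgb)
```

cv2.cvtColor(img, cv2.COLOR_BGR2RGB) performs the same conversion and returns a new contiguous array.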

Essential Image Manipulation Techniques in Python

Image processing is an important capability in Python, enabling tasks like resizing, cropping, rotating, and otherwise manipulating images. This section will cover some of the essential image manipulation techniques using popular Python libraries like OpenCV, PIL/Pillow, NumPy, and SciPy.

Image Resizing with Python Libraries

Resizing images is a common requirement in applications like creating thumbnails, fitting images to specific dimensions, or scaling for display purposes.

The OpenCV library provides simple methods like cv2.resize() to resize images. You can specify the output dimensions directly as (width, height):

import cv2

img = cv2.imread('image.jpg')
resized = cv2.resize(img, (100, 200)) 

The PIL/Pillow library offers Image.resize(), which takes the target size in pixels as (width, height). To scale by a percentage, compute the new size from the original dimensions:

from PIL import Image

img = Image.open('image.jpg')
resized = img.resize((100, 100))  # exact size in pixels
half = img.resize((img.width // 2, img.height // 2))  # 50% scale

Both libraries make image resizing straightforward in Python.

Cropping Images Using Python

Cropping extracts a region of interest from an image, a useful technique for focusing on key parts or removing unwanted areas.

NumPy array slicing provides an easy way to crop images loaded with OpenCV. Rows (y) come first, then columns (x). If img is a NumPy array, we can extract a 100x100 pixel square starting at x=50, y=50 like this:

cropped = img[50:150, 50:150] 

Alternatively, if img is a PIL Image, Pillow's Image.crop() method crops by pixel coordinates given as a (left, upper, right, lower) box:

box = (50, 50, 150, 150)
cropped = img.crop(box)  

This selects the same region as the NumPy slicing. Both approaches provide simple ways to implement cropping.
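The two approaches order their coordinates differently, which is easier to see with a non-square region. The sketch below crops the same 150x100 region both ways from a synthetic grayscale image and checks that the results match; the image contents are arbitrary.

```python
import numpy as np
from PIL import Image

# Synthetic 300x200 (width x height) grayscale image with varying pixel values.
arr = (np.arange(200 * 300) % 256).astype(np.uint8).reshape(200, 300)
pil_img = Image.fromarray(arr)

left, upper, right, lower = 50, 40, 200, 140

# NumPy: [rows, columns] = [y range, x range]. Slicing returns a view,
# so copy() is used to get an independent array.
np_crop = arr[upper:lower, left:right].copy()

# Pillow: a single (left, upper, right, lower) box in x, y order.
pil_crop = pil_img.crop((left, upper, right, lower))
```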

Image Rotation and Flipping Techniques

Rotating or flipping images may be required in applications like correcting orientations or generating augmented data.

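OpenCV provides the cv2.rotate() method for rotating images in 90 degree increments. For an arbitrary angle, build a rotation matrix with cv2.getRotationMatrix2D() and apply it with cv2.warpAffine():

rotated90 = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
h, w = img.shape[:2]
M = cv2.getRotationMatrix2D((w / 2, h / 2), 30, 1.0)  # center, angle in degrees, scale
rotated30 = cv2.warpAffine(img, M, (w, h))  # corners outside the original frame are cut off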

Similarly, Pillow offers Image.rotate(), which rotates counterclockwise by any angle, and Image.transpose() for flips and 90 degree rotations (here img is a PIL Image):

rotated90 = img.rotate(90)  
flipped = img.transpose(Image.FLIP_LEFT_RIGHT)

These functions enable flexible image rotation and flipping manipulations.

Overall, Python imaging libraries like OpenCV and PIL provide powerful yet easy to use tools for essential image processing techniques, from resizing and cropping to rotations, making them very useful for tasks like data augmentation and image correction.

Advanced Image Filtering and Enhancement in Python

Image processing techniques like filtering and enhancement allow you to manipulate images in Python to achieve various effects. This guide will demonstrate some advanced methods using the OpenCV library.

Image Blurring Techniques with OpenCV

Applying blur effects can be useful for reducing image noise. OpenCV provides several blurring techniques:

  • Averaging (box) filter - Replaces each pixel with the plain mean of its neighborhood via cv2.blur(). Easy to apply but can produce blocky, unnatural looking results.
  • Gaussian blur - Weights neighbors with a Gaussian kernel to produce more natural blurs. Kernel size and sigma control blur intensity. It smooths noise but also softens edges; an edge-preserving alternative is cv2.bilateralFilter().

Here is an example applying a 15 x 15 Gaussian blur in OpenCV Python:

import cv2

image = cv2.imread('image.jpg')
blurred = cv2.GaussianBlur(image, (15, 15), 0) 
cv2.imwrite('blurred.jpg', blurred)

The kernel size must be odd, and passing 0 as the sigma lets OpenCV derive it from the kernel size. Larger kernels give stronger smoothing.

Sharpening Images with Convolutional Filters

Sharpening brings images into better focus. Convolutional filters accentuate edges and fine details.

Some OpenCV sharpening filter options:

  • Unsharp masking - Boosts edge contrast for perceived sharpness.
  • Laplacian filters - Detects rapid changes in pixel values to emphasize edges.
  • High-pass filters - Retain high frequency details while suppressing lower frequencies.

Here's an example that sharpens with a 3x3 convolution kernel (the identity plus a negative Laplacian) using cv2.filter2D():

import cv2
import numpy as np

image = cv2.imread('image.jpg')
kernel = np.array([[0, -1, 0], 
                   [-1, 5,-1],
                   [0, -1, 0]])
sharpened = cv2.filter2D(image, -1, kernel)

This brings out finer details for improved clarity.

Edge Detection in Python Using Canny Algorithm

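The Canny algorithm is widely used for edge detection. It smooths the image to reduce noise, computes intensity gradients, thins edges with non-maximum suppression, and then applies hysteresis thresholding: gradients above the upper threshold are kept as edges, those below the lower threshold are discarded, and those in between are kept only if connected to a strong edge.

Here is Canny edge detection in OpenCV Python, with 100 and 200 as the lower and upper thresholds: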

import cv2

image = cv2.imread('image.jpg')
edges = cv2.Canny(image, 100, 200)
cv2.imwrite('canny_edges.jpg', edges)

This produces a clear edge map isolating prominent contours in the image.

Advanced filters like these enable effective image analysis and manipulation with OpenCV in Python.

Exploring Image Segmentation Techniques with Python

Image segmentation is an important technique in image processing and computer vision that involves partitioning an image into multiple segments. This allows easier analysis of the image contents by simplifying representation into something more meaningful and easier to analyze.

Python offers simple and powerful tools to perform image segmentation thanks to libraries like OpenCV, scikit-image, and others. In this section, we'll explore some of the popular image segmentation techniques and how to implement them in Python.

Applying Thresholding Techniques in Python

Thresholding is one of the simplest segmentation methods. It converts a grayscale image to a binary image by setting pixel values above a threshold to white and values below to black. This separates the image into foreground and background regions.

Here is an example using OpenCV's threshold function:

import cv2
import numpy as np

img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)  # load as single-channel grayscale
ret, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

We can also use adaptive thresholding, which calculates a separate threshold for each small region (here the mean of each pixel's 11x11 neighborhood minus a constant of 2), giving better results for images with varying illumination:

thresh_adapt = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, 
                                    cv2.THRESH_BINARY, 11, 2)

Segmentation with Watershed Algorithm in Python

The watershed algorithm treats an image like a topographic map, with pixel intensities representing heights. It then finds "catchment basins" and "watershed ridge lines" to segment the image.

We can use OpenCV's implementation in Python:

import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('coins.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU) 

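# Noise removal with morphological opening
kernel = np.ones((3,3),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)

# Sure background: dilate the foreground so everything outside it is certainly background
sure_bg = cv2.dilate(opening,kernel,iterations=3)
# Sure foreground: pixels far from any background pixel
dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
sure_fg = np.uint8(sure_fg)
# Unknown region: neither sure background nor sure foreground
unknown = cv2.subtract(sure_bg,sure_fg)

# Label the markers; shift by 1 so background is 1 and the unknown region is 0
ret, markers = cv2.connectedComponents(sure_fg)
markers = markers+1
markers[unknown==255] = 0

# Apply watershed; boundary pixels are labeled -1 and drawn in green
markers = cv2.watershed(img,markers)
img[markers == -1] = [0,255,0]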

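The preprocessing steps build a marker image of sure foreground, sure background and unknown regions before watershed is applied. On an image of touching coins, the result draws a boundary around each coin, though the 0.7 distance threshold may need tuning for other images.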

Foreground Extraction with GrabCut Algorithm

GrabCut is an interactive segmentation method. It allows a user to draw an initial bounding box around the foreground object to extract. It then iteratively refines the segmentation by modeling the foreground and background colors with Gaussian mixture models.

Here is an example with OpenCV:

import numpy as np 
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('messi.jpg')
mask = np.zeros(img.shape[:2],np.uint8) 

bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)

rect = (50,50,450,290)  # (x, y, width, height) of the initial foreground box
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)

mask = np.where((mask==2)|(mask==0),0,1).astype('uint8')
img = img*mask[:,:,np.newaxis]

We initialize a rectangular region around Messi. GrabCut then evolves the segmentation to tightly fit just the foreground object.

Developing Python Image Processing Projects

Image processing is an exciting field with many real-world applications. Here are some ideas for Python image processing projects you can develop to put your skills to use:

Image Classification Projects with Python and OpenCV

Image classification involves training machine learning models to categorize images into different classes. Here are some project ideas:

  • Build a custom image classifier to detect specific objects. Gather images of those objects, label them, and train a convolutional neural network with a deep learning framework such as TensorFlow or PyTorch, using OpenCV for image loading and preprocessing. This could be used for quality control in manufacturing, identifying wildlife with camera traps, or even detecting ripe produce.

  • Create a classifier that can identify plant diseases from leaf images. Collect images of healthy and infected plant leaves, label them by disease type, and train a model to categorize new leaf images by disease. This could help farmers identify crop infections early to prevent spread.

  • Develop an app that identifies dog breeds from user-submitted photos. Use transfer learning with a pre-trained model like ResNet50 to retrain the final layer, adding new output classes for different dog breeds. Collect labeled dog photos to train the classifier.

The key steps are gathering a dataset, labeling images, training/validating/testing models, and exporting the model to production. Use data augmentation, hyperparameter tuning, and techniques like transfer learning to improve accuracy.

Object Localization and Detection with Python

Locating and drawing bounding box regions around objects in images is another useful application of computer vision. Project ideas include:

  • Face detection app that draws boxes around faces in images. Use Haar cascades with OpenCV to identify facial features. Could be used to automatically tag people in photos.

  • Traffic camera analyzer that highlights all vehicles in a traffic video feed. Use background subtraction and contour detection to identify cars and trucks and draw boxes around them. Useful for automated traffic monitoring.

  • Product scanner that locates retail products on store shelves. Train an object detection model on product images and apply it to shelf images to identify and highlight items. Assist with inventory audits and checking stock levels.

The key techniques are training object detection models like SSD and YOLOv3 or using Haar cascades for things like faces. Outputs are bounding box regions identifying object locations.

Creating an Image Search Engine with Python

Building a reverse image search engine lets people discover similar images. Project ideas include:

  • Fashion image search site for finding clothing and accessory ideas. Allow image uploads and return visually similar catalog images, linking to shopping options.

  • Interior design search tool for matching furniture and decor styles. Index interior images and return the most similar images from the database to user uploads.

  • Plagiarism checker that compares figures or scanned pages in essay submissions against indexed sources to detect copied work. Use image hashing to compare incoming images to indexed original content.

Use perceptual image hashing to give images a fingerprint. Index the hashes for storage and fast lookup. Calculate hash of search images and find closest matches in index using a distance metric like Hamming distance.
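The hashing and matching steps above can be sketched with one of the simplest perceptual hashes, the average hash: shrink the image to a small grid, then record which cells are brighter than the grid's mean. This NumPy version is a simplified illustration (block averaging stands in for proper resizing, and the function names are hypothetical); production systems usually rely on a dedicated hashing library.

```python
import numpy as np

def average_hash(gray, hash_size=8):
    """Return a flat boolean array of hash_size**2 bits for a 2D grayscale image."""
    h, w = gray.shape
    bh, bw = h // hash_size, w // hash_size
    # Crop so the image divides evenly, then average each block (a crude downscale).
    cropped = gray[:bh * hash_size, :bw * hash_size].astype(np.float64)
    small = cropped.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    return (small > small.mean()).flatten()

def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(hash_a != hash_b))

# Synthetic images: a horizontal gradient, a dimmer copy, and a vertical gradient.
base = np.tile(np.linspace(0, 255, 64), (64, 1))
dimmer = base * 0.8 + 20
different = base.T

d_similar = hamming_distance(average_hash(base), average_hash(dimmer))
d_different = hamming_distance(average_hash(base), average_hash(different))
print(d_similar, d_different)
```

A small Hamming distance indicates visually similar images; the cutoff for calling two images a match is application-specific and has to be chosen empirically.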

The key skills are building an image database, generating hashes, indexing for search, and writing matching logic. These allow building versatile search apps.

Conclusion: Mastering Python Image Processing

Python is a versatile programming language that offers powerful image processing capabilities. By following this tutorial, you have learned key image processing techniques in Python:

  • Resizing, cropping, rotating and flipping images with OpenCV and Pillow
  • Applying filters for blurring, sharpening and edge detection
  • Segmenting images with thresholding, watershed and GrabCut
  • Loading, saving and visualizing images with OpenCV, Pillow and Matplotlib
  • Planning projects for classification, object detection and image search

With these fundamentals, you can now confidently take on more advanced Python computer vision and image analysis projects. Check out the OpenCV and Pillow documentation to continue expanding your skills. Additionally, active communities like PyImageSearch provide code examples and applied tutorials on cutting-edge techniques.

By mastering Python image processing, you open up possibilities in diverse fields like medical imaging, satellite imagery analysis, machine inspection systems, facial recognition, and more. This versatile skill set will serve you well in both research and industry applications.
