
How To Extract Text from an Image Using Python?

Normally, image-to-text conversion is done with OCR-based tools and applications. However, if you feel that such software is not accurate enough or not safe to use, you can take matters into your own hands and do the same thing yourself.

In this post, we will look at Python code that you can use to convert an image to text without relying on conventional tools and applications. We will use Tesseract OCR for Windows for this process.

Steps to Extract Text from Image Using Python

The following steps extract text from an image using a Python application:

Install Tesseract OCR on Windows

Download the Tesseract installer for Windows, either 32-bit or 64-bit, depending on your operating system, and run it. Next, add the Tesseract installation path to the Path entry in the Environment Variables window under System Variables.

Installing Modules

After installing Tesseract, install the two Python modules that the code in this post imports: pytesseract and opencv-python. The Tesseract engine itself comes from the Windows installer in the previous step, not from pip. The pytesseract wrapper provides an interface to the Tesseract-OCR engine. It can also be used as a standalone invocation script for Tesseract.

pytesseract can read all image formats supported by the Pillow and Leptonica libraries, including PNG, JPEG, GIF, and BMP. You can install the packages with the pip tool.

pip install opencv-python

pip install pytesseract
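Before moving on, it can help to confirm that the modules are visible to the interpreter you plan to run the script with. The sketch below is one way to do that using only the standard library. The helper name missing_modules is made up for this illustration, and note that opencv-python is imported under the name cv2.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # opencv-python is imported as cv2; pytesseract keeps its package name.
    missing = missing_modules(["cv2", "pytesseract"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If a module is reported as missing, the usual cause is that pip installed it into a different Python environment than the one running the script.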

Python Code to Extract Text From Image Using Tesseract

Let us assume that you have a test image named quotes.jpg in the working directory. As a first step, create a Python file and import the required modules.

 

# text recognition

import cv2

import pytesseract

 

Next, use the imread() function to load the test image from the specified location.

 

# read image

img = cv2.imread('quotes.jpg')

 

Next, set the custom configuration options: the language (-l), the OCR engine mode (--oem), and the page segmentation mode (--psm).

 

# configurations

config = '-l eng --oem 1 --psm 3'
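To make the three options easier to change, the configuration string can be assembled from named values. The following is a minimal sketch, not part of pytesseract: build_config is a hypothetical helper, and the accepted ranges (0 to 3 for --oem, 0 to 13 for --psm) are assumed from the values listed in Tesseract's command-line help. There, --oem 1 selects the LSTM neural network engine and --psm 3 is fully automatic page segmentation without orientation and script detection.

```python
def build_config(lang="eng", oem=1, psm=3):
    """Build a Tesseract option string such as '-l eng --oem 1 --psm 3'.

    Hypothetical helper for illustration; the ranges are assumptions
    based on Tesseract's command-line help.
    """
    if oem not in range(4):
        raise ValueError("oem must be between 0 and 3")
    if psm not in range(14):
        raise ValueError("psm must be between 0 and 13")
    return f"-l {lang} --oem {oem} --psm {psm}"


print(build_config())       # -l eng --oem 1 --psm 3
print(build_config(psm=6))  # -l eng --oem 1 --psm 6
```

The returned string can be passed as the config argument of image_to_string() in place of the hard-coded value.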

 

If you have not yet added Tesseract to your System variables PATH, point pytesseract to the Tesseract executable as follows.

 

# pytesseract: path to the Tesseract executable

pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
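If you are not sure whether Tesseract is already on your PATH, a small check like the one below can decide whether the assignment above is needed. This is only a sketch: find_tesseract and DEFAULT_WINDOWS_PATH are names invented for this example, and the default location is an assumption taken from the path used in this post.

```python
import os
import shutil

# Assumed default install location; adjust it to match your system.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallbacks=(DEFAULT_WINDOWS_PATH,)):
    """Return a path to the tesseract executable, or None if it is not found."""
    # First look for the executable on the PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise try the known fallback locations in order.
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(find_tesseract())
```

When the function returns a path, that value can be assigned to pytesseract.pytesseract.tesseract_cmd. When it returns None, Tesseract is most likely not installed or is installed somewhere else.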

 

Next, convert the image to a string with the image_to_string() function, passing in the configuration defined earlier.

 

text = pytesseract.image_to_string(img, config=config)

 

In the final step, split the extracted text into lines and print it.

 

# print text

text = text.split('\n')

print(text)
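Splitting on newlines usually leaves empty strings in the list, and the raw OCR output may end with extra whitespace or a form feed character. A minimal cleanup sketch is shown below. The function name clean_lines and the sample string are made up for this example and are not real OCR output.

```python
def clean_lines(text):
    """Split OCR output into lines, dropping blank lines and stray whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical OCR output used only to demonstrate the cleanup.
sample = "Be yourself;\n\neveryone else is taken.\n\x0c"
print(clean_lines(sample))  # ['Be yourself;', 'everyone else is taken.']
```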

Complete Code: Extract Text from Image Using Python

Now merge all the code above and run it.

 

# text recognition

import cv2

import pytesseract

# read image

img = cv2.imread('quotes.jpg')

# configurations

config = '-l eng --oem 1 --psm 3'

# pytesseract: path to the Tesseract executable

pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'

text = pytesseract.image_to_string(img, config=config)

# print text

text = text.split('\n')

print(text)
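One thing the script above does not handle is a missing or unreadable image: cv2.imread() returns None instead of raising an error, and the failure only shows up later. The sketch below wraps the same steps in a function with two explicit checks. The function name extract_lines is an assumption for this example, and it expects Tesseract to be reachable through the PATH or through tesseract_cmd as described earlier.

```python
import os


def extract_lines(image_path, config="-l eng --oem 1 --psm 3"):
    """Return the non-empty lines of text recognized in the image."""
    if not os.path.isfile(image_path):
        raise FileNotFoundError(f"No such image: {image_path}")

    # Imported here so the path check above runs before the OCR stack loads.
    import cv2
    import pytesseract

    img = cv2.imread(image_path)
    if img is None:
        # imread returns None when the file cannot be decoded as an image.
        raise ValueError(f"Could not read image: {image_path}")

    text = pytesseract.image_to_string(img, config=config)
    return [line.strip() for line in text.split("\n") if line.strip()]


if __name__ == "__main__":
    try:
        print(extract_lines("quotes.jpg"))
    except (FileNotFoundError, ValueError) as err:
        print(err)
```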

 

Running the code above prints a Python list with one entry for each line of text recognized in the image. Tesseract is best suited to cleanly scanned documents, and the extracted text can easily be saved to Word or any other format you need.

 

Extract Text from Image and Save to a Text File

This approach first converts the image to grayscale, applies a threshold, and specifies the shape and size of the kernel used for dilation. It then locates the contours and loops over them, cropping the rectangular area around each one. Finally, each cropped area is passed to pytesseract, and the extracted text is written to a text file.
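The script that follows uses the cv2.THRESH_OTSU flag, which picks the threshold value automatically instead of using a fixed one. As a rough illustration of the idea, the pure-Python sketch below chooses the threshold that maximizes the variance between the dark and bright pixel groups. It is a teaching sketch under the assumption of 8-bit gray values, not OpenCV's implementation, which may differ in details such as tie-breaking.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for an iterable of 8-bit gray values (0-255)."""
    # Build the histogram of gray values.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical image data: half dark pixels, half bright pixels.
t = otsu_threshold([10] * 50 + [200] * 50)
print(t)
```

Any threshold that lands between the two groups separates them equally well, which is why the method suits images with dark text on a light background or the reverse.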

 

# import modules

import cv2

import pytesseract

 

# read image

img = cv2.imread('quotes.png')

 

# set configurations

config = '-l eng --oem 1 --psm 3'

 

# Convert the image to gray scale

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

 

# Apply Otsu thresholding

ret, threshimg = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

 

# Specifying kernel size and structure shape. 

rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))

 

# Applying dilation on the threshold image

dilation = cv2.dilate(threshimg, rect_kernel, iterations = 1)

 

# getting contours

img_contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

 

# Loop over the contours, crop each text block and append its text to the file

for cnt in img_contours:

    x, y, w, h = cv2.boundingRect(cnt)

    # Drawing a rectangle around the text block

    rect = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Cropping the text block

    cropped_img = img[y:y + h, x:x + w]

    # Applying Tesseract OCR on the cropped image

    text = pytesseract.image_to_string(cropped_img)

    # Appending the text to the file; the with block closes the file automatically

    with open("recognized.txt", "a") as file:

        file.write(text)

        file.write("\n")
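The order in which contours are returned is not guaranteed to match reading order, so the blocks may end up in the text file out of sequence. One possible remedy is to sort the bounding boxes before looping over them. The sketch below is a simple heuristic rather than a robust layout analysis: sort_reading_order and its row_tolerance parameter are names invented for this example, and the boxes are hypothetical values.

```python
def sort_reading_order(boxes, row_tolerance=10):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right.

    Boxes whose top edges fall into the same band of `row_tolerance`
    pixels are treated as belonging to the same row.
    """
    return sorted(boxes, key=lambda b: (b[1] // row_tolerance, b[0]))


# Hypothetical bounding boxes in the order a contour search might return them.
boxes = [(200, 52, 80, 20), (10, 300, 90, 20), (10, 55, 100, 20)]
print(sort_reading_order(boxes))
# [(10, 55, 100, 20), (200, 52, 80, 20), (10, 300, 90, 20)]
```

In the script above, the boxes could be collected with cv2.boundingRect() for every contour, sorted with this function, and then cropped and recognized in the sorted order.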

 

Running the code above appends the text recognized in each block to the recognized.txt file.

Conclusion

And there you have it. That is the code that you can use to convert images to text using Python. If you find the process difficult and time-consuming, you can do the same thing with an OCR tool. But then again, for some users, the privacy concerns around these tools can be a deal-breaker.
