RetroSearch Browse

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Showing content from https://www.geeksforgeeks.org/python/text-detection-and-extraction-using-opencv-and-ocr/ below:

Text Detection and Extraction using OpenCV and OCR

Last Updated : 12 Jul, 2025

Optical Character Recognition (OCR) is a technology used to extract text from images which is used in applications like document digitization, license plate recognition and automated data entry. In this article, we explore how to detect and extract text from images using OpenCV for image processing and Tesseract OCR for text recognition.

Before we start we need to install required libraries using following commands:

!pip install opencv-python

!pip install pytesseract

!sudo apt-get install tesseract-ocr

Step 1: Importing Required Packages

Import the required Python libraries like OpenCV, pytesseract and matplotlib.

Python


 import cv2
import pytesseract
from matplotlib import pyplot as plt

Step 2: Loading Image

Now we will load the image using cv2.imread() function. OpenCV loads images in BGR format but matplotlib expects RGB format so we convert it using cv2.cvtColor().

Python


 image_path = "/mnt/data/08d702e9-fb88-4a5b-98ce-9ad0bc19afd9.png"
image = cv2.imread(image_path)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

Step 3: Converting Image to Grayscale

Convert the input image to grayscale to simplify the image and remove color information.

Python


 gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print("Grayscale Image:")
cv2_imshow(gray)

Step 4: Displaying Original Image

Let's see original image with the help of matplotlib.

Python


 plt.figure(figsize=(10, 6))
plt.imshow(image_rgb)
plt.title("Original Image")
plt.axis("off")
plt.show()

Output:

Sample Image Step 5: Extracting Text from Image

In this step we extract text from the image with the help of pytesseract.image_to_string().

Python


 extracted_text = pytesseract.image_to_string(image_rgb)
print(" Extracted Text:\n")
print(extracted_text)

Output:

extracted text Step 6: Drawing Bounding Boxes Around Detected Text

Now we loop through each detected word and draw a blue box around it using cv2.reactangle.

Python


 data = pytesseract.image_to_data(image_rgb, output_type=pytesseract.Output.DICT)

n_boxes = len(data['level'])
for i in range(n_boxes):
    (x, y, w, h) = (data['left'][i], data['top'][i], data['width'][i], data['height'][i])
    cv2.rectangle(image_rgb, (x, y), (x + w, y + h), (255, 0, 0), 2)

Finally this shows the processed image with bounding boxes around the detected text.

Python


 plt.figure(figsize=(10, 6))
plt.imshow(image_rgb)
plt.title("Image with Text Bounding Boxes")
plt.axis("off")
plt.show()

Output:

Output Image

Get complete notebook link here :

Notebook: click here.
Sample Image: click here.

Text Detection and Extraction using OpenCV and OCR

RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4