Last Updated : 06 May, 2025
Speech recognition means converting spoken words into text. It is used in various artificial intelligence applications such as home automation and speech-to-text transcription. In this article, you'll learn how to do basic speech recognition in Python using the Google Speech Recognition API.
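Step 1: Install Required Library
We'll use the SpeechRecognition library in Python. To install it, open your terminal or command prompt and run:
pip install SpeechRecognition
In a Google Colab or Jupyter notebook cell, prefix the command with an exclamation mark: !pip install SpeechRecognition
Step 2: Upload your Audio File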
When you run the code below in Google Colab, it asks you to upload a file from your computer. Use a clear .wav file for best results. You can download a sample audio file from here.
Python
from google.colab import files
uploaded = files.upload()
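Before sending a file to the recognizer, it can help to confirm that it really is a readable WAV file. The following is a minimal sketch using Python's standard wave module, which reads uncompressed PCM WAV files. The helper name describe_wav is hypothetical and not part of the SpeechRecognition library, and the article does not define what counts as a "clear" file, so this only reports basic properties.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed PCM WAV file.

    Hypothetical helper for illustration; raises wave.Error if the
    file is not a WAV file that the wave module can read.
    """
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": sample_rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            # Duration is the number of frames divided by frames per second
            "duration_seconds": frame_count / sample_rate,
        }
```

In Colab, this could be called as describe_wav(list(uploaded.keys())[0]) once the upload has finished.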
Step 3: Convert Audio to Text
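The code below loads the uploaded file and converts it to text. If you already have an audio file such as a .wav file on disk, you can use the same method by setting filename to its path: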
Python
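import speech_recognition as sr

recognizer = sr.Recognizer()

# Name of the first file uploaded in Step 2
filename = list(uploaded.keys())[0]

# Read the entire audio file into memory
with sr.AudioFile(filename) as source:
    print("Reading audio...")
    audio_data = recognizer.record(source)

try:
    # Send the audio to Google's speech recognition service
    text = recognizer.recognize_google(audio_data)
    print("\nRecognized Text:")
    print(text)
except sr.UnknownValueError:
    # The service responded but could not make out any speech
    print("Sorry, could not understand the audio.")
except sr.RequestError:
    # The request failed, for example because there is no network connection
    print("Could not connect to Google API.")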
Output:
[Figure: Speech to text]
Speech recognition in Python is straightforward with the help of the Google Speech API. You can use your voice to control programs, take notes or even build voice assistants.
You can download source code from here.
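For use outside a notebook, the Step 3 logic could be wrapped in a reusable function. This is a sketch under a few assumptions: the name transcribe_wav and the choice to return None on failure are illustrative, not part of the library, and the import sits inside the function only so the helper can be defined before the package is installed. The language argument is passed through to recognize_google, which accepts a language tag such as "en-US".

```python
def transcribe_wav(path, language="en-US"):
    """Return the recognized text for an audio file, or None on failure.

    Hypothetical helper: the name and the None-on-failure convention
    are assumptions made for this example.
    """
    # Imported lazily so defining the helper does not require the package
    import speech_recognition as sr

    recognizer = sr.Recognizer()

    # Read the entire file into memory
    with sr.AudioFile(path) as source:
        audio_data = recognizer.record(source)

    try:
        return recognizer.recognize_google(audio_data, language=language)
    except sr.UnknownValueError:
        # The service could not make out any speech in the audio
        return None
    except sr.RequestError as error:
        # The request itself failed, for example with no network access
        print(f"Could not reach the recognition service: {error}")
        return None
```

A call such as text = transcribe_wav("sample.wav") would then return either the transcript or None, which leaves it to the caller to decide how to report a failure.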