Convert text into natural-sounding speech using an API powered by the best of Google’s AI technologies.
New customers get up to $300 in free credits to try Text-to-Speech and other Google Cloud products.
Improve customer interactions with intelligent, lifelike responses
Engage users with a voice user interface in your devices and applications
Personalize your communication based on each user’s preferred voice and language
Benefits
High fidelity speech
Deploy Google’s groundbreaking technologies to generate speech with humanlike intonation. Built on DeepMind’s speech synthesis expertise, the API delivers voices of near-human quality.
One-of-a-kind voice
Create a unique voice to represent your brand across all your customer touchpoints, instead of using a common voice shared with other organizations.
Demo
Put Text-to-Speech into action
Type your text, select a language, then click “Speak It” to hear it.
Key features
Chirp 3: HD voices
Build engaging agents using the latest spontaneous conversational voices based on AudioLM. These voices offer high-quality audio, low-latency streaming, and natural-sounding speech, incorporating human disfluencies and accurate intonation.
Studio voices
Dazzle your listeners with professionally narrated content recorded in a studio-quality environment. Make sure to put your headphones on.
You can now generate dialogues with multiple speakers to create your most interactive scenarios.
Neural2 voices
Internationalize your voice experience with ready-to-use voices powered by the latest research behind Custom Voice.
Custom Voice
Train a custom voice model using your own audio recordings to create a unique and more natural-sounding voice for your organization. You can define and choose the voice profile that suits your organization and quickly adjust to changes in voice needs without needing to record new phrases.
Text and SSML support
Customize your speech with SSML tags that allow you to add pauses, numbers, date and time formatting, and other pronunciation instructions.
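As a rough illustration of the SSML support described above, the sketch below assembles a small SSML document with a pause and a date. The helper names (text, pause, say_as, speak) are hypothetical and not part of any Google library; only the tags <speak>, <break>, and <say-as> come from SSML itself. It uses the Python standard library only and does not call the API.

```python
from xml.sax.saxutils import escape, quoteattr


def text(s):
    """Escape plain text (&, <, >) so it cannot break the SSML markup."""
    return escape(s)


def pause(ms):
    """Return a <break> element that inserts a pause of `ms` milliseconds."""
    return '<break time="%dms"/>' % ms


def say_as(value, interpret_as, fmt=None):
    """Wrap a value in <say-as>, for example a date or a cardinal number."""
    attrs = " interpret-as=" + quoteattr(interpret_as)
    if fmt is not None:
        attrs += " format=" + quoteattr(fmt)
    return "<say-as" + attrs + ">" + escape(value) + "</say-as>"


def speak(*fragments):
    """Join already-built fragments into one <speak> document."""
    return "<speak>" + "".join(fragments) + "</speak>"


# Example: a sentence with a formatted date, a half-second pause, and a sign-off.
ssml = speak(
    text("Your order ships on "),
    say_as("2024-03-15", "date", "yyyymmdd"),
    pause(500),
    text("Thanks & goodbye."),
)
print(ssml)
```

The resulting string would be sent as SSML input rather than plain text. Keeping escaping in one place (text) avoids malformed markup when user-supplied text contains characters such as & or <.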
What's new
Sign up for Google Cloud newsletters to receive product updates, event information, special offers, and more.
Documentation
Text-to-Speech basics
A guide to the fundamental concepts of using the Text-to-Speech API.
Quickstart: Using the command line
Set up your Google Cloud project and authorization and make a request for Text-to-Speech to create audio from text.
Supported voices and languages
Browse guides and resources for this product.
Custom Voice (beta) overview
Learn how you can create a unique and more natural-sounding voice with Custom Voice using your own studio-quality audio recordings.
WaveNet and other synthetic voices
Learn about the different synthetic voices available for use in Text-to-Speech, including the premium WaveNet voices.
Speaking addresses with SSML
This tutorial demonstrates how to use Speech Synthesis Markup Language (SSML) to speak a text file of addresses.
Not seeing what you’re looking for? View all product documentation
Use cases
Voice generation in devices
Enable natural communication with your users by letting your devices read text aloud in humanlike voices. Build an end-to-end voice user interface together with Speech-to-Text and Natural Language to improve the user experience with easy and engaging interactions.
Accessible EPGs (Electronic Program Guides)
Add text-to-speech to your EPGs so they can read text aloud, providing a better user experience to your customers and meeting accessibility requirements for your services and applications. Try the EPG demo.
View all technical guides
All features
Custom Voice
Train a custom speech synthesis model using your own audio recordings to create a unique and more natural-sounding voice for your organization. You can define and choose the voice profile that suits your organization and quickly adjust to changes in voice needs without needing to record new phrases. Learn more.
Long audio synthesis
Voice and language selection
Choose from an extensive selection of 220+ voices across 40+ languages and variants, with more to come soon.
WaveNet voices
Take advantage of 90+ WaveNet voices built on DeepMind’s groundbreaking research to generate speech that significantly closes the gap with human performance.
Text and SSML support
Customize your speech with SSML tags that allow you to add pauses, numbers, date and time formatting, and other pronunciation instructions.
Pitch tuning
Personalize the pitch of your selected voice, up to 20 semitones above or below the default.
Speaking rate tuning
Adjust your speaking rate to be up to 4x faster or slower than the normal rate.
Volume gain control
Increase the volume of the output by up to 16 dB or decrease it by up to 96 dB.
Integrated REST and gRPC APIs
Easily integrate with any application or device that can send a REST or gRPC request, including phones, PCs, tablets, and IoT devices (for example, cars, TVs, and speakers).
Audio format flexibility
Convert text to MP3, Linear16, OGG Opus, and a number of other audio formats.
Audio profiles
Optimize for the type of speaker from which your speech is intended to play, such as headphones or phone lines.
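To make the tuning ranges, audio formats, and audio profiles above more concrete, here is a minimal sketch of how a REST request body might be assembled and a response decoded. The function names build_request and decode_audio are hypothetical. The JSON field names (input, voice, audioConfig, audioEncoding, speakingRate, pitch, volumeGainDb, effectsProfileId, audioContent) and the endpoint URL reflect the v1 REST API as best understood here and should be checked against the API reference. The sketch performs no network call; authentication and sending are out of scope.

```python
import base64
import json

# Assumed v1 REST endpoint; a real call also needs credentials, not shown here.
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"

# Ranges taken from the feature list above.
PITCH_RANGE = (-20.0, 20.0)   # semitones relative to the default pitch
RATE_RANGE = (0.25, 4.0)      # 4x slower .. 4x faster than normal
GAIN_RANGE = (-96.0, 16.0)    # dB relative to the default volume


def _check(name, value, bounds):
    """Return value unchanged, or raise if it falls outside the given bounds."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return value


def build_request(text, language_code="en-US", voice_name=None,
                  encoding="MP3", pitch=0.0, speaking_rate=1.0,
                  volume_gain_db=0.0, effects_profile=None, ssml=False):
    """Build the JSON-serializable body for a synthesize request."""
    audio_config = {
        "audioEncoding": encoding,  # e.g. "MP3", "LINEAR16", "OGG_OPUS"
        "pitch": _check("pitch", pitch, PITCH_RANGE),
        "speakingRate": _check("speaking_rate", speaking_rate, RATE_RANGE),
        "volumeGainDb": _check("volume_gain_db", volume_gain_db, GAIN_RANGE),
    }
    if effects_profile is not None:
        # Audio profile, e.g. "headphone-class-device".
        audio_config["effectsProfileId"] = [effects_profile]
    voice = {"languageCode": language_code}
    if voice_name is not None:
        voice["name"] = voice_name
    return {
        "input": {"ssml" if ssml else "text": text},
        "voice": voice,
        "audioConfig": audio_config,
    }


def decode_audio(response_json):
    """Extract raw audio bytes from a response whose audio is base64-encoded."""
    return base64.b64decode(json.loads(response_json)["audioContent"])


body = build_request("Hello there.", pitch=2.0, speaking_rate=1.25,
                     effects_profile="headphone-class-device")
print(json.dumps(body, indent=2))
```

Validating the ranges client-side is a design choice of this sketch: it surfaces an out-of-range pitch or rate before a request is sent, rather than relying on an error from the service.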
Pricing
Text-to-Speech is priced based on the number of characters sent to the service to be synthesized into audio each month. The first 1 million characters for WaveNet voices are free each month. For Standard (non-WaveNet) voices, the first 4 million characters are free each month. After the free tier has been reached, Text-to-Speech is priced per 1 million characters of text processed.
If you pay in a currency other than USD, the prices listed in your currency on Google Cloud SKUs apply.
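The free-tier arithmetic described above can be sketched as follows. The free allowances (1 million characters for WaveNet, 4 million for Standard) come from this page; the per-million rate is not stated here, so price_per_million is a caller-supplied assumption, and the 10.0 used in the example is a placeholder rather than a real price. The function names are hypothetical.

```python
# Monthly free allowance in characters, as stated in the pricing text above.
FREE_CHARS = {"wavenet": 1_000_000, "standard": 4_000_000}


def billable_characters(chars_this_month, voice_type):
    """Characters charged after subtracting the monthly free tier."""
    return max(0, chars_this_month - FREE_CHARS[voice_type])


def estimate_cost(chars_this_month, voice_type, price_per_million):
    """Estimated monthly cost; price_per_million must be looked up separately."""
    billable = billable_characters(chars_this_month, voice_type)
    return billable / 1_000_000 * price_per_million


# Placeholder rate of 10.0 per million characters, for illustration only.
print(billable_characters(2_500_000, "wavenet"))   # 1500000
print(estimate_cost(2_500_000, "wavenet", 10.0))   # 15.0
```

Usage below the free tier yields zero billable characters, so a month with 500,000 WaveNet characters would cost nothing under this model.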
Take the next step
New customers get $300 in free credits to try Text-to-Speech and other Google Cloud products.