Many Google products involve speech recognition. For example, Google Assistant allows you to ask for help by voice, Gboard lets you dictate messages to your friends, and Google Meet provides auto captioning for your meetings.
Speech technologies increasingly rely on deep neural networks, a type of machine learning that helps us build more accurate and faster speech recognition models. Generally, deep neural networks need large amounts of data to work well and improve over time. This process of improvement is called model training.
What technologies we use to train speech models

Google’s speech team uses 3 broad classes of technologies to train speech models: conventional learning, federated learning, and ephemeral learning. Depending on the task and situation, some of these are more effective than others, and in some cases, we use a combination of them. This allows us to achieve the best quality possible, while providing privacy by design.
Conventional learning

Conventional learning is how most of our speech models are trained.
How conventional learning works to train speech models

When training on equal amounts of data, supervised training typically results in better speech recognition models than unsupervised training because the annotations are higher quality. On the other hand, unsupervised training can learn from more audio samples since it learns from machine annotations, which are easier to produce.
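The trade-off above can be illustrated with a small sketch. The `machine_annotate` helper and the teacher model here are hypothetical stand-ins (a pseudo-labeling setup), not Google's actual pipeline: clips with human transcripts are used as-is, and the rest are labeled automatically by a pretrained model so far more audio can be used.

```python
# Sketch contrasting human annotations (supervised) with machine
# annotations (unsupervised / pseudo-labeling). All names here are
# illustrative stand-ins, not Google's internal APIs.

def machine_annotate(audio_clip, teacher_model):
    # A pretrained "teacher" model produces a transcript automatically;
    # cheaper than human labeling, but typically noisier.
    return teacher_model(audio_clip)

def build_training_set(clips, human_labels, teacher_model):
    examples = []
    for clip in clips:
        if clip in human_labels:
            # Supervised: higher-quality human transcript.
            examples.append((clip, human_labels[clip], "human"))
        else:
            # Unsupervised: machine annotation lets us use far more audio.
            examples.append((clip, machine_annotate(clip, teacher_model), "machine"))
    return examples

# Toy usage: three clips, only one of which has a human transcript.
clips = ["clip_a", "clip_b", "clip_c"]
human_labels = {"clip_a": "hello world"}
teacher = lambda clip: f"<auto transcript of {clip}>"
dataset = build_training_set(clips, human_labels, teacher)
```

The point of the sketch is only the data-mixing decision: every clip becomes a training example, but the label source (and hence label quality) differs.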
How your data stays private

Learn more about how Google keeps your data private.
Federated learning

Federated learning is a privacy-preserving technique developed at Google to train AI models directly on your phone or other device. We use federated learning to train a speech model when the model runs on your device and data is available for the model to learn from.
How federated learning works to train speech models

With federated learning, we train speech models without sending your audio data to Google’s servers.
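A minimal sketch of the general federated-averaging idea (not Google's production system): each device computes a model update from its own audio locally, and only the weight deltas, never the raw audio, are sent to a server, which averages them into the shared model. The gradients below are hypothetical placeholders for what on-device training would produce.

```python
# Minimal federated-averaging sketch: raw audio never leaves the
# device; only weight deltas are aggregated server-side.

def local_update(global_weights, local_gradient, lr=0.1):
    # Runs on the device. The gradient is computed from on-device
    # audio; only the resulting weight delta leaves the phone.
    new_weights = [w - lr * g for w, g in zip(global_weights, local_gradient)]
    return [nw - w for nw, w in zip(new_weights, global_weights)]

def federated_round(global_weights, device_gradients):
    # Server side: average the deltas from participating devices
    # and apply them to the shared model.
    deltas = [local_update(global_weights, g) for g in device_gradients]
    avg_delta = [sum(ds) / len(deltas) for ds in zip(*deltas)]
    return [w + d for w, d in zip(global_weights, avg_delta)]

# Toy round with two devices and hypothetical local gradients.
weights = [0.0, 0.0]
grads = [[1.0, 2.0], [3.0, 4.0]]
weights = federated_round(weights, grads)
```

The design choice this illustrates is the privacy boundary: the server only ever sees aggregated model deltas, not the audio that produced them.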
Ephemeral learning

Ephemeral learning is a privacy-preserving technique we use when the speech model runs on Google’s servers.
How ephemeral learning works to train speech models

With ephemeral learning, your audio data samples are:
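Assuming the usual public description of ephemeral learning, where samples are held only briefly in server memory, used for a training step, and then discarded, the flow might be sketched as follows. Every function name here is a hypothetical stand-in, not Google's actual implementation:

```python
# Hedged sketch of an ephemeral-learning step: the audio sample lives
# only in short-lived memory, contributes one model update, and is
# discarded without ever being written to durable storage.

def extract_features(sample):
    # Stand-in featurizer: one number per "chunk" of audio.
    return [float(len(chunk)) for chunk in sample.split()]

def compute_update(features, weights):
    # Stand-in "gradient": scale features down to a small update.
    return [0.01 * f for f in features[: len(weights)]]

def ephemeral_training_step(audio_sample, model_weights):
    features = extract_features(audio_sample)   # held in RAM only
    update = compute_update(features, model_weights)
    # Drop the raw sample and features as soon as the update exists;
    # nothing is persisted.
    del audio_sample, features
    return [w + u for w, u in zip(model_weights, update)]

# Toy usage: one sample updates the model, then is gone.
weights = [0.0, 0.0]
weights = ephemeral_training_step("hello there world", weights)
```

The contrast with federated learning is where the computation happens: here the model update is computed server-side, so the privacy guarantee comes from how briefly the data exists rather than from where it is processed.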
We’ll continue to use all 3 technologies, often in combination for higher quality. We’re also actively working to improve both federated and ephemeral learning for speech technologies. Our goal is to make them more effective and useful in ways that preserve privacy by default.