This guide will get you up and running with Spokestack for Python, and you'll have a voice interface in your application in no time.
Installation
System Dependencies
There are a few system dependencies that need to be installed before spokestack can be installed via pip.
macOS
brew install lame portaudio
Debian/Ubuntu
sudo apt-get install portaudio19-dev libmp3lame-dev
Windows
We currently do not support Windows 10 natively, and recommend you install Windows Subsystem for Linux (WSL) with the Debian dependencies. However, if you would like to work on native Windows support, we gladly accept pull requests.
Another potential avenue for using Spokestack on Windows 10 is via Anaconda. PortAudio can be installed via conda, but Lame cannot; hence, microphone input will be supported, but text-to-speech will not.
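As an illustration, PortAudio can usually be pulled from the anaconda channel (exact channel availability may vary by platform):
conda install -c anaconda portaudio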
Once the system dependencies have been satisfied, you can install the library with the following:
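pip install spokestack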
Setup
We use pyenv for virtual environments.
pyenv install 3.8.6
pyenv virtualenv 3.8.6 spokestack
pyenv local spokestack
pip install -r requirements.txt
Install TensorFlow
This library requires a way to run TFLite models. There are two ways to add this capability. The first is installing the full TensorFlow library:
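pip install tensorflow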
In use cases that require a small footprint, such as a Raspberry Pi or similar Internet of Things (IoT) device, you will want to install only the TFLite Interpreter instead. You can install it for your platform by following the TensorFlow Lite instructions.
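On many platforms, the standalone interpreter is published as the tflite-runtime package, so the install might look like the following (package availability depends on your platform and Python version; the official instructions are the authoritative source):
pip install tflite-runtime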
Integration
In order for your application to use Spokestack's features, there are a few things you will need: an NLU model and a SpeechPipeline instance.
Go to spokestack.io to set up your own account (it's free!). Once you've got that, go grab one of our free NLU models. We'll use the Highlow one in this example, but you can choose another, or create your own.
Once you've downloaded your NLU, unzip nlu.tar.gz with the three files inside (metadata.json, nlu.tflite, vocab.txt). The location of the directory isn't important, because we will pass the path on initialization.
The PyAudioInput
class will use the system default audio input device. Most personal computers have some form of microphone, but in the case of an embedded device, you may need to purchase a small USB microphone.
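The profile used below constructs the audio input for you, but if you want to see the input class directly, a minimal sketch looks like the following (the module path and constructor arguments are assumptions based on the spokestack-python source and may differ between versions):
from spokestack.io.pyaudio import PyAudioInput

# reads 20 ms frames of 16 kHz audio from the default input device
mic = PyAudioInput(sample_rate=16000, frame_width=20)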
SpeechPipeline Instance
Spokestack's speech pipeline handles collecting audio from the input device and transcribing speech directed at your app. The SpeechPipeline guide has a detailed explanation of how to set up the pipeline, so we will show the quickest way here: using a profile, which configures the pipeline's components for a specific use case. The profile we use here includes wake word activation and speech transcription using Spokestack's cloud ASR.
from spokestack.profile.wakeword_asr import WakewordSpokestackASR

# configure a pipeline with wake word activation and Spokestack cloud ASR
pipeline = WakewordSpokestackASR.create(
    "spokestack_id", "spokestack_secret", model_dir="path_to_tflite_model_dir"
)
pipeline.start()
From text to meaning
Translating the text into an action is the job of the Natural Language Understanding (NLU) component. A great thing about Spokestack NLU models is that they run entirely on device. The NLU can be initialized like this:
from spokestack.nlu.tflite import TFLiteNLU
nlu = TFLiteNLU("path_to_tflite_model_dir")
Input to the NLU model is the ASR transcript. The transcript can be accessed as a property of SpeechContext
. Below is a sample event handler for running inference on the speech transcript.
@pipeline.event
def on_recognize(context):
    results = nlu(context.transcript)
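The result gives you the classified intent, which you can then map to an action in your app. As a rough sketch, extending the handler above (the attribute names on the result object are assumptions; check the NLU documentation for the exact interface):
@pipeline.event
def on_recognize(context):
    results = nlu(context.transcript)
    # assumed attributes on the NLU result: intent, confidence, slots
    print(results.intent, results.confidence, results.slots)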
For more detail on configuring Spokestack's NLU, see the NLU concept guide.
Talking back to your users
If you want the full smart speaker experience, you will need to give your application a voice. This can be achieved with text-to-speech (TTS). For more information on TTS, see the TTS concept guide. TTS playback uses the PyAudioOutput class, which plays audio through the device's default speaker. Like the NLU, TTS can be used in an event handler. Take a look at the example below, which simply speaks what the ASR heard.
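The handler below assumes a tts object has already been created. A minimal sketch of that setup, using the TTS manager and Spokestack cloud TTS client from the spokestack-python library (module and class names as we understand them; consult the TTS concept guide for your version):
from spokestack.io.pyaudio import PyAudioOutput
from spokestack.tts.manager import TextToSpeechManager
from spokestack.tts.clients.spokestack import TextToSpeechClient

# synthesizes text with Spokestack's cloud TTS and plays it
# through the default output device
tts = TextToSpeechManager(
    TextToSpeechClient("spokestack_id", "spokestack_secret"),
    PyAudioOutput(),
)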
@pipeline.event
def on_recognize(context):
    # speak the recognized transcript back to the user
    tts.synthesize(context.transcript)
Conclusion
That's all there is to setting up an application with Spokestack. Your Python application can now accept and respond to voice commands.
Thank you for taking the time to read this!
Related Resources
Want to dive deeper into the world of Python voice integration? We've got a lot to say on the subject: