WakeNet is a wake word engine built upon neural networks for low-power embedded MCUs. Currently, WakeNet supports up to 5 wake words.
Overview

Please see the flow diagram of WakeNet below:
We use the MFCC method to extract speech spectrum features. The input audio has a sample rate of 16 kHz, is mono, and is encoded as signed 16-bit PCM. Each frame has a window width and step size of 30 ms.
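As a quick illustration of these parameters, the sketch below computes the per-frame sample and byte counts implied by a 16 kHz, signed 16-bit, mono stream with 30 ms frames. The constants come straight from the text above; the helper itself is not part of WakeNet.

    #include <stdio.h>
    #include <stdint.h>

    /* Frame parameters stated in the text: 16 kHz, mono, signed 16-bit audio,
     * 30 ms window width and 30 ms step size (non-overlapping frames). */
    #define SAMPLE_RATE_HZ 16000
    #define FRAME_MS       30

    int main(void)
    {
        int samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS / 1000;     /* 480 samples */
        int bytes_per_frame   = samples_per_frame * sizeof(int16_t);  /* 960 bytes   */
        printf("samples per frame: %d, bytes per frame: %d\n",
               samples_per_frame, bytes_per_frame);
        return 0;
    }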
The neural network structure has now been updated to its ninth iteration:
WakeNet1, WakeNet2, WakeNet3, WakeNet4, WakeNet6, and WakeNet7 are no longer in use.
WakeNet5 only supports the ESP32 chip.
WakeNet8 and WakeNet9, which are built on a dilated convolution structure, only support the ESP32-S3 chip.
For a continuous audio stream, we calculate the average recognition result M over several frames to produce a smoothed prediction and improve the accuracy of keyword triggering. A trigger command is sent only when M exceeds the set threshold.
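The sketch below illustrates this smoothing step. The window length, threshold value, and function name are hypothetical placeholders for illustration; the actual values and logic are internal to the WakeNet model.

    #include <stdbool.h>

    /* Hypothetical smoothing window and threshold; the real values are model-internal. */
    #define SMOOTH_WINDOW     7
    #define TRIGGER_THRESHOLD 0.9f

    static float history[SMOOTH_WINDOW];   /* per-frame keyword scores (zero until filled) */
    static int   hist_pos = 0;

    /* Average the most recent per-frame keyword scores and report a trigger
     * only when the smoothed value M exceeds the threshold. */
    bool wakenet_smooth_and_trigger(float frame_score)
    {
        history[hist_pos] = frame_score;
        hist_pos = (hist_pos + 1) % SMOOTH_WINDOW;

        float m = 0.0f;
        for (int i = 0; i < SMOOTH_WINDOW; i++) {
            m += history[i];
        }
        m /= SMOOTH_WINDOW;

        return m > TRIGGER_THRESHOLD;   /* send the trigger command only when M exceeds the threshold */
    }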
Select WakeNet model
Run WakeNet
WakeNet is currently included in the AFE (Audio Front End) and is enabled by default; detection results are returned through the AFE fetch interface.
If users do not need WakeNet, please set:

    afe_config->wakenet_init = false;

If users want to enable/disable WakeNet temporarily, please use:

    afe_handle->disable_wakenet(afe_data);
    afe_handle->enable_wakenet(afe_data);
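For context, a minimal detection loop built on the AFE fetch interface might look like the sketch below. It assumes the ESP-SR AFE interface (names such as ESP_AFE_SR_HANDLE, AFE_CONFIG_DEFAULT, feed, fetch, and WAKENET_DETECTED follow typical ESP-SR usage and may differ between versions); the channel count is a placeholder and audio acquisition is left out.

    #include <stdlib.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include "esp_afe_sr_iface.h"
    #include "esp_afe_sr_models.h"

    /* Placeholder: total number of input channels (mic + reference) for your board. */
    #define TOTAL_CH_NUM 2

    void wakenet_task(void)
    {
        /* Assumed ESP-SR AFE usage; check the ESP-SR headers for your version. */
        esp_afe_sr_iface_t *afe_handle = (esp_afe_sr_iface_t *)&ESP_AFE_SR_HANDLE;
        afe_config_t afe_config = AFE_CONFIG_DEFAULT();
        afe_config.wakenet_init = true;                  /* keep WakeNet enabled */
        esp_afe_sr_data_t *afe_data = afe_handle->create_from_config(&afe_config);

        int chunksize = afe_handle->get_feed_chunksize(afe_data);
        int16_t *buff = malloc(chunksize * TOTAL_CH_NUM * sizeof(int16_t));

        while (true) {
            /* 1. Fill `buff` with audio from the microphone (not shown here).   */
            /* 2. Feed the audio into the AFE, then fetch the processed result.  */
            afe_handle->feed(afe_data, buff);
            afe_fetch_result_t *res = afe_handle->fetch(afe_data);
            if (res && res->wakeup_state == WAKENET_DETECTED) {
                /* Wake word detected: hand control to the next processing stage. */
            }
        }
    }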
For the resource occupancy of this model, see Resource Occupancy.