WakeNet is a wake word engine built upon neural networks for low-power embedded MCUs. Currently, WakeNet supports up to 5 wake words.
Overview

Please see the flow diagram of WakeNet below:
We use the MFCC method to extract speech spectrum features. The input audio has a sample rate of 16 kHz, is mono, and is encoded as signed 16-bit PCM. Each frame has a window width and step size of 30 ms.
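As a quick illustration of these parameters, the sketch below computes the per-frame sample and byte counts implied by a 16 kHz, signed 16-bit, mono stream with 30 ms frames. The constants come straight from the text above; the helper itself is not part of WakeNet.

    #include <stdio.h>
    #include <stdint.h>

    /* Frame parameters stated in the text: 16 kHz, mono, signed 16-bit audio,
     * 30 ms window width and 30 ms step size (non-overlapping frames). */
    #define SAMPLE_RATE_HZ 16000
    #define FRAME_MS       30

    int main(void)
    {
        int samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS / 1000;     /* 480 samples */
        int bytes_per_frame   = samples_per_frame * sizeof(int16_t);  /* 960 bytes   */
        printf("samples per frame: %d, bytes per frame: %d\n",
               samples_per_frame, bytes_per_frame);
        return 0;
    }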
The neural network structure has now been updated to its ninth iteration:
WakeNet1, WakeNet2, WakeNet3, WakeNet4, WakeNet6, and WakeNet7 are no longer in use.
WakeNet5 only supports the ESP32 chip.
WakeNet8 and WakeNet9, which are built on a dilated convolution structure, only support the ESP32-S3 chip.
For a continuous audio stream, we calculate the average recognition result M over several frames to produce a smoothed prediction and improve the accuracy of keyword triggering. A trigger command is sent only when M exceeds the set threshold.
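The sketch below illustrates this smoothing step. The window length, threshold value, and function name are hypothetical placeholders for illustration; the actual values and logic are internal to the WakeNet model.

    #include <stdbool.h>

    /* Hypothetical smoothing window and threshold; the real values are model-internal. */
    #define SMOOTH_WINDOW     7
    #define TRIGGER_THRESHOLD 0.9f

    static float history[SMOOTH_WINDOW];   /* per-frame keyword scores (zero until filled) */
    static int   hist_pos = 0;

    /* Average the most recent per-frame keyword scores and report a trigger
     * only when the smoothed value M exceeds the threshold. */
    bool wakenet_smooth_and_trigger(float frame_score)
    {
        history[hist_pos] = frame_score;
        hist_pos = (hist_pos + 1) % SMOOTH_WINDOW;

        float m = 0.0f;
        for (int i = 0; i < SMOOTH_WINDOW; i++) {
            m += history[i];
        }
        m /= SMOOTH_WINDOW;

        return m > TRIGGER_THRESHOLD;   /* send the trigger command only when M exceeds the threshold */
    }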
Select WakeNet model
Run WakeNet
WakeNet is currently included in the AFE (Audio Front End) and is enabled by default; detection results are returned through the AFE fetch interface.
If users do not need WakeNet, please set:

    afe_config->wakenet_init = false;

If users want to enable/disable WakeNet temporarily, please use:

    afe_handle->disable_wakenet(afe_data);
    afe_handle->enable_wakenet(afe_data);
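For context, a minimal detection loop built on the AFE fetch interface might look like the sketch below. It assumes the ESP-SR AFE interface (names such as ESP_AFE_SR_HANDLE, AFE_CONFIG_DEFAULT, feed, fetch, and WAKENET_DETECTED follow typical ESP-SR usage and may differ between versions); the channel count is a placeholder and audio acquisition is left out.

    #include <stdlib.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include "esp_afe_sr_iface.h"
    #include "esp_afe_sr_models.h"

    /* Placeholder: total number of input channels (mic + reference) for your board. */
    #define TOTAL_CH_NUM 2

    void wakenet_task(void)
    {
        /* Assumed ESP-SR AFE usage; check the ESP-SR headers for your version. */
        esp_afe_sr_iface_t *afe_handle = (esp_afe_sr_iface_t *)&ESP_AFE_SR_HANDLE;
        afe_config_t afe_config = AFE_CONFIG_DEFAULT();
        afe_config.wakenet_init = true;                  /* keep WakeNet enabled */
        esp_afe_sr_data_t *afe_data = afe_handle->create_from_config(&afe_config);

        int chunksize = afe_handle->get_feed_chunksize(afe_data);
        int16_t *buff = malloc(chunksize * TOTAL_CH_NUM * sizeof(int16_t));

        while (true) {
            /* 1. Fill `buff` with audio from the microphone (not shown here).   */
            /* 2. Feed the audio into the AFE, then fetch the processed result.  */
            afe_handle->feed(afe_data, buff);
            afe_fetch_result_t *res = afe_handle->fetch(afe_data);
            if (res && res->wakeup_state == WAKENET_DETECTED) {
                /* Wake word detected: hand control to the next processing stage. */
            }
        }
    }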
For the resource occupancy of this model, see Resource Occupancy.