AudioMelSpectrogram—Wolfram Language Documentation

"AudioMelSpectrogram" (Net Encoder)

NetEncoder["AudioMelSpectrogram"]

represents an encoder that converts an audio file or object into its mel-frequency spectrogram.

NetEncoder[{"AudioMelSpectrogram","param"->val,…}]

represents an encoder with specific parameters for preprocessing and feature computation.

Examples

Basic Examples  (1)

Create a mel-spectrogram NetEncoder:

Create an Audio object:

Apply the encoder to the Audio object:

Plot the result:
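The four steps above can be sketched as follows. The input signal is an assumption (a one-second 440 Hz sine produced with AudioGenerator); any Audio object should behave the same way.

```wolfram
(* Create the encoder with default parameters *)
encoder = NetEncoder["AudioMelSpectrogram"];

(* Assumed test signal: one second of a 440 Hz sine *)
audio = AudioGenerator[{"Sin", 440}, 1];

(* Apply the encoder; the result is a numeric matrix with one row per partition *)
result = encoder[audio];
Dimensions[result]

(* Plot the matrix, transposed so that time runs along the horizontal axis *)
MatrixPlot[Transpose[result]]
```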

Scope  (3)

NetEncoder["AudioMelSpectrogram"] can encode either File or Audio objects. Create a mel-spectrogram encoder:

Apply the encoder to a File object:

Apply the encoder to an in-core Audio object:

Apply the encoder to an out-of-core Audio object:
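A possible way to exercise the three input types, assuming the sample file "ExampleData/rule30.wav" that ships with typical Wolfram Language installations is available (substitute any audio file path otherwise):

```wolfram
encoder = NetEncoder["AudioMelSpectrogram"];

(* Assumed sample file; FindFile resolves it to an absolute path *)
path = FindFile["ExampleData/rule30.wav"];

(* Encode directly from a File object *)
fromFile = encoder[File[path]];

(* Encode an Audio object that references the file on disk *)
fromLinkedAudio = encoder[Audio[path]];

(* Encode an Audio object generated in memory *)
fromMemory = encoder[AudioGenerator[{"Sin", 440}, 1]];

Dimensions /@ {fromFile, fromLinkedAudio, fromMemory}
```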

Create a list of Audio objects:

NetEncoder["AudioMelSpectrogram"] maps across a batch of inputs:
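As an illustration of batch encoding, the sketch below applies one encoder to a list of two synthetic signals of different durations (the signals are assumptions, chosen only to make the differing output lengths visible):

```wolfram
encoder = NetEncoder["AudioMelSpectrogram"];

(* Two assumed inputs with different durations *)
batch = {
   AudioGenerator[{"Sin", 440}, 0.5],
   AudioGenerator[{"Sin", 880}, 1]
   };

(* Applying the encoder to a list returns one matrix per input *)
results = encoder[batch];
Dimensions /@ results
```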

Create a mel-spectrogram NetEncoder:

Attach the encoder to the input of a net:

Apply the net to an Audio object:
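One way this could look, using a small recurrent net picked purely for illustration (the layer types and sizes are assumptions, not the network used in the original example):

```wolfram
encoder = NetEncoder["AudioMelSpectrogram"];

(* Illustrative net: a recurrent layer over the sequence of mel vectors,
   followed by taking the last element of the sequence *)
net = NetInitialize[
   NetChain[{GatedRecurrentLayer[8], SequenceLastLayer[]},
    "Input" -> encoder]
   ];

(* The attached encoder lets the net accept an Audio object directly *)
output = net[AudioGenerator[{"Sin", 440}, 1]];
Length[output]
```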

Parameters  (9)

"Normalization"  (1)

Create an Audio object:

Use an encoder with "Normalization"->None to avoid any normalization:

Since the normalization is applied to the signal before the spectrogram is computed, there are no guarantees on the bounds of the result:

Use an encoder with "Normalization"->Automatic to normalize the maximum absolute value of the waveform samples to 1:

Find the minimum and maximum value of the result:
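A short sketch comparing the two settings. The input is an assumed quiet sine (attenuated with AudioAmplify) so that the effect of waveform normalization is visible; no particular output range is implied.

```wolfram
(* Assumed input: a sine scaled down to one tenth of full amplitude *)
audio = AudioAmplify[AudioGenerator[{"Sin", 440}, 1], 0.1];

(* No normalization: the waveform is used as is *)
raw = NetEncoder[{"AudioMelSpectrogram", "Normalization" -> None}][audio];

(* Automatic: the waveform is rescaled before the spectrogram is computed *)
normalized = NetEncoder[{"AudioMelSpectrogram", "Normalization" -> Automatic}][audio];

(* Minimum and maximum of each result *)
{Min[raw], Max[raw]}
{Min[normalized], Max[normalized]}
```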

"SampleRate"  (2)

Create an Audio object:

Using an encoder with "SampleRate"->8000 resamples the signal to 8000Hz before performing the short-time Fourier transform:

The "SampleRate" parameter affects the computation of the default window size:

An encoder with a lower sample rate than the original audio will result in a shorter window length:

An encoder with a higher sample rate than the original audio will result in a longer window length:
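The effect can be explored with a sketch like the one below. The 16000 Hz input and the two encoder rates are assumptions chosen for illustration; the last line only shows what 25 ms amounts to in samples at each rate, since the exact rounding used by the encoder is not stated here.

```wolfram
(* Assumed input sampled at 16000 Hz *)
audio = AudioGenerator[{"Sin", 440}, 1, SampleRate -> 16000];

(* One encoder below and one above the input sample rate *)
low = NetEncoder[{"AudioMelSpectrogram", "SampleRate" -> 8000}];
high = NetEncoder[{"AudioMelSpectrogram", "SampleRate" -> 32000}];

(* Output dimensions for each encoder *)
Dimensions /@ {low[audio], high[audio]}

(* 25 ms expressed in samples at each rate *)
windowSamples = 0.025*{8000, 32000}
```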

"TargetLength"  (1)

Create an Audio object:

Using an encoder with "TargetLength"->All returns the mel-spectrogram for all the data:

Using an encoder with "TargetLength"->10 zero-pads the output to be of length 10:

Using an encoder with "TargetLength"->2 takes only the first two partitions:
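The three settings can be compared side by side. This sketch assumes a very short synthetic input (0.05 seconds), chosen so that the full result is likely to have fewer than 10 partitions:

```wolfram
(* Assumed short input *)
audio = AudioGenerator[{"Sin", 440}, 0.05];

(* All partitions *)
full = NetEncoder[{"AudioMelSpectrogram", "TargetLength" -> All}][audio];

(* Fixed length of 10 partitions *)
padded = NetEncoder[{"AudioMelSpectrogram", "TargetLength" -> 10}][audio];

(* Only the first two partitions *)
truncated = NetEncoder[{"AudioMelSpectrogram", "TargetLength" -> 2}][audio];

(* The first dimension is the number of partitions *)
First /@ (Dimensions /@ {full, padded, truncated})
```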

"WindowSize"  (1)

The partition length is automatically computed to be 25ms:

Using an encoder with "WindowSize"->600 returns the mel-spectrogram using partitions of 600 samples:
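A minimal comparison of the automatic and explicit settings, using an assumed one-second sine as input. Because the default offset is derived from the partition length, the number of rows may differ between the two results.

```wolfram
audio = AudioGenerator[{"Sin", 440}, 1];

(* Automatic partition length *)
automatic = NetEncoder["AudioMelSpectrogram"][audio];

(* Explicit partition length of 600 samples *)
explicit = NetEncoder[{"AudioMelSpectrogram", "WindowSize" -> 600}][audio];

Dimensions /@ {automatic, explicit}
```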

"Offset"  (1)

Create an Audio object:

The partition offset is automatically computed to be 1/3 of the partition length:

Using an encoder with "Offset"->10 returns the mel-spectrogram computed using partitions with an offset of 10 samples:
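As a rough illustration with an assumed one-second sine, a smaller offset should produce many more partitions than the automatic setting:

```wolfram
audio = AudioGenerator[{"Sin", 440}, 1];

(* Automatic offset *)
automatic = NetEncoder["AudioMelSpectrogram"][audio];

(* Offset of 10 samples between consecutive partitions *)
dense = NetEncoder[{"AudioMelSpectrogram", "Offset" -> 10}][audio];

(* Number of partitions in each result *)
First /@ (Dimensions /@ {automatic, dense})
```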

"MinimumFrequency"  (1)

Create an Audio object:

The minimum frequency is automatically computed to be Ceiling[sr/ws], where sr is the sample rate "SampleRate" and ws is the partition length "WindowSize":

Using an encoder with "MinimumFrequency"->2000 returns the mel-spectrogram computed using filters whose minimum frequency is 2000Hz:
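To make the default formula concrete, the sketch below evaluates it for assumed values of sr = 16000 and ws = 400 (these numbers are illustrative, not read from the encoder) and then builds an encoder with an explicit lower bound:

```wolfram
(* Assumed values for illustration only *)
sr = 16000; ws = 400;

(* Default minimum frequency according to the formula above *)
defaultMin = Ceiling[sr/ws]   (* evaluates to 40 for these values *)

(* Explicit minimum frequency of 2000 Hz *)
audio = AudioGenerator[{"Sin", 440}, 1];
result = NetEncoder[{"AudioMelSpectrogram", "MinimumFrequency" -> 2000}][audio];
Dimensions[result]
```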

"MaximumFrequency"  (1)

Create an Audio object:

The maximum frequency is automatically computed to be Round[Min[8000,sr/2]], where sr is the sample rate "SampleRate":

Using an encoder with "MaximumFrequency"->2000 returns the mel-spectrogram computed using filters whose maximum frequency is 2000Hz:
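The default formula can be checked by hand for a few sample rates (the rates below are illustrative assumptions), followed by an encoder with an explicit upper bound:

```wolfram
(* Default maximum frequency for a few assumed sample rates *)
defaultMax = Table[Round[Min[8000, sr/2]], {sr, {8000, 16000, 44100}}]
(* evaluates to {4000, 8000, 8000} *)

(* Explicit maximum frequency of 2000 Hz *)
audio = AudioGenerator[{"Sin", 440}, 1];
result = NetEncoder[{"AudioMelSpectrogram", "MaximumFrequency" -> 2000}][audio];
Dimensions[result]
```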

"NumberOfFilters"  (1)

Create an Audio object:

By default, 40 filters are used for the computation of the mel-spectrogram:

Using an encoder with "NumberOfFilters"->128 returns the mel-spectrogram computed using 128 filters:
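A brief sketch of the difference, with an assumed one-second sine as input. The number of filters is expected to show up as the second dimension of the result.

```wolfram
audio = AudioGenerator[{"Sin", 440}, 1];

(* Default number of filters *)
default = NetEncoder["AudioMelSpectrogram"][audio];

(* 128 filters *)
wide = NetEncoder[{"AudioMelSpectrogram", "NumberOfFilters" -> 128}][audio];

(* The second dimension is the number of filters *)
Last /@ (Dimensions /@ {default, wide})
```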

Properties & Relations  (1)

Create an Audio object:

Create a mel-spectrogram NetEncoder:

The length of the result can be computed as Ceiling[length/offset], where length is the length of the signal after resampling and offset is the "Offset" parameter of the encoder:
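The relation can be checked with a sketch like the following. All parameter values here are assumptions picked for easy arithmetic; the input is generated at the encoder's sample rate so that resampling does not change its length.

```wolfram
(* Assumed input: one second at 16000 Hz *)
audio = AudioGenerator[{"Sin", 440}, 1, SampleRate -> 16000];

(* Encoder with explicit parameters, so the check does not rely on defaults *)
offset = 100;
encoder = NetEncoder[{"AudioMelSpectrogram",
    "SampleRate" -> 16000, "WindowSize" -> 400, "Offset" -> offset}];

(* Length predicted by the formula above *)
predicted = Ceiling[AudioLength[audio]/offset]

(* Actual number of partitions returned by the encoder, for comparison *)
actual = First[Dimensions[encoder[audio]]]
```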

