Audio Toolbox
Design and analyze speech, acoustic, and audio processing systems
Audio Toolbox™ provides tools for audio processing, speech analysis, and acoustic measurement. It includes algorithms for processing audio signals such as equalization and time stretching, estimating acoustic signal metrics such as loudness and sharpness, and extracting audio features such as MFCC and pitch. It also provides advanced machine learning models, including i-vectors, and pretrained deep learning networks, including VGGish and CREPE. Toolbox apps support live algorithm testing, impulse response measurement, and signal labeling. The toolbox provides streaming interfaces to ASIO, CoreAudio, and other sound cards; MIDI devices; and tools for generating and hosting VST and Audio Units plugins.
With Audio Toolbox you can import, label, and augment audio data sets, as well as extract features to train machine learning and deep learning models. The provided pretrained models can be applied to audio recordings for high-level semantic analysis.
You can prototype audio processing algorithms in real time or run custom acoustic measurements by streaming low-latency audio to and from sound cards. You can validate your algorithm by turning it into an audio plugin to run in external host applications such as Digital Audio Workstations. Plugin hosting lets you use external audio plugins as regular MATLAB® objects.
Read and write audio samples from and to sound cards (such as USB or Thunderbolt™) using standard audio drivers (such as ASIO, WASAPI, CoreAudio, and ALSA) across Windows®, Mac®, and Linux® operating systems.
Process live audio in MATLAB with milliseconds of round-trip latency.
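A minimal sketch of this streaming workflow, assuming a default full-duplex audio device is available; the sample rate, frame size, and run time shown are examples.

    % Low-latency pass-through: read frames from the sound card and play them back.
    deviceReader = audioDeviceReader('SampleRate',48000,'SamplesPerFrame',256);
    deviceWriter = audioDeviceWriter('SampleRate',48000);

    tic
    while toc < 10                      % stream for roughly ten seconds
        audioIn = deviceReader();       % acquire one frame from the sound card
        deviceWriter(audioIn);          % send it straight back out
    end

    release(deviceReader)
    release(deviceWriter)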
Use deep learning to carry out complex signal processing tasks and extract audio embeddings with a single line of code. Access established pretrained networks like YAMNet, VGGish, CREPE, and OpenL3 and apply them with the help of preconfigured feature extraction functions.
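A sketch of the single-call workflow using classifySound (YAMNet) and vggishEmbeddings; the file name is a placeholder, and the pretrained weights may need to be downloaded on first use.

    % Classify sounds and extract embeddings with pretrained networks.
    [audioIn, fs] = audioread('yourRecording.wav');   % placeholder file name
    sounds = classifySound(audioIn, fs)               % sound classes detected by YAMNet
    embeddings = vggishEmbeddings(audioIn, fs);       % one VGGish embedding per analysis frame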
Transform signals into time-frequency representations like Mel, Bark, and ERB spectrograms. Compute cepstral coefficients such as MFCC and GTCC, and scalar features such as pitch, harmonicity, and spectral descriptors. Extract high-level features and signal embeddings using pretrained deep learning models (VGGish, OpenL3) and the i-vector system. Accelerate feature extraction with compatible GPU cards.
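A sketch of batch feature extraction with audioFeatureExtractor; the file name, window, and the particular set of requested features are examples.

    % Configure a single extractor object for several features at once.
    [audioIn, fs] = audioread('yourRecording.wav');   % placeholder file name
    afe = audioFeatureExtractor( ...
        'SampleRate',fs, ...
        'Window',hann(1024,'periodic'), ...
        'OverlapLength',512, ...
        'mfcc',true, ...
        'gtcc',true, ...
        'pitch',true, ...
        'spectralCentroid',true);
    features = extract(afe, audioIn);   % one row of features per analysis frame
    idx = info(afe)                     % column indices of each requested feature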
Train state-of-the-art machine learning models on your audio data sets. Use established model pipelines, such as the i-vector system, for applications like speaker identification and verification. Learn from working examples how to design and train advanced neural networks and layers for audio, speech, and acoustics applications.
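A sketch of the speaker identification workflow built on ivectorSystem (trainExtractor, trainClassifier, enroll, identify); the folder names and labels are placeholders, default training options are used, and exact options may differ by release.

    % Train, enroll, and identify speakers with the i-vector system.
    ads = audioDatastore('speakerData','IncludeSubfolders',true,'LabelSource','foldernames');
    ivs = ivectorSystem('SampleRate',16000);
    trainExtractor(ivs, ads);                 % learn the UBM and total variability space
    trainClassifier(ivs, ads, ads.Labels);    % train the scoring backend
    enroll(ivs, ads, ads.Labels);             % enroll the known speakers
    [audioIn, ~] = audioread('unknownSpeaker.wav');   % placeholder test recording at 16 kHz
    result = identify(ivs, audioIn)           % candidate speakers and scores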
Read, partition, and preprocess large collections of audio recordings. Annotate audio signals manually with apps. Identify and segment regions of interest automatically using pretrained machine learning models.
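A sketch that iterates over a folder of recordings with audioDatastore and segments speech regions with detectSpeech; the folder name is a placeholder.

    % Read each file in turn and locate speech regions automatically.
    ads = audioDatastore('recordings','IncludeSubfolders',true);   % placeholder folder
    while hasdata(ads)
        [audioIn, fileInfo] = read(ads);
        roi = detectSpeech(audioIn, fileInfo.SampleRate);   % [start end] sample indices of speech
        % ... trim, label, or store the detected regions here
    end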
Set up randomized data augmentation pipelines using combinations of pitch shifting, time stretching, and other audio processing effects. Create synthetic speech recordings from text using text-to-speech cloud-based services.
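A sketch of a randomized augmentation pipeline; the property names follow the audioDataAugmenter interface, and the probabilities, augmentation count, and file name are examples.

    % Generate several randomized variants of a recording.
    augmenter = audioDataAugmenter( ...
        'AugmentationMode','sequential', ...
        'NumAugmentations',5, ...
        'TimeStretchProbability',0.5, ...
        'PitchShiftProbability',0.5, ...
        'AddNoiseProbability',0.3);
    [audioIn, fs] = audioread('yourRecording.wav');   % placeholder file name
    data = augment(augmenter, audioIn, fs);           % table of augmented signals and parameters used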
Model and apply parametric EQ, graphic EQ, shelving, and variable-slope filters. Design and simulate digital crossover, octave, and fractional-octave filters.
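A sketch of an octave-band graphic EQ and a two-way crossover; the gains, crossover frequency, and file name are examples.

    % Equalize a recording and split it into low and high bands.
    [audioIn, fs] = audioread('yourRecording.wav');   % placeholder file name
    eq = graphicEQ('Bandwidth','1 octave','SampleRate',fs);
    eq.Gains = [4 3 2 1 0 0 -1 -2 -3 -4];             % one gain (dB) per octave band
    equalized = eq(audioIn);

    xover = crossoverFilter('NumCrossovers',1,'CrossoverFrequencies',1000,'SampleRate',fs);
    [lowBand, highBand] = xover(audioIn);             % low and high crossover bands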
Model and apply dynamic range processing algorithms such as compressor, limiter, expander, and noise gate. Add artificial reverberation with recursive parametric models.
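A sketch chaining a compressor and a reverberator; the threshold, ratio, and reverb settings are examples, and the file name is a placeholder.

    % Compress the dynamic range, then add artificial reverberation.
    [audioIn, fs] = audioread('yourRecording.wav');   % placeholder file name
    comp = compressor('Threshold',-20,'Ratio',4,'SampleRate',fs);
    rev  = reverberator('PreDelay',0.02,'WetDryMix',0.3,'SampleRate',fs);
    compressed  = comp(audioIn);
    reverberant = rev(compressed);                    % reverberator output is stereo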
Design and simulate system models using libraries of audio processing blocks for Simulink®. Tune parameters and visualize system behavior using interactive controls and dynamic plots.
Automatically create user interfaces for tunable parameters of audio processing algorithms. Test individual algorithms with the Audio Test Bench app and tune parameters in running programs with auto-generated interactive controls.
Interactively change parameters of MATLAB algorithms by using MIDI control surfaces. Control external hardware or respond to events by sending and receiving any type of MIDI message.
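A sketch of sending and receiving raw MIDI messages; the device name is a placeholder to be copied from the mididevinfo listing on your system.

    % List MIDI devices, then exchange messages with one of them.
    mididevinfo                                       % list the available MIDI devices
    device = mididevice('YourMIDIDeviceName');        % placeholder device name
    midisend(device, midimsg('NoteOn',1,60,64));      % channel 1, middle C, velocity 64
    msgs = midireceive(device);                       % collect any queued incoming messages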
Apply sound pressure level (SPL) meters and loudness meters to recorded or live signals. Analyze signals with octave and fractional-octave filters. Apply standard-compliant A-, C-, or K-weighting filters to raw recordings. Measure acoustic sharpness, roughness, and fluctuation strength.
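A sketch of weighting, SPL metering, and psychoacoustic sharpness; the file name is a placeholder, and the recording is assumed to be calibrated so that absolute levels are meaningful.

    % A-weight a recording, meter its level, and compute sharpness.
    [audioIn, fs] = audioread('yourRecording.wav');   % placeholder file name
    weightFilt = weightingFilter('Method','A-weighting','SampleRate',fs);
    weighted = weightFilt(audioIn);

    meter = splMeter('SampleRate',fs);
    [Lt, Leq] = meter(audioIn);                       % time-weighted and equivalent-continuous levels

    sharp = acousticSharpness(audioIn, fs);           % psychoacoustic sharpness in acum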
Measure impulse and frequency responses of acoustic and audio systems with maximum-length sequences (MLS) and exponential swept sinusoids (ESS). Get started with the Impulse Response Measurer app. Automate measurements by programmatically generating excitation signals and estimating system responses.
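A sketch of the programmatic ESS measurement workflow using sweeptone, audioPlayerRecorder, and impzest; it assumes a full-duplex audio device, and the durations are examples.

    % Measure an impulse response with an exponential swept sine.
    fs = 48000;
    excitation = sweeptone(3, 2, fs);              % 3 s sweep followed by 2 s of silence
    playRec = audioPlayerRecorder('SampleRate',fs);
    recorded = playRec(excitation);                % play the sweep while recording the response
    release(playRec)
    ir = impzest(excitation, recorded);            % estimate the impulse response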
Convolve signals with long impulse responses efficiently using frequency domain overlap-and-add or overlap-and-save implementations. Trade off latency for computational speed using automatic impulse response partitioning.
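A sketch of partitioned frequency-domain convolution; property names follow the dsp.FrequencyDomainFIRFilter interface (DSP System Toolbox, a required product), and the file names and partition length are placeholders.

    % Convolve a recording with a long measured impulse response at low latency.
    [ir, ~]      = audioread('measuredIR.wav');       % placeholder impulse response file
    [audioIn, ~] = audioread('yourRecording.wav');    % placeholder input file
    fdConv = dsp.FrequencyDomainFIRFilter('Numerator',ir(:,1).', ...
        'Method','overlap-save', ...
        'PartitionForReducedLatency',true, ...
        'PartitionLength',256);
    out = fdConv(audioIn);                            % partitioned overlap-save convolution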
Encode and decode different ambisonic formats. Interpolate spatially sampled head-related transfer functions (HRTF).
Generate VST plugins, AU plugins, and standalone executable plugins directly from MATLAB code without requiring manual design of user interfaces. For more advanced plugin prototyping, generate ready-to-build JUCE C++ projects (requires MATLAB Coder™).
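A minimal audioPlugin class to illustrate the workflow (save as GainPlugin.m); the class name, parameter, and mapping range are illustrative.

    % A one-parameter gain plugin defined in MATLAB.
    classdef GainPlugin < audioPlugin
        properties
            Gain = 1;                           % tunable parameter exposed to the host
        end
        properties (Constant)
            PluginInterface = audioPluginInterface( ...
                audioPluginParameter('Gain','Mapping',{'lin',0,4}));
        end
        methods
            function out = process(plugin, in)
                out = in * plugin.Gain;         % apply the gain to each input frame
            end
        end
    end

    % From the MATLAB command line, build a plugin binary for the host platform:
    %   generateAudioPlugin GainPlugin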
Use external VST and AU plugins as regular MATLAB objects. Change plugin parameters and programmatically process MATLAB arrays. Alternatively, automate associations of plugin parameters with user interfaces and MIDI controls. Host plugins generated from your MATLAB code for increased execution efficiency.
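A sketch of plugin hosting with loadAudioPlugin; the plugin path and file name are placeholders for a plugin installed on your system.

    % Load an external plugin and process a MATLAB array through it.
    hostedPlugin = loadAudioPlugin('path/to/YourPlugin.vst');   % placeholder path
    disp(hostedPlugin)                           % list the plugin's tunable parameters
    [audioIn, fs] = audioread('yourRecording.wav');
    out = process(hostedPlugin, audioIn);        % run the audio through the hosted plugin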
With MathWorks® coder products, generate C and C++ source code from signal processing and machine learning algorithms provided as toolbox functions, objects, and blocks. Generate CUDA source code from select feature extraction functions like mfcc and melSpectrogram.
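A sketch of the code generation step for a feature extraction function (save as extractMFCC.m); the function name, input sizes, and configuration are illustrative.

    % Wrap the feature extraction in a function that supports code generation.
    function coeffs = extractMFCC(audioIn, fs)  %#codegen
        coeffs = mfcc(audioIn, fs);
    end

    % Then generate a C library with MATLAB Coder:
    %   codegen extractMFCC -args {zeros(48000,1), 48000} -config:lib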
Prototype audio processing designs on Raspberry Pi™ by using on-board or external multichannel audio interfaces. Create interactive control panels as mobile apps for Android® or iOS devices.
Prototype audio processing designs with single-sample inputs and outputs for adaptive noise control, hearing aid validation, or other applications requiring minimum round-trip DSP latency. Automatically target Speedgoat audio machines and ST Discovery boards directly from Simulink models.