Speech Transcription and Synthesis

Use third-party APIs for text-to-speech and speech-to-text

Audio Toolbox™ provides examples for small-vocabulary recognition and sound synthesis. For general-purpose text-to-speech and speech-to-text, Audio Toolbox provides interfaces to popular third-party APIs, including Google® Speech, IBM® Watson Speech, and Microsoft® Azure Speech. To use these interfaces, you must download the Audio Toolbox extended functionality for text2speech and speech2text from File Exchange.

Once you install the speech-to-text functionality, you can interact with it graphically in the Audio Labeler app to quickly label regions of speech.


Audio Labeler: Define and visualize ground-truth labels


Label Audio Using Audio Labeler

Interactively define and visualize ground-truth labels for audio datasets.

Featured Examples