Audio Toolbox™ provides examples for small-vocabulary recognition and sound synthesis. For general text-to-speech and speech-to-text, Audio Toolbox provides interfaces to popular third-party APIs. Supported APIs include Google® Speech, IBM® Watson Speech, and Microsoft® Azure Speech. To use these interfaces, you must download the Audio Toolbox extended functionality for text2speech and speech2text from File Exchange.
Once you install the speech-to-text functionality, you can interact with it graphically in the Audio Labeler app to quickly label regions of speech.
Audio Labeler | Define and visualize ground-truth labels
Label Audio Using Audio Labeler
Interactively define and visualize ground-truth labels for audio datasets.