Audio Toolbox™ provides the pretrained VGGish and YAMNet networks. Use
the vggish and yamnet functions to interact directly with the
pretrained networks. The classifySound function performs the required
preprocessing and postprocessing for YAMNet so that you can locate and
classify sounds into one of 521 categories. You can explore the YAMNet
ontology using the yamnetGraph function. The vggishFeatures function
performs the necessary preprocessing and postprocessing for VGGish so
that you can extract feature embeddings to input to machine learning
and deep learning systems.
This functionality requires Deep Learning Toolbox™.
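
The following is a minimal sketch of how the two convenience functions might be called. It assumes that Audio Toolbox and Deep Learning Toolbox are installed and that the pretrained YAMNet and VGGish models have already been downloaded. The synthetic noise signal, the 16 kHz sample rate, and the variable names are illustrative assumptions, not part of the original text. In practice you would load a recording with audioread instead.

```matlab
% Illustrative input: 3 seconds of low-level noise (assumed test signal).
fs = 16e3;                      % sample rate in Hz, chosen for this sketch
audioIn = 0.1*randn(3*fs,1);    % mono signal as a column vector

% Locate and classify sounds with YAMNet. classifySound handles the
% preprocessing and postprocessing and returns the detected sound labels
% together with the time regions in which they occur.
[sounds,timestamps] = classifySound(audioIn,fs);

% Extract VGGish feature embeddings. Each row of the output describes
% one analysis window of the input signal.
embeddings = vggishFeatures(audioIn,fs);

% Inspect the size of the embedding matrix.
disp(size(embeddings))
```

The embeddings can then be passed to a classifier or another downstream model, while sounds and timestamps can be used directly to label regions of the recording.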