Code covered by the BSD License  

Manual Audio Annotation
by Theodoros Giannakopoulos

Audio Annotation Demo


General Information

Name: Audio Annotation Demo
Version: 1.0
Release Date: September 2007
Implemented by: Theodoros Giannakopoulos
Institution: Dept. of Informatics and Telecommunications, University of Athens, Greece
Contact: tyiannak@di.uoa.gr

This demo version is provided only for educational purposes without any warranty. Your feedback will help us improve the system in future versions.

MORE INFO: www.di.uoa.gr/~tyiannak


Documentation

1. Introduction

AudioAnnotation Demo v.1.0 is an open source demo implemented in Matlab(R) for manual segmentation and annotation of audio files. It can also calculate and plot basic audio features (e.g. short-time energy, zero-crossing rate) of the selected audio segments. To launch the GUI, run AudioAnnotationMain.m in Matlab.

2. GUI Description

The following main areas are defined in the GUI:

File Info: Loads .wav files and shows the file path and other audio information (e.g. sampling rate).

Current Segment: Shows the time limits of the current segment, with a button for playing the segment and a volume control.

Labelling: Selection of the current segment's label and a button for updating the annotation file.

Important note: The class selection combo box has 11 different class names. To set your own class names, do NOT edit the combo box manually through Matlab's GUIDE; instead, change handles.classNames (line 68, AudioAnnotationMain.m).

Feature Extraction: Feature and Statistic combo boxes and a button for computing and plotting the selected feature.

Signal Browsing: Plot of the selected segment's signal, with time and window sliders.

Class Distribution: Histogram of the current class distribution.

[Figure: GUI example]

3. File I/O

Input: The program reads only .wav files. To load an audio stream stored in a .wav file, press the "Open Wav File" button.

Output: The manual annotation data are stored in two kinds of files: a) an XML file, according to the MPEG-7 annotation standard, and b) a .mat (Matlab binary) file. The .mat file stores a variable called flagsReal: an array that contains one class number (e.g. 1 for music, 2 for speech) for each 100 ms window, with 0 marking unclassified windows. Both files are updated automatically each time the "Update Annotation File" button is pressed. You can use these files as ground truth to check the performance of your own segmentation-classification algorithm.
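The sketch below illustrates one way such a per-window label array could be used as ground truth. It is written in Python for illustration only and is not part of the demo's Matlab code. It assumes the flagsReal values have already been loaded into a plain list of integers; the function names (flags_to_segments, frame_accuracy) and the example label values are hypothetical. The accuracy measure shown, a simple fraction of matching windows that skips unclassified ones, is just one possible choice.

```python
WINDOW_SEC = 0.1  # one label per 100 ms window, as described above


def flags_to_segments(flags, window_sec=WINDOW_SEC):
    """Merge runs of identical labels into (start_sec, end_sec, label) tuples."""
    segments = []
    start = 0
    for i in range(1, len(flags) + 1):
        # Close the current run at the end of the list or when the label changes.
        if i == len(flags) or flags[i] != flags[start]:
            segments.append(
                (round(start * window_sec, 3), round(i * window_sec, 3), flags[start])
            )
            start = i
    return segments


def frame_accuracy(truth, predicted, unclassified=0):
    """Fraction of annotated windows whose predicted label matches the truth.

    Windows labelled `unclassified` in the ground truth are skipped.
    """
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    scored = [(t, p) for t, p in zip(truth, predicted) if t != unclassified]
    if not scored:
        return float("nan")  # nothing was annotated, so accuracy is undefined
    return sum(t == p for t, p in scored) / len(scored)


# Hypothetical example data: 8 windows (0.8 s) of annotation and predictions.
flags_real = [1, 1, 1, 2, 2, 0, 0, 1]
predicted = [1, 1, 2, 2, 2, 1, 1, 1]

for start_sec, end_sec, label in flags_to_segments(flags_real):
    print(f"{start_sec:.1f}-{end_sec:.1f} s: class {label}")

print(f"accuracy on annotated windows: {frame_accuracy(flags_real, predicted):.3f}")
```

Skipping the windows labelled 0 keeps unannotated regions from counting for or against the algorithm under test.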

4. Feature Calculation

You can choose a feature sequence (Energy or Zero Crossing Rate) and the respective statistic. When you press the "Feature" button, the feature sequence (and statistic) is calculated and plotted for the currently selected signal area.
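For readers unfamiliar with these two features, here is a minimal Python sketch of how short-time energy and zero-crossing rate are commonly computed on short windows, followed by a statistic over the resulting sequence. It is an illustration under assumptions, not the demo's Matlab implementation: the window length, step, normalization and the choice of statistics are not specified in this document, and all function names here are hypothetical.

```python
import math
import statistics


def frames(signal, win_len, step):
    """Yield consecutive windows of win_len samples, advancing by step samples."""
    if win_len < 2 or step < 1:
        raise ValueError("win_len must be >= 2 and step must be >= 1")
    for start in range(0, len(signal) - win_len + 1, step):
        yield signal[start:start + win_len]


def short_time_energy(frame):
    """Mean squared amplitude of one window."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Assumed example input: a 1 s, 440 Hz tone sampled at 8 kHz.
fs = 8000
signal = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]

win_len = int(0.02 * fs)  # 20 ms windows (an assumed value)
step = int(0.01 * fs)     # 10 ms step (an assumed value)

energy_seq = [short_time_energy(f) for f in frames(signal, win_len, step)]
zcr_seq = [zero_crossing_rate(f) for f in frames(signal, win_len, step)]

# A "statistic" summarizes the feature sequence over the selected area.
print("energy mean:", statistics.mean(energy_seq))
print("energy std: ", statistics.stdev(energy_seq))
print("ZCR mean:   ", statistics.mean(zcr_seq))
```

Each feature yields one value per window, so the plotted sequence has as many points as there are windows in the selected signal area.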

5. Future Work

The present version is subject to improvement. In particular, future versions are expected to add more audio features, better GUI elements and possibly a semi-automatic approach to the annotation process. For any ideas or suggestions, please contact me at tyiannak@di.uoa.gr.
