Code covered by the BSD License  

Rating: 5.0 (10 ratings) | 277 downloads (last 30 days) | File size: 307 KB | File ID: #32849 | Version: 1.2
HTK MFCC MATLAB

by

 

11 Sep 2011 (Updated )

Mel frequency cepstral coefficient feature extraction that closely matches that of HTK's HCopy.

File Information
Description

Computes mel frequency cepstral coefficient (MFCC) features from a given speech signal. The speech signal is first preemphasised using a first-order FIR filter with a given preemphasis coefficient. The preemphasised signal is then subjected to short-time Fourier transform analysis with a specified frame duration, frame shift and analysis window function. The magnitude spectrum is computed for each frame, and a filterbank of M triangular filters, uniformly spaced on the mel scale between the lower and upper frequency limits, is designed. The filterbank is applied to the magnitude spectrum values to produce filterbank energies (FBEs). Log-compressed FBEs are then decorrelated using the discrete cosine transform to produce cepstral coefficients. The final step applies a sinusoidal lifter to produce liftered MFCCs that closely match those produced by HTK. Demo scripts are included.
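
The steps described above can be sketched in plain MATLAB as follows. This is a minimal illustration under stated assumptions, not the code shipped in this submission: the synthetic test signal, the parameter values, the explicit Hamming window expression and the helper names (hz2mel, mel2hz, H, dctm, lifter) are all chosen for the example. The mel, DCT and lifter formulas are the ones commonly associated with HTK, and the triangles here are drawn linearly in Hz between mel-spaced edges, which may differ in detail from the submission's filterbank.

```matlab
% Illustrative MFCC sketch (assumed parameter values, synthetic input)
fs    = 16000;                 % sampling frequency (Hz)
Tw    = 25;  Ts = 10;          % frame duration and frame shift (ms)
alpha = 0.97;                  % preemphasis coefficient
LF    = 300; HF = 3700;        % filterbank frequency limits (Hz)
M     = 20;                    % number of triangular filters
N     = 13;                    % number of cepstral coefficients (incl. 0th)
L     = 22;                    % liftering parameter

% Synthetic stand-in for a speech signal: two tones, 1 second long
t      = (0:fs-1)' / fs;
speech = sin(2*pi*440*t) + 0.5*sin(2*pi*1200*t);

% 1) Preemphasis with a first-order FIR filter: y[n] = x[n] - alpha*x[n-1]
speech = filter([1 -alpha], 1, speech);

% 2) Framing (frames as columns) and Hamming windowing
Nw      = round(1e-3 * Tw * fs);           % frame length in samples
Ns      = round(1e-3 * Ts * fs);           % frame shift in samples
Nframes = 1 + floor((length(speech) - Nw) / Ns);
idx     = bsxfun(@plus, (1:Nw)', (0:Nframes-1) * Ns);
win     = 0.54 - 0.46 * cos(2*pi*(0:Nw-1)' / (Nw-1));
frames  = bsxfun(@times, speech(idx), win);

% 3) Magnitude spectrum (unique half, 0 .. fs/2)
nfft = 2^nextpow2(Nw);
K    = nfft/2 + 1;
MAG  = abs(fft(frames, nfft, 1));
MAG  = MAG(1:K, :);

% 4) Triangular filterbank, edges uniformly spaced on the mel scale
hz2mel = @(hz)  1127 * log(1 + hz/700);
mel2hz = @(mel) 700 * (exp(mel/1127) - 1);
f = linspace(0, fs/2, K);                            % FFT bin frequencies (Hz)
c = mel2hz(linspace(hz2mel(LF), hz2mel(HF), M+2));   % filter edge frequencies (Hz)
H = zeros(M, K);
for m = 1:M
    rising  = (f - c(m))   / (c(m+1) - c(m));
    falling = (c(m+2) - f) / (c(m+2) - c(m+1));
    H(m,:)  = max(0, min(rising, falling));
end

% 5) Filterbank energies, log compression, DCT
FBEs = H * MAG;
dctm = sqrt(2/M) * cos((0:N-1)' * (pi * ((1:M) - 0.5) / M));
CCs  = dctm * log(FBEs);

% 6) Sinusoidal liftering
lifter = 1 + 0.5 * L * sin(pi * (0:N-1)' / L);
MFCCs  = bsxfun(@times, CCs, lifter);
```

Each column of MFCCs then holds the N liftered coefficients for one frame. Because each filter runs from the previous edge to the next one, neighbouring triangles in this sketch overlap by half of their width on the mel scale.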

Acknowledgements

Triangular Filterbank, File I/O For Cell Arrays, and Framing Routines inspired this file.

Required Products: Signal Processing Toolbox
MATLAB release: MATLAB 7.10 (R2010a)
Other requirements: HTK, RASTAMAT, VOICEBOX
Comments and Ratings (15)
29 Jul 2015 Kamil Wojcicki

Yibo, the overlap used is as defined in:

Huang, X., Acero, A., Hon, H., 2001. Spoken Language Processing: A guide to theory, algorithm, and system development. Prentice Hall, Upper Saddle River, NJ, USA (pp. 314-315).

Comment only
29 Jul 2015 Kamil Wojcicki

Olessya, what is the dimensionality of your input vector? i.e., what is size(speech)? It must be a vector and not a matrix.
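
A quick check along these lines can be run before calling mfcc. It is only a sketch: the two-channel matrix below is a made-up stand-in for a recording loaded with one channel per column, and averaging the channels is one assumed way of reducing it to mono.

```matlab
% Hypothetical stereo recording: 1000 samples, 2 channels (one per column)
speech = [sin(2*pi*(0:999)'/50), cos(2*pi*(0:999)'/50)];

disp(size(speech));            % inspect the dimensionality first

if ~isvector(speech)
    speech = mean(speech, 2);  % average the channels to obtain a mono signal
end
speech = speech(:);            % force a column vector
```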

Comment only
28 Jul 2015 Olessya Medvedeva

Hi, I am trying to use your code, but it gives me the following usage error:
[ MFCCs, FBEs, frames ] = ...
mfcc( speech, fs, Tw, Ts, alpha, @hamming, [LF HF], M, C+1, L );
Error using vec2frames (line 83)
usage: [ frames, indexes ] = vec2frames( vector, frame_length, frame_shift, direction, window, padding );

Error in mfcc (line 151)
frames = vec2frames( speech, Nw, Ns, 'cols', window, false );

Do you have any idea what the problem might be? Thank you

26 Jul 2015 Yibo Yang

Quick question, Kamil: how can I tweak the trifbank code so that I can generate triangular filters with, say, 50% overlap on the mel scale?
Thanks for your work!

18 Jun 2015 Kamil Wojcicki

Brittany, are you using the provided example.m with sp10.wav, or your own audio files? If the audio file you are using happens to have long sections of zero-only samples, that could explain the NaN MFCC values. If that is the case, you could add some very low-level noise to your audio samples, e.g.,

speech = speech + randn(size(speech))*1E-10;

Hope this helps.
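
One way such NaN values can arise is shown in the small sketch below, using made-up numbers. An all-zero frame gives zero filterbank energies, their logarithm is -Inf, and a transform row with both positive and negative weights (as in a DCT basis row) then adds -Inf and +Inf. The weights and the 1e-10 offset are hypothetical and only stand in for the effect of adding low-level noise.

```matlab
FBE_zero = zeros(3, 1);            % energies from an all-zero frame
logFBE   = log(FBE_zero);          % -Inf in every entry
w        = [1 -1 0.5];             % hypothetical mixed-sign weights
cc_bad   = w * logFBE;             % (-Inf) + (+Inf) + (-Inf) gives NaN

FBE_small = FBE_zero + 1e-10;      % tiny positive energies instead of exact zeros
cc_ok     = w * log(FBE_small);    % finite result
```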

Comment only
18 Jun 2015 Brittany Davis

I get some NaN values in the MFCC variable. Why is that so?

Comment only
24 Jul 2014 clarissa yong

Does anyone know which file I should run to get the final outcome? Please help, thanks!

13 Jul 2014 Adnan Farooq

In the case of a sequence of images, how can we use MFCC? We convert each frame from 2D/3D to a 1D vector, but I am confused about how to use these parameters:
"fs, Tw, Ts, alpha, window, R, M, N, L"

Comment only
16 Mar 2014 Agus Reza

Telkom University Indonesia - was here :D

06 Jan 2014 wuhan institute of technology  
31 May 2013 Christophe  
16 May 2013 yingxue wang

NOT BAD

04 Feb 2013 Saurabh Verma  
21 Dec 2012 Lehigh

very good!

05 Sep 2012 FJK

Updates
19 Sep 2011 1.2

Title change
