Useful Matlab Functions for Speaker Recognition Using Adapted Gaussian Mixture Model

version (3.91 KB) by Md Sahidullah
This submission includes useful MATLAB functions for speaker recognition using adapted GMM.


Updated 06 Jun 2011

Implementation details of (i)-(iii) can be found in [1].

The fourth function (gmm2sv.m) concatenates the means (i.e. centres) of a GMM. The concatenated mean vector of an adapted GMM is known as a GMM supervector (GSV), and it is used in GMM-SVM based speaker recognition systems. Details of GMM-SVM based speaker recognition can be found in [2].
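
The idea of a supervector can be illustrated with a minimal sketch. It assumes a Netlab-style structure whose `centres` field holds one mean per row; the toy values and the stacking order are assumptions for illustration and may differ from what gmm2sv.m actually does.

```matlab
% Hypothetical stand-in for an adapted GMM: 3 components, 2-dimensional features.
mix.ncentres = 3;
mix.nin      = 2;
mix.centres  = [1 2; 3 4; 5 6];   % one row per component mean (Netlab layout)

% Stack the component means into one column: [mu_1; mu_2; ...; mu_M].
% centres is ncentres-by-nin, so transpose before flattening column-wise.
sv = reshape(mix.centres', mix.ncentres * mix.nin, 1);

disp(sv');   % supervector shown as a row for readability
```

The resulting vector has `ncentres * nin` elements, so a 64-component model over 12-dimensional features would give a 768-dimensional supervector.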

This code requires the Netlab toolbox.

[1] D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Processing, vol. 10, pp. 19-41, 2000.
[2] W. M. Campbell, D. E. Sturim, and D. A. Reynolds, "Support vector machines using GMM supervectors for speaker verification," IEEE Signal Processing Letters, vol. 13, no. 5, pp. 308-311, May 2006.

Cite As

Md Sahidullah (2020). Useful Matlab Functions for Speaker Recognition Using Adapted Gaussian Mixture Model, MATLAB Central File Exchange.

Comments and Ratings (8)

@sara chellali: This script supports covariance type 'diag' only. "covars 12x12x64" indicates that you have used a full-covariance GMM.

type 'gmm'
covar 'diag'
nin 12
ncentres 64
priors 1x64
covars 12x12x64
centres 64x12
nwts 1600

When I run this code, I get these errors:

Matrix dimensions must agree.

Error in gmmactiv (line 46)
a(:, j) = exp(-0.5*sum((diffs.*diffs)./(ones(ndata, 1) * mix.covars(j,:)), 2)) ./ (normal*s(j));

Error in gmmpost (line 25)
a = gmmactiv(mix, x);

Error in gmmmap (line 28)
[post, act] = gmmpost(mix, data); %Computes class posterior probabilities
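
The mismatch behind this error can be checked before calling the adaptation code. The sketch below is a hedged illustration, not part of the submission: it builds a small hypothetical full-covariance array (sizes scaled down from the 12x12x64 case), tests whether it has the ncentres-by-nin layout that the failing line in gmmactiv indexes with `mix.covars(j,:)`, and extracts the per-dimension variances as one possible workaround.

```matlab
% Hypothetical model mimicking the reported case, scaled down:
% 3 components, 2-dimensional features, full covariances (nin x nin x ncentres).
nin = 2;
ncentres = 3;
fullCovars = zeros(nin, nin, ncentres);
for j = 1:ncentres
    fullCovars(:, :, j) = [j 0.1; 0.1 2*j];
end

% A 'diag' model stores covars as an ncentres-by-nin matrix of variances.
isDiagLayout = isequal(size(fullCovars), [ncentres, nin]);

% Keep only the variances of each component. This discards the
% cross-covariances, so it only approximates the full-covariance model.
diagCovars = zeros(ncentres, nin);
for j = 1:ncentres
    diagCovars(j, :) = diag(fullCovars(:, :, j))';
end
```

Training the model as diagonal from the start, for example with Netlab's `gmm(nin, ncentres, 'diag')`, is probably the cleaner fix, since the variances then fit the data directly instead of being cut out of a full-covariance fit.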

@Sandeep: The choice of relevance factor is ad hoc. It is usually chosen between 8 and 14 for NIST SREs, where the speech utterances are long (typically more than 60 seconds). However, for text-dependent speaker recognition corpora with short segments, a lower relevance factor (between 1 and 4) is used.
I suggest setting this value based on performance on the development set.
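
To show where the relevance factor enters, here is a minimal, self-contained sketch of mean-only MAP adaptation in the style of [1]. It does not call gmmmap or Netlab; the toy model, the data, and the value `r = 10` are assumptions for illustration, and whether the submission also adapts weights or variances is not stated here.

```matlab
% Toy UBM (hypothetical values): 2 diagonal-covariance components, 1-D features.
priors  = [0.5 0.5];          % 1 x M mixture weights
centres = [0; 4];             % M x D means
covars  = [1; 1];             % M x D variances
x       = [0.5; 1.0; 3.5];    % N x D adaptation data
r       = 10;                 % relevance factor (assumed value)

[N, D] = size(x);
M = numel(priors);

% Component likelihoods p(x_t | i) for diagonal Gaussians.
act = zeros(N, M);
for i = 1:M
    d = bsxfun(@minus, x, centres(i, :));
    expo = -0.5 * sum(bsxfun(@rdivide, d .^ 2, covars(i, :)), 2);
    act(:, i) = exp(expo) / sqrt((2 * pi) ^ D * prod(covars(i, :)));
end

% Posteriors Pr(i | x_t): weight by the priors, then normalise each row.
joint = bsxfun(@times, act, priors);
post  = bsxfun(@rdivide, joint, sum(joint, 2));

% Sufficient statistics: soft counts n_i and posterior-weighted means E_i(x).
n  = sum(post, 1)';                               % M x 1
Ex = bsxfun(@rdivide, post' * x, max(n, eps));    % M x D

% Data-dependent adaptation coefficient and adapted means.
alpha      = n ./ (n + r);                        % M x 1
newCentres = bsxfun(@times, alpha, Ex) + bsxfun(@times, 1 - alpha, centres);
```

Because `alpha = n ./ (n + r)`, a larger `r` keeps the adapted means closer to the UBM unless a component has seen a lot of data. That is consistent with using smaller values for short utterances, where the soft counts `n` are small.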

What is the value of the relevance factor for MAP adaptation?


I think you can. But I would try some implementation of PLDA or deep learning.


This one is intended to be used on speaker ID, but you could adapt it easily, since the important functions take features as arguments:


Can I use this for image recognition?


Description is updated.

MATLAB Release Compatibility
Created with R2009a
Compatible with any release
Platform Compatibility
Windows macOS Linux

Inspired by: Netlab