4.43333

4.4 | 31 ratings Rate this file 334 downloads (last 30 days) File Size: 9.77 KB File ID: #8636

EM_GM

by Patrick Tsui

 

04 Oct 2005 (Updated 06 Mar 2006)

No BSD License  

An expectation maximization algorithm for learning a multi-dimensional Gaussian mixture.

Download Now | Watch this File

File Information
Description

Although EM algorithm for Gaussian mixture (EM_GM) learning is well known, 3 major MATLAB EM_GM codes are found on the web. However, they either have errors or not easy to incorporate into other MATLAB codes. Therefore, I decide to write my own EM_GM and share it. My EM_GM is designed as a single file function (i.e. all sub functions are included in the same file) for convenience and portability.

Detail descriptions of all inputs and outputs are included in the file. EM_GM can be controlled to plot 1D or 2D problems and display CPU time used as well as number of iterations.

Example:
X = zeros(600,2);
X(1:200,:) = normrnd(0,1,200,2);
X(201:400,:) = normrnd(0,2,200,2);
X(401:600,:) = normrnd(0,3,200,2);
[W,M,V,L] = EM_GM(X,3,[],[],1,[])

Acknowledgements
This submission has inspired the following:
Plot_GM, Fast EM_GM
Required Products Statistics Toolbox
MATLAB release MATLAB 6.5 (R13)
Tags for This File  
Everyone's Tags
Tags I've Applied
Add New Tags Please login to tag files.
Comments and Ratings (34)
12 Oct 2005 Mba Emmanuel Ikechukwu

It sounds very interesting and vry resourseful

16 Oct 2005 Michael Boedigheimer

As the author suggests, this routine estimates the parameters for a mixture of multivariate normal distributions (ala Aaron D'Souza's article from USC, "Using EM To Estimate A Probablity[sic] Density With A Mixture of Gaussians"). Runs quickly and produces reasonable answers.
Wish that the functions were factored into separate files so I could use the nice ellipse plotting and other functions as stand-alones. But that's nitpicking.

15 Nov 2005 Yan Yang

Thanks for sharing!! I am really glad to have found this EM matlab code. I've tested it over randomly generated data in Matlab and this function works well. It runs a little bit slow when the data size gets larger and when there is no initialization for the parameters, but I think that is because of the iterative nature of the algorithm, not the implementation. Thanks again.

21 Nov 2005 Jean-Christophe Terrillon

This is the first time that I could find a stand alone function for Gaussian mixture estimation.Compact and professionally written, extremely useful. It saved my day at a critical time in my research. Thank you Patrick!

21 Dec 2005 glinda ma

It's very useful for me. Thanks a lot.

03 Apr 2006 wajih ouertani

Really Compact and professionally written, extremely useful . Thanks for sharing !

16 Apr 2006 Zhenhai Wang

Terrific job, Patrick. Thanks a lot for sharing

04 May 2006 I and I, Juh

I thougth I'd had to writte something like this my-self. I would then have spend a lot of time, and it would never have been as perfect.

Thank you so much !!!

08 May 2006 Ed Lawson

Outstanding. Works just as expected.

31 Aug 2006 Mikel Rodriguez

Mr. Tsui's implementation does indeed resolve several errors found in several implementations that I have reviewed.

08 Dec 2006 Gang Liang

First, thanks for sharing the code with the community.

The Likelihood function implemented is problematic. The short cut to compute the likelihood function of a Gaussian mixture model is not correct even though it might work well in many cases...

15 Feb 2007 Pete Santago

I tried the one from the classification toolbox mentioned in the Duda, Hart, and Stork text. It was impossible to figure out.

This one worked right off. Thanks.

06 Mar 2007 Joan kees

really great

15 Mar 2007 sujeesh kumar  
19 Mar 2007 Martin Hemberg

Excellent script, well documented and easy to use.

05 Apr 2007 Nick van Rodijnen

The algorithm converts very well to a good solution and is reasonable stable if applied multiple times to the same data. It is even faster than the greedy EM algorithm. My compliments

15 May 2007 P. Baruah

Thanks to author of this code! it was very helpful in understanding the EM process.

17 May 2007 Dima Damen

Thank you... well documented, easy to use and flexible interface

05 Jun 2007 P-Marc Jodoin

Super great! can't be better!

07 Jun 2007 Phong Vo

I found some error prones : the code can't deal when covariance matrix becomes singular.

10 Oct 2007 Nash Borges

Noticed something that may be incorrect. When using multiple mixtures, you cannot compute the log likelihood of all of the data under each mixture and then sum them. You need to do the sum before going into the log domain.

10 Oct 2007 Nash Borges

here's an approximation:
http://nlp.cs.byu.edu/mediawiki/index.php/Log_Domain_Computations

30 Oct 2007 Javier Vicente  
06 Nov 2007 prabu chandran

Nice code. Its very easy to use - no variations required and the usuage of the function given makes the whole process of integration of the function simpler. Thank you.

23 Jan 2008 Vaibhav Bedia

is there any way i could find where the gaussians fit in the curve and not have the mean on the x-axis

29 Jan 2008 E Fontaine

Thanks for this well written piece of code

20 Apr 2008 David Liu

Seems like the log likelihood sometime decreases, which should not happen. Perhaps the formula for computing the log likelihood is incorrect?!

29 May 2008 Cihat Eldeniz

This code computes the E-step and the M-step correctly. It is also readable and neat for the user. Nevertheless, there is a problem and an improvement to make I would like to report. The problem is that, the likelihood calculation is incorrect although the outputs of both E and M steps are correct. I have checked the reference to which the code is attributed. In fact, in that reference, the probability density function (pdf) is a single gaussian rather than a mixture. This is why there is a confusion leading to miscalculations. The actual likelihood function should be computed as [in LaTeX format]:

L = \sum^{n}_{i=1} \log(\sum^{k}_{j=1}W(j)*\mathcal{N}(X(:,i)^{T},M(:,j),V(:,:,j)))

The improvement is that there are some loops which can be replaced by matrix multiplications or some other matrix-related operations. Furthermore, the likelihood computation can easily be achieved inside the E-step.

In conclusion, the following code is the one modified with respect to the aforementioned points:

function [W,M,V,L,E] = EM_GM(X,k,ltol,maxiter,pflag,Init)
% Inputs:
% X(n,d) - input data, n=number of observations, d=dimension of variable
% k - maximum number of Gaussian components allowed
% ltol - percentage of the log likelihood difference between 2 iterations ([] for none)
% maxiter - maximum number of iteration allowed ([] for none)
% pflag - 1 for plotting GM for 1D or 2D cases only, 0 otherwise ([] for none)
% Init - structure of initial W, M, V: Init.W, Init.M, Init.V ([] for none)
%
% Ouputs:
% W(1,k) - estimated weights of GM
% M(d,k) - estimated mean vectors of GM
% V(d,d,k) - estimated covariance matrices of GM
% L - log likelihood of estimates
%

%%%% Validate inputs %%%%
if nargin <= 1,
    disp('EM_GM must have at least 2 inputs: X,k!/n')
    return
elseif nargin == 2,
    ltol = 0.1; maxiter = 1000; pflag = 0; Init = [];
    err_X = Verify_X(X);
    err_k = Verify_k(k);
    if err_X | err_k, return; end
elseif nargin == 3,
    maxiter = 1000; pflag = 0; Init = [];
    err_X = Verify_X(X);
    err_k = Verify_k(k);
    [ltol,err_ltol] = Verify_ltol(ltol);
    if err_X | err_k | err_ltol, return; end
elseif nargin == 4,
    pflag = 0; Init = [];
    err_X = Verify_X(X);
    err_k = Verify_k(k);
    [ltol,err_ltol] = Verify_ltol(ltol);
    [maxiter,err_maxiter] = Verify_maxiter(maxiter);
    if err_X | err_k | err_ltol | err_maxiter, return; end
elseif nargin == 5,
     Init = [];
    err_X = Verify_X(X);
    err_k = Verify_k(k);
    [ltol,err_ltol] = Verify_ltol(ltol);
    [maxiter,err_maxiter] = Verify_maxiter(maxiter);
    [pflag,err_pflag] = Verify_pflag(pflag);
    if err_X | err_k | err_ltol | err_maxiter | err_pflag, return; end
elseif nargin == 6,
    err_X = Verify_X(X);
    err_k = Verify_k(k);
    [ltol,err_ltol] = Verify_ltol(ltol);
    [maxiter,err_maxiter] = Verify_maxiter(maxiter);
    [pflag,err_pflag] = Verify_pflag(pflag);
    [Init,err_Init]=Verify_Init(Init);
    if err_X | err_k | err_ltol | err_maxiter | err_pflag | err_Init, return; end
else
    disp('EM_GM must have 2 to 6 inputs!');
    return
end

%%%% Initialize W, M, V,L %%%%
t = cputime;
if isempty(Init),
    [W,M,V] = Init_EM(X,k); L = 0;
else
    W = Init.W;
    M = Init.M;
    V = Init.V;
end
[E,Ln] = Expectation(X,k,W,M,V); % Initialize log likelihood and compute expectation
Lo = 2*Ln;

%%%% EM algorithm %%%%
niter = 0;
while (abs(100*(Ln-Lo)/Lo)>ltol) & (niter<=maxiter),
%**************************************************************************
% For the first loop, expectation is computed above (cf. Line 69). For the
% next loops, it is computed in their preceding loops. Therefore, the
% following line [Line 80] is omitted.
%**************************************************************************
% E = Expectation(X,k,W,M,V); % E-step

    [W,M,V] = Maximization(X,k,E); % M-step
    Lo = Ln;
    [E,Ln] = Expectation(X,k,W,M,V); % Log-likelihood and expectation computation
    niter = niter + 1;
    LnValues(niter) = Ln;
    disp(['Iteration # = ' num2str(niter) ', Ln = ' num2str(Ln) ', Stopping Criterion = ' num2str(abs(100*(Ln-Lo)/Lo)) ', Tolerance = ' num2str(ltol)])
end
L = Ln;
theTime = num2str(fix(clock));
theTime(theTime==' ') = '_';
save(['EM_CostFunction_' theTime '_MoreLoopsRemoved.mat'],'LnValues')

%%%% Plot 1D or 2D %%%%
if pflag==1,
    [n,d] = size(X);
    if d>2,
        disp('Can only plot 1 or 2 dimensional applications!/n');
    else
        Plot_GM(X,k,W,M,V);
    end
    elapsed_time = sprintf('CPU time used for EM_GM: %5.2fs',cputime-t);
    disp(elapsed_time);
    disp(sprintf('Number of iterations: %d',niter-1));
end
%%%%%%%%%%%%%%%%%%%%%%
%%%% End of EM_GM %%%%
%%%%%%%%%%%%%%%%%%%%%%

function [E,L] = Expectation(X,k,W,M,V)
[n,d] = size(X);
a = (2*pi)^(0.5*d);
S = zeros(1,k);
iV = zeros(d,d,k);
for j=1:k,
    if V(:,:,j)==zeros(d,d), V(:,:,j)=ones(d,d)*eps; end
    S(j) = sqrt(det(V(:,:,j)));
    iV(:,:,j) = inv(V(:,:,j));
end
%*************************************************************************
%*************************** MODIFICATION: LOOP REMOVAL *****************
%*************************************************************************
E = zeros(n,k);
% for i=1:n,
% for j=1:k,
% dXM = X(i,:)'-M(:,j);
% pl = exp(-0.5*dXM'*iV(:,:,j)*dXM)/(a*S(j));
% E(i,j) = W(j)*pl;
% end
% E(i,:) = E(i,:)/sum(E(i,:));
% end
chunkSize = 1000;
howManyFullChunks = fix(n/chunkSize);
numberOfRemainingVectors = n - howManyFullChunks*chunkSize;
pdfValueVector = zeros(n,1);
for j = 1:k
    for chunkCounter = 1:howManyFullChunks
        modificationRange = (chunkCounter-1)*chunkSize+1:chunkCounter*chunkSize;
        meanSubtractedChunk = X(modificationRange,:)'-repmat(M(:,j),1,chunkSize);
        pdfValueVectorToExp = diag(-0.5*meanSubtractedChunk'*iV(:,:,j)*meanSubtractedChunk);
        pdfValueVector(modificationRange) = exp(pdfValueVectorToExp)/(a*S(j));
    end
    modificationRange = howManyFullChunks*chunkSize+1:n;
    meanSubtractedChunk = X(modificationRange,:)'-repmat(M(:,j),1,length(modificationRange));
    pdfValueVectorToExp = diag(-0.5*meanSubtractedChunk'*iV(:,:,j)*meanSubtractedChunk);
    pdfValueVector(modificationRange) = exp(pdfValueVectorToExp)/(a*S(j));
    E(:,j) = W(j)*pdfValueVector';
end
sumInColumnDirection = sum(E,2);
divisor = repmat(sumInColumnDirection,1,k);
L = sum(log(sum(E,2)));
E = E./divisor;
%**************************************************************************
%*************************** MODIFICATION ENDS HERE ***********************
%**************************************************************************

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Expectation %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [W,M,V] = Maximization(X,k,E)
[n,d] = size(X);
V = zeros(d,d,k);
%*************************************************************************
%*************************** MODIFICATION: LOOP REMOVAL *****************
%*************************************************************************
% W = zeros(1,k); M = zeros(d,k);
% for i=1:k, % Compute weights
% for j=1:n,
% W(i) = W(i) + E(j,i);
% M(:,i) = M(:,i) + E(j,i)*X(j,:)';
% end
% M(:,i) = M(:,i)/W(i);
% end
W = sum(E,1);
M = X'*E;
M = M./repmat(W,d,1);
%**************************************************************************
%*************************** MODIFICATION ENDS HERE ***********************
%**************************************************************************
for i=1:k,
    for j=1:n,
        dXM = X(j,:)'-M(:,i);
        V(:,:,i) = V(:,:,i) + E(j,i)*dXM*dXM';
    end
    V(:,:,i) = V(:,:,i)/W(i);
end
W = W/n;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Maximization %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%**************************************************************************
%************************ LIKELIHOOD REMOVED COMPLETELY *******************
%**************************************************************************
% function L = Likelihood(X,k,W,M,V)
% Compute L based on K. V. Mardia, "Multivariate Analysis", Academic Press, 1979, PP. 96-97
% % to enchance computational speed
% [n,d] = size(X);
% U = mean(X)';
% S = cov(X);
% L = 0;
% for i=1:k,
% iV = inv(V(:,:,i));
% L = L + W(i)*(-0.5*n*log(det(2*pi*V(:,:,i))) ...
% -0.5*(n-1)*(trace(iV*S)+(U-M(:,i))'*iV*(U-M(:,i))));
% end
%**************************************************************************
%*************************** MODIFICATION ENDS HERE ***********************
%**************************************************************************
%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Likelihood %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%

function err_X = Verify_X(X)
err_X = 1;
[n,d] = size(X);
if n<d,
    disp('Input data must be n x d!/n');
    return
end
err_X = 0;
%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Verify_X %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%

function err_k = Verify_k(k)
err_k = 1;
if ~isnumeric(k) | ~isreal(k) | k<1,
    disp('k must be a real integer >= 1!/n');
    return
end
err_k = 0;
%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Verify_k %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%

function [ltol,err_ltol] = Verify_ltol(ltol)
err_ltol = 1;
if isempty(ltol),
    ltol = 1e-2;
elseif ~isreal(ltol) | ltol<=0,
    disp('ltol must be a positive real number!');
    return
end
err_ltol = 0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Verify_ltol %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [maxiter,err_maxiter] = Verify_maxiter(maxiter)
err_maxiter = 1;
if isempty(maxiter),
    maxiter = 1000;
elseif ~isreal(maxiter) | maxiter<=0,
    disp('ltol must be a positive real number!');
    return
end
err_maxiter = 0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Verify_maxiter %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [pflag,err_pflag] = Verify_pflag(pflag)
err_pflag = 1;
if isempty(pflag),
    pflag = 0;
elseif pflag~=0 & pflag~=1,
    disp('Plot flag must be either 0 or 1!/n');
    return
end
err_pflag = 0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Verify_pflag %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [Init,err_Init] = Verify_Init(Init)
err_Init = 1;
if isempty(Init),
    % Do nothing;
elseif isstruct(Init),
    [Wd,Wk] = size(Init.W);
    [Md,Mk] = size(Init.M);
    [Vd1,Vd2,Vk] = size(Init.V);
    if Wk~=Mk | Wk~=Vk | Mk~=Vk,
        disp('k in Init.W(1,k), Init.M(d,k) and Init.V(d,d,k) must equal!/n')
        return
    end
    if Md~=Vd1 | Md~=Vd2 | Vd1~=Vd2,
        disp('d in Init.W(1,k), Init.M(d,k) and Init.V(d,d,k) must equal!/n')
        return
    end
else
    disp('Init must be a structure: W(1,k), M(d,k), V(d,d,k) or []!');
    return
end
err_Init = 0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Verify_Init %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [W,M,V] = Init_EM(X,k)
[n,d] = size(X);
[Ci,C] = kmeans(X,k,'Start','cluster', ...
    'Maxiter',100, ...
    'EmptyAction','drop', ...
    'Display','off'); % Ci(nx1) - cluster indeices; C(k,d) - cluster centroid (i.e. mean)
while sum(isnan(C))>0,
    [Ci,C] = kmeans(X,k,'Start','cluster', ...
        'Maxiter',100, ...
        'EmptyAction','drop', ...
        'Display','off');
end
M = C';
Vp = repmat(struct('count',0,'X',zeros(n,d)),1,k);
for i=1:n, % Separate cluster points
    Vp(Ci(i)).count = Vp(Ci(i)).count + 1;
    Vp(Ci(i)).X(Vp(Ci(i)).count,:) = X(i,:);
end
V = zeros(d,d,k);
for i=1:k,
    W(i) = Vp(i).count/n;
    V(:,:,i) = cov(Vp(i).X(1:Vp(i).count,:));
end
%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Init_EM %%%%
%%%%%%%%%%%%%%%%%%%%%%%%

function Plot_GM(X,k,W,M,V)
[n,d] = size(X);
if d>2,
    disp('Can only plot 1 or 2 dimensional applications!/n');
    return
end
S = zeros(d,k);
R1 = zeros(d,k);
R2 = zeros(d,k);
for i=1:k, % Determine plot range as 4 x standard deviations
    S(:,i) = sqrt(diag(V(:,:,i)));
    R1(:,i) = M(:,i)-4*S(:,i);
    R2(:,i) = M(:,i)+4*S(:,i);
end
Rmin = min(min(R1));
Rmax = max(max(R2));
R = [Rmin:0.001*(Rmax-Rmin):Rmax];
% clf, hold on
figure; hold on
if d==1,
    Q = zeros(size(R));
    for i=1:k,
        P = W(i)*normpdf(R,M(:,i),sqrt(V(:,:,i)));
        Q = Q + P;
        plot(R,P,'r-'); grid on,
    end
    plot(R,Q,'k-');
    xlabel('X');
    ylabel('Probability density');
else % d==2
% plot(X(:,1),X(:,2),'r.');
    plot(X(1:200,1),X(1:200,2),'r.');
    plot(X(201:400,1),X(201:400,2),'g.');
    plot(X(401:600,1),X(401:600,2),'b.');
    for i=1:k,
        Plot_Std_Ellipse(M(:,i),V(:,:,i));
    end
    xlabel('1^{st} dimension');
    ylabel('2^{nd} dimension');
    axis([Rmin Rmax Rmin Rmax])
end
title('Gaussian Mixture estimated by EM');
%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Plot_GM %%%%
%%%%%%%%%%%%%%%%%%%%%%%%

function Plot_Std_Ellipse(M,V)
[Ev,D] = eig(V);
d = length(M);
if V(:,:)==zeros(d,d),
    V(:,:) = ones(d,d)*eps;
end
iV = inv(V);
% Find the larger projection
P = [1,0;0,0]; % X-axis projection operator
P1 = P * 2*sqrt(D(1,1)) * Ev(:,1);
P2 = P * 2*sqrt(D(2,2)) * Ev(:,2);
if abs(P1(1)) >= abs(P2(1)),
    Plen = P1(1);
else
    Plen = P2(1);
end
count = 1;
step = 0.001*Plen;
Contour1 = zeros(2001,2);
Contour2 = zeros(2001,2);
for x = -Plen:step:Plen,
    a = iV(2,2);
    b = x * (iV(1,2)+iV(2,1));
    c = (x^2) * iV(1,1) - 1;
    Root1 = (-b + sqrt(b^2 - 4*a*c))/(2*a);
    Root2 = (-b - sqrt(b^2 - 4*a*c))/(2*a);
    if isreal(Root1),
        Contour1(count,:) = [x,Root1] + M';
        Contour2(count,:) = [x,Root2] + M';
        count = count + 1;
    end
end
Contour1 = Contour1(1:count-1,:);
Contour2 = [Contour1(1,:);Contour2(1:count-1,:);Contour1(count-1,:)];
plot(M(1),M(2),'k+');
plot(Contour1(:,1),Contour1(:,2),'k-');
plot(Contour2(:,1),Contour2(:,2),'k-');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Plot_Std_Ellipse %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

31 Aug 2008 Jason Lee

we can speed up this piece of code by preallocating the memory of W

in init_EM

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Verify_Init %%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function [W,M,V] = Init_EM(X,k)
[n,d] = size(X);
[Ci,C] = kmeans(X,k,'Start','cluster', ...
'Maxiter',100, ...
'EmptyAction','drop', ...
'Display','off'); % Ci(nx1) - cluster indeices; C(k,d) - cluster centroid (i.e. mean)
while sum(isnan(C))>0,
[Ci,C] = kmeans(X,k,'Start','cluster', ...
'Maxiter',100, ...
'EmptyAction','drop', ...
'Display','off');
end
M = C';
Vp = repmat(struct('count',0,'X',zeros(n,d)),1,k);
for i=1:n, % Separate cluster points
Vp(Ci(i)).count = Vp(Ci(i)).count + 1;
Vp(Ci(i)).X(Vp(Ci(i)).count,:) = X(i,:);
end
V = zeros(d,d,k);

% Preallocating W
W = zeros(1,k);

for i=1:k,
W(i) = Vp(i).count/n;
V(:,:,i) = cov(Vp(i).X(1:Vp(i).count,:));
end
%%%%%%%%%%%%%%%%%%%%%%%%
%%%% End of Init_EM %%%%
%%%%%%%%%%%%%%%%%%%%%%%%

16 Oct 2008 chang YL

this is good.

29 Dec 2008 Shawn_lj Qingsheng

When I run [W,M,V,L] = EM_GM(X,3,[],[],1,[]); on matlab command as the example, the error happened, as below:

??? Error using ==> kmeans.kmeans
Too many input arguments.

Error in ==> EM_GM>Init_EM at 305
[Ci,C] = kmeans(X,k,'Start','cluster', ...

Error in ==> EM_GM at 63
    [W,M,V] = Init_EM(X,k); L = 0;

Please help,thanks

05 Jan 2009 Adel Fakih

it might be also any other implementation of netlab. type edit kmeans to find the culprit one.

06 Mar 2009 gd j,

it's very helpful thanks

07 Jul 2009 Yashodhan Athavale

I have a multidimensional file in Microsoft Excel. I imported the sheet into X, and executed [W,M,V,L] = EM_GM(X,3,[],[],1,[]);

I am getting the following errors,

[W,M,V,L] = EM_GM(X,1,[],[],1,[])
Warning: Ignoring rows of X with missing data.
> In kmeans at 127
  In EM_GM>Init_EM at 261
  In EM_GM at 74
??? Attempted to access C(1,2); index out of bounds because numel(C)=1.

Error in ==> kmeans>distfun at 722
            D(:,i) = D(:,i) + (X(:,j) - C(i,j)).^2;

Error in ==> kmeans at 329
    D = distfun(X, C, distance, 0);

Error in ==> EM_GM>Init_EM at 261
[Ci,C] = kmeans(X,k,'Start','cluster', ...

Error in ==> EM_GM at 74
    [W,M,V] = Init_EM(X,k); L = 0;

Please Help....!!! thanks in advance..... :)

Please login to add a comment or rating.
Updates
24 Oct 2005

The 'Init_EM' sub-function uses the Matlab function 'kmeans', which may output one or more cluster centroid as 'NaN'. A change is made in 'Init_EM' to solve this problem.

16 Jan 2006

A more computational effective method is implemented to compute the log likelihood of the resulting GM given the input data

06 Mar 2006

Enchance likelihood computation

Tag Activity for this File
Tag Applied By Date/Time
statistics Patrick Tsui 22 Oct 2008 08:01:44
probability Patrick Tsui 22 Oct 2008 08:01:44
expectation maximization Patrick Tsui 22 Oct 2008 08:01:44
em Patrick Tsui 22 Oct 2008 08:01:44
gaussian mixture Patrick Tsui 22 Oct 2008 08:01:44
 

MATLAB Central Terms of Use

NOTICE: Any content you submit to MATLAB Central, including personal information, is not subject to the protections which may be afforded information collected under other sections of The MathWorks, Inc. Web site. You are entirely responsible for all content that you upload, post, e-mail, transmit or otherwise make available via MATLAB Central. The MathWorks does not control the content posted by visitors to MATLAB Central and, does not guarantee the accuracy, integrity, or quality of such content. Under no circumstances will The MathWorks be liable in any way for any content not authored by The MathWorks, or any loss or damage of any kind incurred as a result of the use of any content posted, e-mailed, transmitted or otherwise made available via MATLAB Central. Read the complete Terms prior to use.

Contact us at files@mathworks.com