Code covered by the BSD License  

Download apps, toolboxes, and other File Exchange content using Add-On Explorer in MATLAB.

» Watch video

Highlights from
PCA and ICA Package

4.8 | 35 ratings Rate this file 337 Downloads (last 30 days) File Size: 9.91 KB File ID: #38300 Version: 1.4
image thumbnail

PCA and ICA Package


Brian Moore (view profile)


24 Sep 2012 (Updated )

Implements Principal Component Analysis (PCA) and Independent Component Analysis (ICA)

| Watch this File

File Information

This package contains functions that implement Principal Component Analysis (PCA) and its lesser known cousin, Independent Component Analysis (ICA).
PCA and ICA are implemented as functions in this package, and multiple examples are included to demonstrate their use.
In PCA, multi-dimensional data is projected onto the singular vectors corresponding to a few of its largest singular values. Such an operation effectively decomposes the input single into orthogonal components in the directions of largest variance in the data. As a result, PCA is often used in dimensionality reduction applications, where performing PCA yields a low-dimensional representation of data that can be reversed to closely reconstruct the original data.

In ICA, multi-dimensional data is decomposed into components that are maximally independent in the negentropy sense. ICA differs from PCA in that the low-dimensional signals do not necessarily correspond to the directions of maximum variance; rather, the ICA components have maximal statistical independence. In practice, ICA can often uncover disjoint underlying trends in multi-dimensional data.

Required Products MATLAB
MATLAB release MATLAB 7.13 (R2011b)
MATLAB Search Path
Tags for This File   Please login to tag files.
Please login to add a comment or rating.
Comments and Ratings (82)
18 May 2016 Brian Moore

Brian Moore (view profile)

@Vivin: it's up to you to figure out how to import your data into the correct format.

All you have to do is load the data into an m x n matrix containing n (synchronized) samples from each of your m channels and then pass that matrix into myICA.

Comment only
18 May 2016 Vivin Varghese ARIKKUDIL MATHEW

Dear Brian,

Could you kindly answer this question if possible. Suppose i have the values of the signal in a text file. Let's say it is like an array of integers of a large length. Is it possible that i can apply your algorithm towards this signal.

Also if i have 2 channels of different length of wave files. would i be able to retrieve a signal from the output?

I have attached a copy of my code to retrieve the original signal from a text file and its corresponding histogram. and i am able to retrieve a signal from it.

clear all;

x=fopen('C:\Users\pc\Documents\Limoges\My Biblio\CoolTerm Capture 2016-05-18 14-52-33.asc','r');

15 May 2016 Alger

Alger (view profile)

thank you for share

11 May 2016 Vivin Varghese ARIKKUDIL MATHEW

Dear Brian,

Thank you very much for you quick response. YEs i would hope that ECG would be independent from ECG and EMG. But the problem lies is that the are all sometimes mixed together. My main objective is to remove the noise produced during the movement and this noise is highly unpredictable and is caused due to the wires. So if i were to take two a sources from 2 different pairs wires. i would get 3 signals(one from ECG and 2 independant noises induced from each pair of wiring). The goal would be to somehow get the data set to matlab and process it live. But i am unable to find any resources.

10 May 2016 Brian Moore

Brian Moore (view profile)

@Vivin: You can certainly run ICA on your multiple channel ECG data and see what the output looks like. I have no background in medical applications, but if the ECG signal is truly independent of the other signals (EEG, EMG) that are presumably mixed into each channel, then you might see a group separation.

Type "help myICA" to see the syntax of my ICA implementation.

For removing 50Hz noise, it's probably best to use a more classical filtering method, right? Something like this:

If you find that ICA works well on a test dataset, you might want to Google "online independent component analysis" to learn about how people modify ICA to handle streaming data.

Note: I have no experience with online ICA, and, in particular, my PCA and ICA Package doesn't support it.

Comment only
10 May 2016 Vivin Varghese ARIKKUDIL MATHEW

Dear Brian,

I am trying to implement your code on to separate ECG signals from its corresponding noises ( 50HZ line noise, EEG signals, EMG signals, and movement of the body ). Is it possible to use your algorithm to make this separation to filter out the ECG signal alone?

I have another question. Is it possible to apply your code for real time processing. for example, i am collecting ecg data from a patient using a measuring device with some multiple channels, would it be able to separate the sources and give me the individual outputs?

10 May 2016 Vivin Varghese ARIKKUDIL MATHEW

Dear Brian,

I have another question. Is it possible to apply your code for real time processing. for example, i am collecting ecg data from a patient using a measuring device with some multiple channels, would it be able to separate the sources and give me the individual outputs?

10 May 2016 Dominik Jungtäubl

Okay, thank you very much for the immediate response. I think I can continue working with your information ;)

Comment only
09 May 2016 Brian Moore

Brian Moore (view profile)

@Dominik: The pre-whitening just multiplies your data by a particular matrix so that the sample covariance of the "whitened" data is the identity matrix. I can't say much without seeing your data, but it must be the case that the sample covariance of your data must be rank-deficient, i.e., has one or more very small eigenvalues.

Make sure you pass in the data as an m x n matrix where m = #signals and n = # samples.

Comment only
09 May 2016 Dominik Jungtäubl

Hi Brian,

I'm about to seperate EMG recordings into its original signals. But the independent components do have a much greater amplitude (e+005) than the mixed signals. I guess this is because of the preprocessing, i.e. the whitening. Can you tell me why the whitening changes the magnitude to such an extent?

Greetings, Dominik

12 Mar 2016 srinivas balina  
08 Mar 2016 srinivas balina

Hi Brain..
Thank you very much for your work and making it available.
I would like to ask you if it is possible to separate instrumental tracks in songs by ICA.And what is the best way to separate vocals and music in audio signal.
Thanks in advance...

31 Jan 2016 Brian Moore

Brian Moore (view profile)

@Andrew: You can certainly use ICA for image analysis, but it sounds like you're embarking on some kind of research project, so I think it's in your best interest to learn how to do it yourself. ICA is a classical technique and there's tons of literature about on the web just a Google search away. If you decide to use my ICA code, feel free to read the function documentation, which will tell you all you need to know about how to use it.

Comment only
30 Jan 2016 Andrew

Andrew (view profile)

Hi Brain.
I would like to ask you if it is possible to use your code for image analysis.
What i want to do is to analyze a group of medical images, taken at the same position of the body, for different time points.
I 'm trying to Analise these images and to separate the background from the original image. After that i would like to reduce the motion of that part of the body, because of breathing.

Comment only
31 Dec 2015 Brian Moore

Brian Moore (view profile)

@Rajkumar: Say you have d "mixed speech" waveforms, each with n samples, and you're trying to recover the r underlying speech waveforms. Put the mixed speech waveforms into a d x n matrix Z and execute the following

Zica = myICA(Z,r);
Zpca = myPCA(Z,r);

Read the documentation of the two functions to see what is returned.

Note: PCA probably won't work for the speech separation problem; this classic problem is often used as motivation for doing ICA.

Comment only

thanks for your valuable program sir...

how can be the ICA and PCA analysis techniques can be used for speech detection from multiple mixed speech signals please help me sir

thanks in advance sir...

18 Sep 2015 Brian Moore

Brian Moore (view profile)

@Renan: You're welcome to modify and redistribute the code- just follow the rules of the BSD license included in the download.

As for the details of switching kurtosis, I'll leave that to you. To get started, I recommend the following paper:

Comment only
18 Sep 2015 Renan de Padua

Hi Brian.

I want to use the Kurtosis measure, instead of gaussian neg.

First question: Can I change your code? I will not redistribute or anything else, just want to see the difference between the measures
Second question: Can you guide me? I will use the kurtosis function, available on matlab. How can I adjust that?

Thanks for sharing!

21 Aug 2015 Darin McCoy

Hi Brian,

This is probably a really simple request, but Matlab's version of PCA comes in a format

[coeff, score, latent, tsquared, explained, mu] = pca(x)

[Zpca T U mu eigVecs] = myPCA(x,r)

.... how can you get the same results from your version as Matlab's pca?

Comment only
19 Aug 2015 Brian Moore

Brian Moore (view profile)

@Fransesco: ICA is definitely a good approach to decomposing a signal into independent spectra. The non-zero mean of your Lorentzian component is okay: the first step of the ICA algorithm is to center the data (subtract the mean), so, if it is working correctly, ICA should output a zero-mean version of the Lorentzian with no problem.

As I mentioned to Hamed, my implementation of ICA will fail in the presence of Gaussians because its objective function is to minimize the Gaussianity of the output components. Clearly this won't work when one of the output components IS Gaussian.

Here's a few other (probably better) ICA packages I recommend trying:

Comment only
17 Aug 2015 Francesco Lelj

Hi Brian,
thank you very much for your work and making it available.
I am trying to use your routines to anylize the data of some Uv-Vis spectra of a mixture of chemical compounds. I know that in the mixture I have consists of more than 1 compound but less than the number of different spectra I have at different temperature. I have checked the routines against syntetic data I generated modifying demo_ICA using linear combination of gaussian functions, trigoneometric functions and lorentzian functions before using the routines. I have observed that whereas the trigonometric functions are very well identified those of Lorentzian and gaussian are not. I have read in the answers to Hamed that this is because gaussian can be expressed as sum of other gausssians but this is not true with lorentzian. Even with my real data whent I try to analyze them I get negative "spectra" as independent component which is physically impossible. I am not an expert in signal processing and I got the impressin that part of the problem might be the fact that my signals do not have null average as sine and cosine have. Cause I have read that the method has been used also in the study in medicine in EEC and EEG I thought it was a really good method also for analysing superimposition of independent spectra.
Thank you in advance for any advice.
Best regards

06 Aug 2015 Brian Moore

Brian Moore (view profile)

@Hamed: The sum of independent Gaussians is again a Gaussian, so your question is ill-posed: there's not a unique decomposition.

Even if only one component was Gaussian, ICA still wouldn't work: its objective function is to *minimize* Gaussianity of the output signals.

Comment only
06 Aug 2015 Brian Moore

Brian Moore (view profile)

@Yeon-Mo Yang: Sorry for the delay... rng() is a built-in MATLAB function that controls its random number generator. I use it to set the generator seed so the results of myICA() are deterministic, but this isn't necessary. You can comment out the line, its harmless.

rng() was introduced around 2010, so you must have a pre-2010 MATLAB distribution...

Comment only
08 Jul 2015 Hamed

Hamed (view profile)

Hello Again,

I forgot to mention that that noises are independent.

Comment only
08 Jul 2015 Hamed

Hamed (view profile)

Hello Brian,

I've a summation of two normal distribution and I was wondering if I decompose it into its two normal components using ICA.
Thank you.

Comment only
29 Jun 2015 AJ Zadeh

Hi dear Brian And thanks so much for your code !
I loaded tow speech signals with Fs=16000 and n=16000 (1sec) then mixed together using random mixing matrix. I gave mixed signals as input data (Z) and r=2 , d=2 but separated sources was not good at all ! and estimated mixing matrix(A) was not correct too.
I need your help !

Comment only
25 Jun 2015 zhou zexun

Thanks for your code, it is very helpful for me!

25 Jun 2015 Yeon-Mo Yang

What is rng(1) in the first line of the source code demo.
When I tried it failed. I need your help. Thanks!

11 Apr 2015 Brian Moore

Brian Moore (view profile)

@Thomas: The ICA weights have to be decorrelated so they don't converge to the same values. My code implements the FastICA algorithm (reference in multiple previous comments on this package); please refer to the literature for details about the ICA algorithm.

The only restriction on the code is computational complexity: if the code completes, everything should be fine.

In the example you gave, the input to myPCA/myICA needs to be 3 x 300000 matrix: the columns should contain samples

Comment only
11 Apr 2015 Thomas Sim

Sorry again brian, hope i am not bothering you. May i know why do you need to de-correlate your weight matrix after the 4 steps ICA implementation?

Also just to check are there any restriction for your algorithm to work on large datasets?

Because i tried replacing your MVG samples with a 300000x3 matrices of 3 sound signals in ICA_demo and replaced maxsamples in myICA to my matrices length.

The results of the ica de-mixing is very poor for both with or without pca.

On contrary, the result of your pca is reasonable on large datasets.

Comment only
10 Apr 2015 Brian Moore

Brian Moore (view profile)

@Thomas: The principal components (PCs) produced by SVD are orthogonal, but they aren't unit norm, so the second whitening step is just normalizing each component to have unit norm (i.e., T is a diagonal matrix).

In many applications, the PC *magnitudes* are important; in those cases, you'll want to comment out the second whitening step.

I chose to normalize the PCs by default to allow direct comparison with ICA, which, by construction, returns unit norm components.

Hope this helps!

Comment only
10 Apr 2015 Thomas Sim

Hi Brian, hope this isn't a stupid question, mind if i ask for pca, you center and svd the data once but why do you need to whiten it again with svd? It's like doing it twice?

Comment only
02 Apr 2015 Brian Moore

Brian Moore (view profile)

@zongbao: not possible. A quote from Wikipedia:

"In general, ICA cannot identify the actual number of source signals, a uniquely correct ordering of the source signals, nor the proper scaling (including sign) of the source signals."

The first step of ICA is centering + whitening, which removes magnitude information. Indeed, after running myICA(), you can check that cov(z_ic') == identity matrix

Comment only
01 Apr 2015 zongbao

How can I find z_ic that corresponds to a specific eigenvalue and eigenvector? I mean I want to find the independent component that corresponds to the maximum eigenvalue. Thank you.

27 Mar 2015 TS Sharma

Thanks Brian!

Comment only
26 Mar 2015 Brian Moore

Brian Moore (view profile)

@TS Sharma: Yes, the columns of z_ic_new are the ICA coefficients corresponding to each input sample (i.e., columns of z_new)

Comment only
26 Mar 2015 TS Sharma

Referring to your reply to my previous comment, I think it is z_ic_new.

Comment only
26 Mar 2015 TS Sharma

Another basic question. May I know which matrix contains ICA coefficients?

Comment only
25 Mar 2015 TS Sharma

Thanks a lot Brian!! That was very kind of you.

Comment only
25 Mar 2015 Brian Moore

Brian Moore (view profile)

@TS Sharma: First, call myICA() like this:

[z_ic A T mean_z] = myICA(z,NUM);

Then, if new data is in (m x 1) vector z_new, you can project into ICA space by computing

z_ic_new = A * T * (z_new - mean_z);

Here, (A,T,mean_z) are the ICA transformation matrix, whitening matrix, and mean vector, respectively, of the original data

Comment only
25 Mar 2015 TS Sharma

Thank you for the code. This is a rather basic question. How do I project new (test) data onto the newly found ICA basis?

Comment only
12 Mar 2015 pratika tiwari

can we use this ica package for feature extraction??? kindly do reply.

Comment only
02 Mar 2015 pratika tiwari  
02 Mar 2015 pratika tiwari

thanks for the package..!! but how to use for speech signal?? i am too new to this field. that's why i need your guidance .

Comment only
01 Mar 2015 Bicheng Ying

Thanks for this nice package.

29 Jan 2015 Keven Laboy

That makes perfect sense. Thank you for the quick response!

29 Jan 2015 Brian Moore

Brian Moore (view profile)

@Kevin: The mixing matrix is "A", which is indeed orthogonal. Check out the last line of myICA.m: you'll see the statement

z_ic = A * z_cw;

which computes the independent components (z_ic) from the whitened input data (z_cw).

The matrix "T \ pinv(A)" transforms the independent components back to the original (unwhitened) domain, but it's not, in general, orthogonal because the whitening matrix "T" isn't orthogonal. Hope this helps!

Comment only
29 Jan 2015 Keven Laboy

Thank you for the package.

One question, to my understanding if you whiten and center the data, the matrix containing the basis functions (mixing matrix) should be orthogonal. When I run your code and get the mixing matrix by doing T \ pinv(A) (as specified in myICA.m) the resulting matrix is not orthogonal. Is this an issue?

Thank you in advance

Comment only
21 Jan 2015 Marthed

Thanks For providing this package ^_^

09 Jan 2015 M'hamed

How to select the best dimension of this best algorithm ICA ?

Comment only
09 Dec 2014 Peixin

Peixin (view profile)

06 Dec 2014 Brian Moore

Brian Moore (view profile)

@LA_2012: Please read my previous comments for more detail about my implementation of ICA and references to the literature.

To apply ICA to images, you could vectorize each image and store them as columns of the input matrix

Comment only
06 Dec 2014 LA_2012

Is this fixed point ICA or some other ?
How to use it on images?
How to modify the code according to requirement of some other ICA?
From where can I get reference data to understand ICA in a better way?
Thanks in advance :)
Thanks for providing this code

25 Nov 2014 Guilherme

Thanks for sharing your code Mr. Moore.

I'm using part of your ICA function to implement my own.
I wasn't get good results using the weight vector decorrelation as you did so I implemented the Gram-Schmidt decorrelation as proposed by Hyvarinen pq 15 (

Here is the code I inserted right after the weight normalization, still inside the for loop:

% Gram-Schmidt
summ = zeros(nOb,1);
for j = 1 : p-1
wj = W(j,:)';
summ = summ + W(p,:)*wj*wj;
W(p,:) = W(p,:) - summ';
W(p,:) = W(p,:)/norm(W(p,:));

25 Sep 2014 Body

Body (view profile)

I get it. Thanks a lot!

25 Sep 2014 Brian Moore

Brian Moore (view profile)

@Susanne: The for-loop containing "gp" is implementing steps 2-3 of the algorithm on pg. 14. So, in particular, gp is related to g'(.) from the document

Comment only
24 Sep 2014 Susanne

Ok I think I got a better grasp on it thank you so much your code helped me a lot. I got on question though, what is the 'gp' you're using in your ICA code?

23 Sep 2014 Brian Moore

Brian Moore (view profile)

@Body: No, the "err" vector keeps track of the how much the weight vectors are changing at each iteration (used only for stopping criterion). If you do as you suggested, you'll just be measuring how close each weight vector is to being unit norm (*spoiler alert* - they're always unit norm)

Comment only
23 Sep 2014 Brian Moore

Brian Moore (view profile)

@Susanne: I suggest you read the help info and comments in the myICA.m function. If you need more detail, you'll need to refer to the literature. For example I recommend

FYI: I'm using the 4-step algorithm on pg. 14 along with the symmetric decorrelation step involving the W matrix from (45) on pg. 15

Comment only
23 Sep 2014 Body

Body (view profile)

Thanks for your code! ICA is so difficult to understand. U help a lot! I have a question. Can
err(i) = 1 - w(i,:) * w(i,:)'
instead of
err(i) = 1 - w(i,:) * w_old(i,:)'?

23 Sep 2014 Susanne

I have some trouble to completely understand the ICA algorithmen zou are using. Would you mind heling me out?

Comment only
17 Sep 2014 Susanne  
16 Sep 2014 Brian Moore

Brian Moore (view profile)

@LionelB: Good point - I'm disappointed in myself for not mentioning bsxfun() - its MATLAB's secret sauce for vectorizing

Comment only
16 Sep 2014 LionelB

Or even

z0 = bsxfun(@minus,z,sum(z,2)/n);
R = (z0*z0')/(n-1);

:) (Matlab uses this idiom internally, see e.g. the 'cov' function).

Anyhow, I only raised this because I almost gave up on your fine package until I realised the fix was trivial.

Comment only
16 Sep 2014 Brian Moore

Brian Moore (view profile)

@LionelB: Yes, I'm aware. I wrote the MATLAB code so it could be directly translated into, e.g., C++. If I were maximizing MATLAB efficiency, I would have written

z0 = (1 / (n - 1)) * (z - repmat(mean_z,[1 n]));
R = z0 * z0';

instead of

z0 = z - mean_z*ones(1,n);
R = (z0*z0') / (n-1);

Comment only
16 Sep 2014 LionelB

Nice package, but the 'myWhiten' routine is a terrible computational bottleneck for large datasets, because of the inefficient sample covariance calculation. This can easily be done without looping: simply de-mean z, e.g.:

z0 = z-mean_z*ones(1,n);

then calculate

R = (z0*z0')/(n-1)

Delivers an orders-of-magnitude speedup.

Comment only
30 Jul 2014 Ilia

Ilia (view profile)

yes i was talking about z_LD, i was afraid that it could be a too strong restriction for the solution, because I'm not really expert with this kind of analysis, and it is the first time i find the possibility to compute the low dimension z.
Thanks you for the answer.

29 Jul 2014 Brian Moore

Brian Moore (view profile)

@Ilia: Are you referring to "z_LD" as described in the myICA help? It's a matrix of the same size as the input "z" that approximates z from a linear combination of the "NUM" independent components in output "z_ic"

Comment only
29 Jul 2014 Ilia

Ilia (view profile)

thank you for sharing the great job,I've a question.
I'm using your 'myICA.m' and i would like to understand what ICA_LD really do.
can you suggest some lecture about it or explain in few words,if it is possible.
thanks you very much

04 Jun 2014 Brian Moore

Brian Moore (view profile)

@Prarinya Check out the FastICA algorithm from

I'm using the 4-step algorithm on pg. 14 along with the symmetric decorrelation step involving the W matrix from (45) on pg. 15


Comment only
04 Jun 2014 Prarinya Siritanawan

Great contribution. I have a request

Could you please tell us the reference of the algorithm you use? I'm curious about the update rule for creating the transformation matrix 'A' in the code.

I tried to study the ICA algorithm from the ground. However update rule you use are quite different from the document I have. Therefore, I realized that there are more than one approach to update the transformation matrix.

Comment only
22 Apr 2014 Arjuna

Arjuna (view profile)

better than FastIca ;)

27 Feb 2014 Bosi

Bosi (view profile)

09 Jan 2014 Abdelrahman

super good

14 Nov 2013 Brian Moore

Brian Moore (view profile)


You must be using an old version of MATLAB (<= 2009b I believe) where ~ is not supported. Please replace the line with something like

[U,S,temp] = svd(w,'econ');

The variable temp is not used; the name is arbitrary

Comment only
14 Nov 2013 Pierre-Pascal

Hi thanks for providing this code! super useful.

line 97 on my ICA returns an error
[U,S,~] = svd(w,'econ');
it seems to hate the ~ as an output

did I miss something?

Comment only
30 Aug 2013 Morteza Abdollahzadeh  
12 Jul 2013 Shivakumar

Sir, there is an error at line 31. Can you please solve that? thank you for providing this file.

01 Jul 2013 Jim

Jim (view profile)

07 Nov 2012 Wang

Wang (view profile)

15 Oct 2012 Brian Moore

Brian Moore (view profile)

Why the 2 star ranking Eugene? Please provide feedback so I can improve this package!

Comment only
15 Oct 2012 Eugene

Eugene (view profile)

09 Oct 2012 1.1

Fixing bug in myMultiGaussian(). Needed to use lower triangular Cholesky factorization, not the upper triangular version.

05 Nov 2012 1.2

Updating myPCA() documentation

26 Apr 2015 1.3

Improving documentation and code performance

26 Apr 2015 1.4

Uploading .zip (omitted in last update)

Contact us