4.66667
4.7 | 21 ratings Rate this file 371 Downloads (last 30 days) File Size: 398 KB File ID: #38300
image thumbnail

PCA and ICA Package

by

Brian Moore (view profile)

 

24 Sep 2012 (Updated )

Implements Principal Component Analysis (PCA) and Independent Component Analysis (ICA).

| Watch this File

File Information
Description

This package contains functions that implement Principal Component Analysis (PCA) and its lesser known cousin, Independent Component Analysis (ICA).

PCA and ICA are implemented as functions in this package, and multiple examples are included to demonstrate their use.

In PCA, multi-dimensional data is projected onto the singular vectors corresponding to a few of its largest singular values. Such an operation effectively decomposes the input single into orthogonal components in the directions of largest variance in the data. As a result, PCA is often used in dimensionality reduction applications, where performing PCA yields a low-dimensional representation of data that can be reversed to closely reconstruct the original data.

In ICA, multi-dimensional data is decomposed into components that are maximally independent in the negentropy sense. ICA differs from PCA in that the low-dimensional signals do not necessarily correspond to the directions of maximum variance; rather, the ICA components have maximal statistical independence. In practice, ICA can often uncover disjoint underlying trends in multi-dimensional data.

Required Products MATLAB
MATLAB release MATLAB 7.8 (R2009a)
Tags for This File   Please login to tag files.
Please login to add a comment or rating.
Comments and Ratings (48)
27 Mar 2015 TS Sharma

Thanks Brian!

Comment only
26 Mar 2015 Brian Moore

Brian Moore (view profile)

@TS Sharma: Yes, the columns of z_ic_new are the ICA coefficients corresponding to each input sample (i.e., columns of z_new)

Comment only
26 Mar 2015 TS Sharma

Referring to your reply to my previous comment, I think it is z_ic_new.

Comment only
26 Mar 2015 TS Sharma

Another basic question. May I know which matrix contains ICA coefficients?

Comment only
25 Mar 2015 TS Sharma

Thanks a lot Brian!! That was very kind of you.

Comment only
25 Mar 2015 Brian Moore

Brian Moore (view profile)

@TS Sharma: First, call myICA() like this:

[z_ic A T mean_z] = myICA(z,NUM);

Then, if new data is in (m x 1) vector z_new, you can project into ICA space by computing

z_ic_new = A * T * (z_new - mean_z);

Here, (A,T,mean_z) are the ICA transformation matrix, whitening matrix, and mean vector, respectively, of the original data

Comment only
25 Mar 2015 TS Sharma

Thank you for the code. This is a rather basic question. How do I project new (test) data onto the newly found ICA basis?

Comment only
12 Mar 2015 pratika tiwari

can we use this ica package for feature extraction??? kindly do reply.

Comment only
02 Mar 2015 pratika tiwari  
02 Mar 2015 pratika tiwari

thanks for the package..!! but how to use for speech signal?? i am too new to this field. that's why i need your guidance .

Comment only
01 Mar 2015 Bicheng Ying

Thanks for this nice package.

29 Jan 2015 Keven Laboy

That makes perfect sense. Thank you for the quick response!

29 Jan 2015 Brian Moore

Brian Moore (view profile)

@Kevin: The mixing matrix is "A", which is indeed orthogonal. Check out the last line of myICA.m: you'll see the statement

z_ic = A * z_cw;

which computes the independent components (z_ic) from the whitened input data (z_cw).

The matrix "T \ pinv(A)" transforms the independent components back to the original (unwhitened) domain, but it's not, in general, orthogonal because the whitening matrix "T" isn't orthogonal. Hope this helps!

Comment only
29 Jan 2015 Keven Laboy

Thank you for the package.

One question, to my understanding if you whiten and center the data, the matrix containing the basis functions (mixing matrix) should be orthogonal. When I run your code and get the mixing matrix by doing T \ pinv(A) (as specified in myICA.m) the resulting matrix is not orthogonal. Is this an issue?

Thank you in advance

Comment only
21 Jan 2015 Marthed

Thanks For providing this package ^_^

09 Jan 2015 M'hamed

How to select the best dimension of this best algorithm ICA ?
Thank's

Comment only
09 Dec 2014 Peixin

Peixin (view profile)

 
06 Dec 2014 Brian Moore

Brian Moore (view profile)

@LA_2012: Please read my previous comments for more detail about my implementation of ICA and references to the literature.

To apply ICA to images, you could vectorize each image and store them as columns of the input matrix

Comment only
06 Dec 2014 LA_2012

Is this fixed point ICA or some other ?
How to use it on images?
How to modify the code according to requirement of some other ICA?
From where can I get reference data to understand ICA in a better way?
Thanks in advance :)
Thanks for providing this code

25 Nov 2014 Guilherme

Thanks for sharing your code Mr. Moore.

I'm using part of your ICA function to implement my own.
I wasn't get good results using the weight vector decorrelation as you did so I implemented the Gram-Schmidt decorrelation as proposed by Hyvarinen pq 15 (http://mlsp.cs.cmu.edu/courses/fall2012/lectures/ICA_Hyvarinen.pdf).

Here is the code I inserted right after the weight normalization, still inside the for loop:

% Gram-Schmidt
summ = zeros(nOb,1);
for j = 1 : p-1
wj = W(j,:)';
summ = summ + W(p,:)*wj*wj;
end
W(p,:) = W(p,:) - summ';
W(p,:) = W(p,:)/norm(W(p,:));

25 Sep 2014 Body

Body (view profile)

I get it. Thanks a lot!

25 Sep 2014 Brian Moore

Brian Moore (view profile)

@Susanne: The for-loop containing "gp" is implementing steps 2-3 of the algorithm on pg. 14. So, in particular, gp is related to g'(.) from the document

Comment only
24 Sep 2014 Susanne

Ok I think I got a better grasp on it thank you so much your code helped me a lot. I got on question though, what is the 'gp' you're using in your ICA code?

23 Sep 2014 Brian Moore

Brian Moore (view profile)

@Body: No, the "err" vector keeps track of the how much the weight vectors are changing at each iteration (used only for stopping criterion). If you do as you suggested, you'll just be measuring how close each weight vector is to being unit norm (*spoiler alert* - they're always unit norm)

Comment only
23 Sep 2014 Brian Moore

Brian Moore (view profile)

@Susanne: I suggest you read the help info and comments in the myICA.m function. If you need more detail, you'll need to refer to the literature. For example I recommend

http://mlsp.cs.cmu.edu/courses/fall2012/lectures/ICA_Hyvarinen.pdf

FYI: I'm using the 4-step algorithm on pg. 14 along with the symmetric decorrelation step involving the W matrix from (45) on pg. 15

Comment only
23 Sep 2014 Body

Body (view profile)

Thanks for your code! ICA is so difficult to understand. U help a lot! I have a question. Can
err(i) = 1 - w(i,:) * w(i,:)'
instead of
err(i) = 1 - w(i,:) * w_old(i,:)'?

23 Sep 2014 Susanne

I have some trouble to completely understand the ICA algorithmen zou are using. Would you mind heling me out?

Comment only
17 Sep 2014 Susanne  
16 Sep 2014 Brian Moore

Brian Moore (view profile)

@LionelB: Good point - I'm disappointed in myself for not mentioning bsxfun() - its MATLAB's secret sauce for vectorizing

Comment only
16 Sep 2014 LionelB

Or even

z0 = bsxfun(@minus,z,sum(z,2)/n);
R = (z0*z0')/(n-1);

:) (Matlab uses this idiom internally, see e.g. the 'cov' function).

Anyhow, I only raised this because I almost gave up on your fine package until I realised the fix was trivial.

Comment only
16 Sep 2014 Brian Moore

Brian Moore (view profile)

@LionelB: Yes, I'm aware. I wrote the MATLAB code so it could be directly translated into, e.g., C++. If I were maximizing MATLAB efficiency, I would have written

z0 = (1 / (n - 1)) * (z - repmat(mean_z,[1 n]));
R = z0 * z0';

instead of

z0 = z - mean_z*ones(1,n);
R = (z0*z0') / (n-1);

Comment only
16 Sep 2014 LionelB

Nice package, but the 'myWhiten' routine is a terrible computational bottleneck for large datasets, because of the inefficient sample covariance calculation. This can easily be done without looping: simply de-mean z, e.g.:

z0 = z-mean_z*ones(1,n);

then calculate

R = (z0*z0')/(n-1)

Delivers an orders-of-magnitude speedup.

Comment only
30 Jul 2014 Ilia

Ilia (view profile)

yes i was talking about z_LD, i was afraid that it could be a too strong restriction for the solution, because I'm not really expert with this kind of analysis, and it is the first time i find the possibility to compute the low dimension z.
Thanks you for the answer.

29 Jul 2014 Brian Moore

Brian Moore (view profile)

@Ilia: Are you referring to "z_LD" as described in the myICA help? It's a matrix of the same size as the input "z" that approximates z from a linear combination of the "NUM" independent components in output "z_ic"

Comment only
29 Jul 2014 Ilia

Ilia (view profile)

thank you for sharing the great job,I've a question.
I'm using your 'myICA.m' and i would like to understand what ICA_LD really do.
can you suggest some lecture about it or explain in few words,if it is possible.
thanks you very much

04 Jun 2014 Brian Moore

Brian Moore (view profile)

@Prarinya Check out the FastICA algorithm from

http://mlsp.cs.cmu.edu/courses/fall2012/lectures/ICA_Hyvarinen.pdf

I'm using the 4-step algorithm on pg. 14 along with the symmetric decorrelation step involving the W matrix from (45) on pg. 15

Cheers!

Comment only
04 Jun 2014 Prarinya Siritanawan

Great contribution. I have a request

Could you please tell us the reference of the algorithm you use? I'm curious about the update rule for creating the transformation matrix 'A' in the code.

I tried to study the ICA algorithm from the ground. However update rule you use are quite different from the document I have. Therefore, I realized that there are more than one approach to update the transformation matrix.

Comment only
22 Apr 2014 Arjuna

Arjuna (view profile)

better than FastIca ;)

27 Feb 2014 Bosi

Bosi (view profile)

 
09 Jan 2014 Abdelrahman

super good

14 Nov 2013 Brian Moore

Brian Moore (view profile)

Pierre,

You must be using an old version of MATLAB (<= 2009b I believe) where ~ is not supported. Please replace the line with something like

[U,S,temp] = svd(w,'econ');

The variable temp is not used; the name is arbitrary

Comment only
14 Nov 2013 Pierre-Pascal

Hi thanks for providing this code! super useful.

line 97 on my ICA returns an error
[U,S,~] = svd(w,'econ');
it seems to hate the ~ as an output

did I miss something?

Comment only
30 Aug 2013 Morteza Abdollahzadeh  
12 Jul 2013 Shivakumar

Sir, there is an error at line 31. Can you please solve that? thank you for providing this file.

01 Jul 2013 Jim

Jim (view profile)

 
07 Nov 2012 Wang

Wang (view profile)

 
15 Oct 2012 Brian Moore

Brian Moore (view profile)

Why the 2 star ranking Eugene? Please provide feedback so I can improve this package!

Comment only
15 Oct 2012 Eugene

Eugene (view profile)

 
Updates
09 Oct 2012

Fixing bug in myMultiGaussian(). Needed to use lower triangular Cholesky factorization, not the upper triangular version.

05 Nov 2012

Updating myPCA() documentation

Contact us