Transferring any point in PC space to original space

Dear experts,
I have a difficult question for you. I have a dataset with 6 variables and 27 cases. I ran PCA and plotted the result. Afterwards I drew a circle that contains 95% of the points (the circle itself is not important here). Next, I created 8 new points (A-D and W-Z), as you can see in the following image. Now I want to do a PCA reconstruction for these 8 points, because I want to know what values the original variables take at these points.
I would be very glad if you could tell me how I can handle this problem. Thanks in advance.
To make it clear once more: I started with 6 variables and then extracted 2 PCs. Now I have 8 new points, and I need to know what values the 6 variables take for them. I hope this is possible.
Edit: I have already found a formula that seems related, but to be honest I can't quite tell how to apply it in my case.
Formula I found:
PCA reconstruction = PC scores * Eigenvectors' + Mean
Kind Regards TG
  5 Comments
Image Analyst on 16 Sep 2021
We don't need your trusted data. Can you make up some generic, non-proprietary data and attach that? And we're still not sure what you want. OK, so you have 4 variables and 27 observations. So what do you want to know? Do you just want 6 PC variables? If so, why -- what are you going to do with them? Or do you want to model the data and use the 6 variables to predict some kind of output value?
Tom on 17 Sep 2021
Okay, I'll try, but I think @the cyclist has already almost got it right.
I'll start with my data:
dataset = readtable(Exampledata); % Exampledata: file name of the spreadsheet (not shown)
data = table2array(dataset(:,4:9)); % 27x6 matrix: 27 observations of 6 variables
data = data - mean(data); % de-mean each column (so the mu returned by pca is ~0)
[coeff, score, ~, ~, explained, mu] = pca(data);
figure; % now I'm plotting my data
hold on;
plot1 = plot(score(:,1), score(:,2), 'r.');
set(plot1, 'MarkerSize', 16);
widthandheight_cosy = 25;
set(gca, 'XLim', [-widthandheight_cosy, widthandheight_cosy], 'YLim', [-widthandheight_cosy, widthandheight_cosy], 'Box', 'on');
axis square;
% then I plot the circle, but that is not relevant here
XtremeW = [radius_circle 0]; %plotting the extreme points
plot(XtremeW(:,1),XtremeW(:,2),'*black');
text(radius_circle,0,' W');
XtremeY = [-radius_circle 0];
plot(XtremeY(:,1),XtremeY(:,2),'*black');
text(-radius_circle,0,' Y');
XtremeX = [0 radius_circle];
plot(XtremeX(:,1),XtremeX(:,2),'*black');
text(0,radius_circle,' X');
XtremeZ = [0 -radius_circle];
plot(XtremeZ(:,1),XtremeZ(:,2),'*black');
text(0,-radius_circle,' Z');
XtremeA = [XandYkoord XandYkoord];
plot(XtremeA(:,1),XtremeA(:,2),'*black');
text(XandYkoord,XandYkoord,' A');
XtremeB = [-XandYkoord XandYkoord];
plot(XtremeB(:,1),XtremeB(:,2),'*black');
text(-XandYkoord,XandYkoord,' B');
XtremeC = [XandYkoord -XandYkoord];
plot(XtremeC(:,1),XtremeC(:,2),'*black');
text(XandYkoord,-XandYkoord,' C');
XtremeD = [-XandYkoord -XandYkoord];
plot(XtremeD(:,1),XtremeD(:,2),'*black');
text(-XandYkoord,-XandYkoord,' D');
So that is basically everything important in my script. Now I would like to get the values of the original variables 1 to 6 (from the table I loaded at the beginning) for all 8 new points, XtremeA to XtremeZ. In other words, I want a new table with the 8 points as observations and the 6 original variables as columns, with a value for each variable at each point. I hope it makes sense now.
I will attach an Excel document that looks similar to the one I used.


Accepted Answer

the cyclist on 16 Sep 2021
Borrowing the first few lines of code from my PCA tutorial ...
rng 'default'
M = 7; % Number of observations
N = 5; % Number of variables observed
% Made-up data
X = rand(M,N);
% De-mean (MATLAB will de-mean inside of PCA, but I want the de-meaned values later)
X = X - mean(X); % Use X = bsxfun(@minus,X,mean(X)) if you have an older version of MATLAB
% Do the PCA
[coeff,score,latent,~,explained] = pca(X);
As noted there, coeff transforms the data from the original space to the PC space:
dataInPrincipalComponentSpace = X*coeff;
If we have data in the principal component space, we can transform back to the original space like this:
X_again = dataInPrincipalComponentSpace*coeff'; % coeff is orthonormal, so coeff' equals inv(coeff); X_again will be the same as X (within floating point error)
That particular line of code will transform all of the original data points back from PC space to the original coordinates. Each row of dataInPrincipalComponentSpace is the coordinates of one of the original data points.
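The round trip works because the columns of coeff are orthonormal, so its transpose acts as its inverse. A minimal sketch to check this on made-up data of the same shape as above (the names orthoErr, X_back, roundTripErr and scoreErr are illustrative, not from the tutorial):

```matlab
rng('default')
X = rand(7,5);                   % made-up data: 7 observations, 5 variables
X = X - mean(X);                 % de-mean, as above
[coeff, score] = pca(X);

% coeff is orthonormal: coeff'*coeff should be the identity matrix
orthoErr = max(abs(coeff'*coeff - eye(size(coeff,2))), [], 'all');

% Round trip: original space -> PC space -> original space
X_back = (X*coeff)*coeff';
roundTripErr = max(abs(X_back - X), [], 'all');

% For de-meaned X, the score output equals X*coeff
scoreErr = max(abs(score - X*coeff), [], 'all');
```

All three error values should be at the level of floating point round-off.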
If you want to transform some other points, then just use those points' coordinates as rows. Here, I'll just choose those coordinates at random:
random_point_in_pc_space = rand(2,N); % Randomly chosen coordinates for two points in the 5-dimensional PC space
random_point_in_original_space = random_point_in_pc_space * coeff'; % Same random points, in the original coordinate system (coeff' equals inv(coeff) because coeff is orthonormal)
Instead of random points, you'll want to use the coordinates of your points (A, B, etc).
A wrinkle in your case is that your points are only specified by the first two PC dimensions, PC1 and PC2. So, your W could be
W = [17, 0, 0, 0, 0, 0]; % Coordinates of one possible W
but it could also be
W = [17, 0, 2, -3, 5, -7]; % Coordinates of a different possible W, with the same PC1 and PC2
In fact, an infinite number of points would project from your 6-dimensional space to your point W in PC coordinates, which means there are also an infinite number of data points from the original space that would transform to W.
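For the zero-padded choice (all remaining PCs set to 0), the back-transformation of several 2-D points at once could be sketched as follows. This assumes made-up 27-by-6 data, lets pca do the centering so that the returned mu can be added back, and uses a hypothetical radius r and illustrative names (pts2, ptsPC, ptsOriginal, T) that are not from the original script:

```matlab
rng('default')
data = rand(27,6);                        % stand-in for the real 27x6 data (assumption)
[coeff, score, ~, ~, ~, mu] = pca(data);  % pca centers internally; mu is the column mean

r = 2;                                    % hypothetical circle radius in PC1/PC2 units
pts2 = [ r 0; -r 0; 0 r; 0 -r ];          % W, Y, X, Z as [PC1 PC2] rows

% Pad PC3..PC6 with zeros, then map back and restore the mean
ptsPC = [pts2, zeros(size(pts2,1), size(coeff,2) - 2)];
ptsOriginal = ptsPC*coeff' + mu;          % 4x6: one row per point, one column per variable

% Equivalent shortcut using only the first two eigenvectors
ptsOriginal2 = pts2*coeff(:,1:2)' + mu;

% Optional: present the result as a table with the point names as row names
T = array2table(ptsOriginal, 'RowNames', {'W','Y','X','Z'});
```

If the data were de-meaned before calling pca, the returned mu is essentially zero, and the column means of the raw data would have to be added instead.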
I don't know your application, so I can't help you interpret the implications for you.
  3 Comments
Tom on 17 Sep 2021
Edited: Tom on 17 Sep 2021
I tried a few things and found that I was right: for my purposes it doesn't matter which values I use for the other four PCs. I need extreme cases in order to optimise my project, and I still get those extreme cases if I set all the other PCs to 0.
I think that solves my problem. I'll tag you if I find out that I was wrong.
THANKS a lot for your help @the cyclist
Is it possible to rate you somewhere?
the cyclist on 17 Sep 2021
I'm glad it worked out.
Accepting and upvoting answers is the way to "rate" contributors here. No other rating required. :-)


More Answers (1)

BOMMALA SILPA on 14 Dec 2021
Hello everyone,
I have a question about PCA. I'm working on EEG: I took EEG data, applied EEMD to get the IMFs, and then applied PCA to the IMFs.
[coeff,score,latent,~,explained] = pca(modos);
dataInPrincipalComponentSpace = modos*coeff;
X_again = dataInPrincipalComponentSpace*inv(coeff)';
For me, 2 or 3 PCs are enough to retrieve the original data. I have tried the last 2 lines above, but I can't get it to work. Please suggest how to do it.
  8 Comments
BOMMALA SILPA on 16 Dec 2021
I think my question was not clear to you.
I have an EEG signal, and I want to extract the EOG activity from it.
EEMD was performed on the contaminated EEG signal to get the IMFs. After performing PCA on the IMFs, we determined the principal components and arranged them in decreasing order of explained variance.
Only 2 or 3 PCs were sufficient to extract the EOG features from the data. This is what I have to do.
I have written the code like this:
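load('sc4002e0_recm.mat');
% EEMD
Nstd = 0.3*std(X);
NR = 100;
MaxIter = 10;
[modos, its] = eemd(X, Nstd, NR, MaxIter);
K = size(modos,1); % number of IMFs (assumed: one IMF per row of modos)
for i = 1:K
    IMF(:,i) = modos(i,:)'; % store each IMF as a column
end
%% PCA
[coeff, score, latent, tsquared, explained, mu] = pca(IMF); % mu (6th output) is needed for the reconstruction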
What, then, is the process for extracting the EOG from the EEG using 2 or 3 PCs? I have also tried this formula:
Re_IMF=score * coeff' + mu;
but I'm not getting the results
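For reference, mu is the sixth output of pca, and a reconstruction that uses all PCs simply returns the input. A minimal sketch of a reconstruction from only the first k components, assuming a made-up matrix in place of the IMFs (IMF_demo, k, Re_k, Re_full and fullErr are illustrative names; this shows only the generic truncated reconstruction, not an EOG-specific method):

```matlab
rng('default')
IMF_demo = randn(200,8);                    % stand-in: 200 samples x 8 IMFs (assumption)
[coeff, score, ~, ~, explained, mu] = pca(IMF_demo);

k = 3;                                      % number of PCs to keep
Re_k = score(:,1:k)*coeff(:,1:k)' + mu;     % rank-k approximation of IMF_demo

Re_full = score*coeff' + mu;                % all PCs: recovers IMF_demo
fullErr = max(abs(Re_full - IMF_demo), [], 'all');
```

Here sum(explained(1:k)) gives the percentage of total variance retained by the k components.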
the cyclist on 17 Dec 2021
Sorry, I don't know the answer to your question.


Release

R2019b
