Here's a random example I found online for what I want to do:

# Can I add 95% confidence ellipses around groups of data in a pca plot (biplot)?

I am plotting the results of a principal component analysis using biplot. Is there a way (in MATLAB) to add confidence ellipses around the groups of data? Maybe I have to use something other than biplot?

In the example below I would like to draw an ellipse around the data for each site (green and blue points).

Here's how I am currently plotting the data:

[coeff,score,latent,tsquared,explained] = pca(X);

f1=figure(1)

h= biplot(coeff(:,1:2),'scores',score(:,1:2),'color','k','marker','.','markersize',17,'varlabels',...

{'DO','O2sat','pH','Temp','Sal','Depth','PAR','|Velocity|'});

%color by site

hID = get(h,'tag'); %identify handle

hPt = h(strcmp(hID,'obsmarker')); %isolate handles to scatter points

grp = findgroups(site);

grpID = 1:max(grp);

clrMap = winter(length(unique(grp)));

for i = 1:max(grp)

set(hPt(grp==i), 'Color', clrMap(i,:), 'DisplayName', sprintf('MSP%d', grpID(i)))

end

title('Full Deployment (12h)');

set(gca,'fontsize',18)

xlabel('Component 1 (55.26%)') % how can I add percent ('%s %', explained(1))')

ylabel('Component 2 (21.87%)')

[~, unqIdx] = unique(grp);

legend(hPt(unqIdx))

### Accepted Answer

Adam Danz on 1 Oct 2020 (edited by Adam Danz on 4 Oct 2020)

If you know the center of each cluster and the CIs along the x and y axes for each cluster, you can use this function to plot the ellipses.

A comment below that answer points to an alternative solution as well.

Addendum

The block of code below applies the ellipse approach outlined in the first link above to your attached data. I also made a few changes to your version to clean it up.

Stars mark the center of each cluster using the mean (consider using the median instead, since the data are not normally distributed). The major and minor axes of each ellipse span the 95% CI along x and y, computed with the percentile method, which is a good choice given that the data are not normally distributed. Note that the resulting ellipses are always aligned with the plot axes.

load('data.mat')

[coeff,score,latent,tsquared,explained] = pca(X);

f1=figure();

h= biplot(coeff(:,1:2),'scores',score(:,1:2),'color','k','marker','.','markersize',17,'varlabels',...

{'DO','O2sat','pH','Temp','Sal','Depth','PAR','|Velocity|'});

hold on

%color by site

hID = get(h,'tag'); %identify handle

hPt = h(strcmp(hID,'obsmarker')); %isolate handles to scatter points

[grp, grpID] = findgroups(site);

clrMap = winter(numel(grpID));

p = 95; % CI level

for i = 1:numel(grpID) % loop over groups; numel is safe even if site IDs are not 1..N

set(hPt(grp==i), 'Color', clrMap(i,:), 'DisplayName', sprintf('MSP%d', grpID(i)))

% Compute centers (means)

allX = arrayfun(@(hh)hh.XData(1), hPt(grp==i));

allY = arrayfun(@(hh)hh.YData(1), hPt(grp==i));

centers(1) = mean(allX); % x mean

centers(2) = mean(allY); % y mean

% Plot centers, do they make sense?

plot(centers(1), centers(2), 'rp', 'MarkerFaceColor', clrMap(i,:), 'MarkerSize', 20, 'LineWidth', 1)

% Compute 95% CI using percentile method

CIx = prctile(allX, [(100-p)/2, p+(100-p)/2]); % x CI [left, right]

CIy = prctile(allY, [(100-p)/2, p+(100-p)/2]); % y CI [lower, upper]

CIrng(1) = CIx(2)-CIx(1); % CI range (x)

CIrng(2) = CIy(2)-CIy(1); % CI range (y)

% Draw ellipses

llc = [CIx(1), CIy(1)]; % (x,y) lower left corners

rectangle('Position',[llc,CIrng],'Curvature',[1,1], 'EdgeColor', clrMap(i,:));

end

title('Full Deployment (12h)');

set(gca,'fontsize',18)

xlabel('Component 1 (55.26%)') % how can I add percent ('%s %', explained(1))')

ylabel('Component 2 (21.87%)')

[~, unqIdx] = unique(grp);

legend(hPt(unqIdx))
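
On the question left in the xlabel comment: a minimal sketch of building the axis labels from `explained` instead of hard-coding the percentages. It assumes `explained` is the fifth output of `pca`, as in the code above; the two values used here are only stand-ins copied from the hard-coded labels, and the names `xLabelText` and `yLabelText` are made up for illustration.

```matlab
% Stand-in for the first two entries of explained from pca(X).
% These values are copied from the hard-coded labels above.
explained = [55.26; 21.87];

% In sprintf, '%%' prints a literal percent sign.
xLabelText = sprintf('Component 1 (%.2f%%)', explained(1));
yLabelText = sprintf('Component 2 (%.2f%%)', explained(2));

xlabel(xLabelText)
ylabel(yLabelText)
```

With the real `explained` vector, the two label lines can replace the hard-coded xlabel and ylabel calls directly.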

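The percentile ellipses above are always aligned with the plot axes. If tilted ellipses that follow the spread of each cluster are wanted, one commonly used option is a covariance-based ellipse. This is a different technique from the one in the answer, and it assumes each cluster is roughly bivariate normal, which may not hold for these data. The sketch below uses made-up points (`xy`) in place of one group's plotted coordinates; all variable names are illustrative. It relies on implicit expansion, so it assumes MATLAB R2016b or later.

```matlab
% Made-up stand-in for one group's plotted (x, y) coordinates, N-by-2.
rng(0);
xy = randn(200, 2) * [2 0.8; 0 0.5] + [1 -1];

mu = mean(xy, 1);   % cluster center (1-by-2)
C  = cov(xy);       % 2-by-2 sample covariance

% Squared radius: 95% quantile of a chi-square distribution with 2 degrees
% of freedom. For 2 dof this has the closed form -2*log(1-p) (about 5.99),
% the same value chi2inv(0.95, 2) returns.
p  = 0.95;
r2 = -2 * log(1 - p);

% Map points on the unit circle through the eigen-decomposition of C.
[V, D] = eig(C);
t = linspace(0, 2*pi, 200);
ellipse = mu + sqrt(r2) * (V * sqrt(D) * [cos(t); sin(t)])';

plot(xy(:,1), xy(:,2), '.'); hold on
plot(ellipse(:,1), ellipse(:,2), '-', 'LineWidth', 1.5)
plot(mu(1), mu(2), 'p', 'MarkerSize', 12)
axis equal
```

To try this inside the loop above, `xy` could be built as `[allX(:), allY(:)]` so that the ellipse uses the coordinates actually plotted by biplot, as the answer does by reading `XData` and `YData`. Under the normality assumption the ellipse is expected to enclose about 95% of a group's points; it is not a confidence region for the group mean.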