MATLAB Answers

Defining the 95% of data which are around the mean value

Asked by Giorgos Papakonstantinou on 31 Jul 2013

For a given set of data, how can I define which of those correspond to the 95% of the data which are around the mean value?





3 Answers

Answer by Jan Simon
on 1 Aug 2013
Edited by Jan Simon
on 1 Aug 2013
 Accepted answer
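x = rand(1, 1000) - 0.5;             % example data
m = mean(x);
dist = abs(x - m);                   % distance of each point from the mean
[sortDist, sortIndex] = sort(dist);  % ascending: closest points first
index_95perc = sortIndex(1:floor(0.95 * numel(x)));  % indices of the closest 95%
x_95percent = x(index_95perc);       % selected values, ordered by distance from the mean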

  1 Comment

Thank you, Jan. It was easier than I expected. Before your answer I was doing the following:

[CdfY,CdfX] = ecdf(vals,'Function','cdf');  % compute empirical function

where vals is my dataset.

Answer by Image Analyst
on 31 Jul 2013

I'd sort the data using sort(), then use cumsum() to get the CDF. Normalize the CDF, then use find() to locate the elements (values) at the 2.5% and 97.5% points, which is where the data you want starts and stops. It's pretty easy, but let me know if you can't figure it out.
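
The steps above can be sketched roughly as follows. This is only one reading of the answer: it assumes the cumsum is taken over equal weights of 1/N per point rather than over the data values themselves, and the example data and variable names are made up for illustration. Note that this selects the central 95% by rank (between the 2.5% and 97.5% points), which is not the same as the 95% of points closest to the mean when the data are skewed.

```matlab
% Sketch of the sort / cumsum / find approach, assuming weight 1/N per point.
x = randn(1, 1000);                     % example data (assumed); may be negative
N = numel(x);
xs = sort(x);                           % ascending order
cdfVals = cumsum(ones(1, N)) / N;       % empirical CDF: 1/N, 2/N, ..., 1
iLow  = find(cdfVals >= 0.025, 1, 'first');  % first sorted element at or above 2.5%
iHigh = find(cdfVals <= 0.975, 1, 'last');   % last sorted element at or below 97.5%
x_central = xs(iLow:iHigh);             % sorted values between the two cut-offs
```

Because the CDF is built from the ranks of the sorted points and not from their values, the sign of the data does not matter here.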


Answer by Giorgos Papakonstantinou on 31 Jul 2013

Thank you for your answer, Image Analyst. The data also contain negative values. I am not sure, but I think that poses a problem when I normalize the data after the cumsum.

  1 Comment

Tom Lane
on 1 Aug 2013

It sounds like Image Analyst is talking about the cumsum of a vector that assigns probability 1/N to each of N points. However, you could take the 0.025*N and 0.975*N values from the sorted vector directly, converting the index to an integer as you see fit.
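
A minimal sketch of taking the cut-offs directly from the sorted vector might look like this. The example data, the variable names, and the use of round() with clamping to convert the indices to integers are all assumptions; other rounding choices are equally defensible.

```matlab
% Sketch: read the 2.5% and 97.5% cut-offs straight from the sorted data.
x = randn(1, 1000);                     % example data (assumed); may be negative
N = numel(x);
xs = sort(x);                           % ascending order
iLow  = max(1, round(0.025 * N));       % index of the lower cut-off (rounding is a choice)
iHigh = min(N, round(0.975 * N));       % index of the upper cut-off
lowerBound = xs(iLow);
upperBound = xs(iHigh);
inRange = (x >= lowerBound) & (x <= upperBound);  % logical mask, original order
x_central = x(inRange);                 % central ~95% of the data
```

The logical mask keeps the selected values in their original order, which can be convenient if the positions in the data set matter.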
