Finding quantiles from bucket/frequency data

Hi there,
I'm trying to work out arbitrary percentiles from a probability distribution, but my data is not defined as an explicit set of all measurements.
I have one array that contains the data buckets (1L, 2L etc.), and one array that contains the normalized frequency at which they occur in the data (0.15, 0.5 etc., summing to 1). The quantile function works with the raw data (1, 1, 2, 2, 2, 2 etc.), rather than the form that I am using.
Is there a convenient way to find any given percentile (e.g. a built-in function), or should I rework the structure of the program?
Thanks! Cameron

Answers (1)

Tom Lane on 7 Aug 2012
Check this out:
x = (1:5)';
freq = [.4 .3 0 .2 .1]';
ecdf(x,'freq',freq)
You can provide x values and their frequencies, and read the desired quantiles off the graph. If you really want to compute them directly instead of reading them from the graph, you could try the two-output form:
[a,b] = ecdf(x,'freq',freq)
This returns the data behind the graph: a holds the cumulative probabilities and b holds the corresponding x values. You could use these to compute what you want. For each desired probability, I can imagine finding the first a value that reaches it and taking the corresponding b value. But maybe someone has a more efficient way to do that.
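If you would rather skip the plot entirely, here is a rough sketch of the same idea using only cumsum and find. It assumes the buckets in x are sorted in ascending order, and it uses the "smallest bucket whose cumulative frequency reaches p" definition of a quantile, which is one of several conventions and may not match what quantile returns on raw data. The variable names p, cdf and q are just illustrative.

```matlab
% Sketch: quantiles from bucket values and normalized frequencies.
% Assumes x is sorted ascending and freq has the same length as x.
x    = (1:5)';                 % bucket values
freq = [.4 .3 0 .2 .1]';       % normalized frequencies (sum to 1)
p    = [0.25 0.5 0.95];        % desired quantile levels (illustrative)

cdf = cumsum(freq);            % cumulative frequency up to each bucket
cdf = cdf / cdf(end);          % rescale so rounding cannot leave cdf(end) below 1

q = zeros(size(p));
for k = 1:numel(p)
    % first bucket whose cumulative frequency reaches the requested level
    idx  = find(cdf >= p(k), 1, 'first');
    q(k) = x(idx);
end
disp(q)
```

Buckets with zero frequency (x = 3 here) are never returned, because the cumulative frequency does not increase at them. If you need interpolation between buckets, that would have to be added on top of this.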
