Code covered by the BSD License  

Returns weighted percentiles of a sample

Rating: 2.5 (2 ratings)
Downloads (last 30 days): 5
File Size: 3.81 KB
File ID: #16920

by Durga Lal Shrestha

 

16 Oct 2007 (Updated 03 Apr 2008)

Returns weighted percentiles of a sample for a given weight vector, using one of six quantile algorithms


File Information
Description

The idea is to emphasize some data examples over others by giving them more
weight. For example, we could give lower weights to outliers. The function
was written to compute percentiles for Monte Carlo simulations in which some
simulations are much worse than others (in terms of goodness of fit between
simulated and actual values), so that those simulations can be given lower
weights based on a goodness-of-fit criterion.

USAGE:
 y = WPRCTILE(X,p) % same as PRCTILE
 y = WPRCTILE(X,p,w)
 y = WPRCTILE(X,p,w,type)
                    
INPUT:
    X - vector or matrix of the sample data
    p - scalar or a vector of percent values between 0 and 100

    w - positive weight vector for the sample data. The length of w must equal either the number of rows or the number of columns of X. If all weights are equal, WPRCTILE gives the same result as PRCTILE.

  type - an integer between 4 and 9 selecting one of the 6 quantile algorithms.
       
OUTPUT:
    y - percentiles of the values in X
        When X is a vector, y is the same size as p, and y(i) contains the
        p(i)-th percentile.
        When X is a matrix, WPRCTILE calculates percentiles along dimension
        DIM, which is chosen as follows:
            if size(X,1) == length(w), DIM = 1;
            elseif size(X,2) == length(w), DIM = 2;

EXAMPLES:
  x = randn(1000,1);
  w = rand(1000,1);
  y = wprctile(x,[2.5 25 50 75],w,7)
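
The rule for choosing DIM when X is a matrix can be sketched on its own. The snippet below is only an illustration of the rule as stated in the help text, not code taken from WPRCTILE; the variable name dim and the error message are assumptions.

```matlab
% Illustrative sketch (not the WPRCTILE source): infer the working
% dimension from the length of the weight vector.
X = randn(1000, 3);        % 1000 samples in rows, 3 variables in columns
w = rand(1000, 1);         % one positive weight per row of X

if size(X, 1) == length(w)
    dim = 1;               % weights match the rows: work down each column
elseif size(X, 2) == length(w)
    dim = 2;               % weights match the columns: work along each row
else
    error('length(w) must equal size(X,1) or size(X,2)');   % assumed message
end
disp(dim)
```

Because the row test comes first, a square X whose side equals length(w) would be treated with DIM = 1 under this rule.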

MATLAB release: MATLAB 7.5 (R2007b)
Comments and Ratings (5)
05 Feb 2008 A P

INCORRECT! This understates the median.

27 Feb 2008 Durga Shrestha

To A P

Please explain what is incorrect. Do you mean that it underestimates the median? In the figure, WPRCTILE gives a higher median than PRCTILE. However, this depends on the weight vector.

31 Mar 2008 Li Li

It's all right except that the coordinates fed into the interp1q function are incorrect. However, it's a 2-line fix:

After the line "cumW = cumsum(sortedX(:,2));", it should read
coord = (cumW - sortedX(:,2)/2)./(sum(sortedX(:,2)));
q = [0;coord;1];
instead of q = [0;cumW;1];
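
The corrected coordinates can be tried in isolation. The following is a minimal sketch, not the WPRCTILE source: it assumes a column vector x with positive weights w, uses interp1 in place of interp1q, and pads the value vector with the smallest and largest sample so that it matches q = [0;coord;1]. The padding and the names xs, ws, v are assumptions made for illustration.

```matlab
% Sketch of a weighted percentile using midpoint cumulative weights.
x = [10; 20; 30; 40];               % sample values
w = [1; 1; 1; 1];                   % positive weights (equal here)
p = 50;                             % requested percentile

[xs, idx] = sort(x);                % sort the sample
ws = w(idx);                        % carry the weights along
cumW = cumsum(ws);                  % cumulative weight after each point

% Midpoint coordinates, as in the suggested fix
coord = (cumW - ws/2) ./ sum(ws);
q = [0; coord; 1];

% Assumption: clamp to the extreme samples outside the first/last midpoint
v = [xs(1); xs; xs(end)];

y = interp1(q, v, p/100);           % linear interpolation at p/100
disp(y)
```

With equal weights the midpoints are 0.125, 0.375, 0.625 and 0.875, so the 50th percentile falls halfway between 20 and 30.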

02 Apr 2008 Durga Shrestha

Thanks for pointing this out. Indeed, I had used the formula p(k) = k/n (type = 4 in the R package). What you suggested is type 5, p(k) = (k - 0.5)/n, which is used in MATLAB.
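
The relationship between the two formulas and their weighted counterparts can be checked numerically. This is a small sketch under the assumption that the weighted coordinates are cumsum(w)/sum(w) and (cumsum(w) - w/2)/sum(w); the names p4 and p5 are chosen here for illustration only.

```matlab
% With equal weights, the weighted coordinates reduce to the
% type 4 and type 5 plotting positions.
n = 5;
w = ones(n, 1);                     % equal weights
k = (1:n)';
cumW = cumsum(w);

p4 = cumW ./ sum(w);                % cumulative weights: k/n
p5 = (cumW - w/2) ./ sum(w);        % midpoint weights: (k - 0.5)/n

disp([p4, k/n, p5, (k - 0.5)/n])    % columns agree pairwise
```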

I have updated the code to offer six different algorithms for computing the quantile.

03 Jun 2010 Lorenz

I claim that this can be implemented in expected linear time. Because you are sorting, you need at least O(n log(n)) time, assuming MATLAB uses comparison-based sorting (which is proven to need at least n log(n) - O(n) element comparisons on average).
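
One way such a linear-time approach might look is a randomized weighted selection, sketched below. This is not part of WPRCTILE and does not reproduce its interpolated results: the hypothetical function weightedSelect returns the smallest sample value whose cumulative weight reaches a fraction t of the total, assuming positive weights and 0 < t <= 1.

```matlab
function v = weightedSelect(x, w, t)
% Hypothetical helper: smallest v in x with sum(w(x <= v)) >= t*sum(w).
% Uses random pivots instead of sorting (expected linear time).
    target = t * sum(w);
    acc = 0;                        % weight already discarded below the candidates
    while true
        pivot = x(randi(numel(x))); % random pivot among remaining values
        lo = x < pivot;
        same = x == pivot;
        hi = x > pivot;
        wLo = sum(w(lo));
        wSame = sum(w(same));
        if acc + wLo >= target
            % The answer lies strictly below the pivot
            x = x(lo);
            w = w(lo);
        elseif acc + wLo + wSame >= target || ~any(hi)
            % The pivot itself reaches the target (or nothing larger remains)
            v = pivot;
            return;
        else
            % The answer lies strictly above the pivot
            acc = acc + wLo + wSame;
            x = x(hi);
            w = w(hi);
        end
    end
end
```

For example, weightedSelect([10 20 30], [1 1 8], 0.5) returns 30, because the first two values carry only 2 of the 10 units of weight.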

Updates
17 Oct 2007

Changed the screenshot because the y tick marks were wrong.

27 Feb 2008

Replaced the screenshot file because it was very large.

03 Apr 2008

Added an option with 5 different algorithms to compute the quantile.

Tag Activity for this File
Tag          Applied By           Date/Time
statistics   Durga Lal Shrestha   22 Oct 2008 09:31:48
probability  Durga Lal Shrestha   22 Oct 2008 09:31:48
weight       Durga Lal Shrestha   22 Oct 2008 09:31:48
percentile   Durga Lal Shrestha   22 Oct 2008 09:31:48
quantile     Durga Lal Shrestha   22 Oct 2008 09:31:48
