Path: news.mathworks.com!not-for-mail
From: "Skip Albertson" <salb461@ecy.wa.gov>
Newsgroups: comp.soft-sys.matlab
Subject: Re: replicating MS Excel percentrank formula
Date: Wed, 13 Oct 2010 18:11:04 +0000 (UTC)
Organization: Washington State Dept of Ecology
Lines: 52
Message-ID: <i94sno$9oh$1@fred.mathworks.com>
References: <i6tbnj$94h$1@fred.mathworks.com> <i6trc5$ob7$1@fred.mathworks.com> <i6vroi$lol$1@fred.mathworks.com>
Reply-To: "Skip Albertson" <salb461@ecy.wa.gov>
NNTP-Posting-Host: webapp-02-blr.mathworks.com
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1286993464 10001 172.30.248.37 (13 Oct 2010 18:11:04 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Wed, 13 Oct 2010 18:11:04 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1034885
Xref: news.mathworks.com comp.soft-sys.matlab:678155

I had written a function to do this and offer it here.  Depending on the data, you may need to sort it first and deal with replicates, otherwise this emulates the Excel PercentRank function fairly well:

function PRank = PercentRank(x,p)
%
% function PRank = PercentRank(x,p);
%
% emulates the Percent Rank function found in Microsoft Excel.
% Input at array x, and a value for which you would like to find
% the percentile rank, p.  The value returned, PRank, will use
% linery interpolation, as in Excel.  Extrapolation will
% evaluate to NaN.
%
% For example:
%
% 	x=[5;
% 	10;
% 	15;
% 	20;
% 	95];
%
% 	p=15, returns 0.5, which is at the median
%       p=14; returns 0.45, which is linearly interpolated.
%
% Skip Albertson, October 11, 2010
%
%
y=zeros(size(x)); 
n=numel(x); 
for k=1:n 
    r=numel(find(x<x(k))); 
    y(k)=r/(n-1); 
end
PRank=interp1(x,y,p,'linear');



Ashish Uthama <first.last@mathworks.com> wrote in message <i6vroi$lol$1@fred.mathworks.com>...
> Naresh,
> 
> Dividing by the max assumes that your data starts at 1. Maybe you need 
> to use the range?
> i.e if your data ranges from 5 - 50, the denominator should probably be:
> 
>      max(data(:,2) - min(data(:,2))
> 
> The repeated element handling is also a concern (I do not know what the 
> 'right' answer for your application might be).
> 
> Echoing Rogers comments:
> >  As I have said, this is only a guess.  It would require many examples 
> > to be sure of the details in Excel's logic, or preferably some adequate 
> > documentation.