prctile

Percentiles of a data set

Syntax

Y = prctile(X,p)
Y = prctile(X,p,dim)

Description

Y = prctile(X,p) returns percentiles of the values in a data vector or matrix X for the percentages p in the interval [0,100].

  • If X is a vector, then Y is a scalar or a vector with the same length as the number of percentiles required (length(p)). Y(i) contains the p(i)th percentile.

  • If X is a matrix, then Y is a row vector or a matrix, where the number of rows of Y is equal to the number of percentiles required (length(p)). The ith row of Y contains the p(i)th percentile of each column of X.

  • For multidimensional arrays, prctile operates along the first nonsingleton dimension of X.

Y = prctile(X,p,dim) returns percentiles along dimension dim.
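
As a minimal sketch of the dim argument on a multidimensional array, the following takes percentiles along the third dimension of a 4-by-3-by-2 array. The array contents and the variable names are illustrative assumptions, and the snippet assumes prctile is available in your installation.

```matlab
% Illustrative 4-by-3-by-2 array holding the values 1 through 24.
X = reshape(1:24, 4, 3, 2);

% Percentiles along the third dimension for three percentages.
Y = prctile(X, [25 50 75], 3);

% The third dimension of Y has length(p) = 3 entries;
% the other dimensions keep the sizes they have in X.
sz = size(Y)   % 4 3 3
```

Each output element Y(i,j,k) is the p(k)th percentile of the two values X(i,j,1) and X(i,j,2).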

Examples

Percentiles of a Data Vector

Generate a data set of size 10.

rng('default'); % for reproducibility
x = normrnd(5,2,1,10)
x =
     6.0753    8.6678    0.4823    6.7243    5.6375    2.3846    4.1328    5.6852   12.1568   10.5389

Calculate the 42nd percentile.

Y = prctile(x,42)
Y =
    5.6709

Percentiles of a Data Matrix

Calculate the percentiles along the columns and rows of a data matrix for specified percentages.

Generate a 5-by-5 data matrix.

X = (1:5)'*(2:6)

X =
     2     3     4     5     6
     4     6     8    10    12
     6     9    12    15    18
     8    12    16    20    24
    10    15    20    25    30

Calculate the 25th, 50th, and 75th percentiles along the columns of X.

Y = prctile(X,[25 50 75],1) 
Y =
    3.5000    5.2500    7.0000    8.7500   10.5000
    6.0000    9.0000   12.0000   15.0000   18.0000
    8.5000   12.7500   17.0000   21.2500   25.5000

The rows of Y correspond to the percentiles of columns of X. For example, the 25th, 50th, and 75th percentiles of the third column of X with elements (4, 8, 12, 16, 20) are 7, 12, and 17, respectively. Y = prctile(X,[25 50 75]) returns the same percentile matrix.

Calculate the 25th, 50th, and 75th percentiles along the rows of X.

Y = prctile(X,[25 50 75],2)
Y =

    2.7500    4.0000    5.2500
    5.5000    8.0000   10.5000
    8.2500   12.0000   15.7500
   11.0000   16.0000   21.0000
   13.7500   20.0000   26.2500

The rows of Y correspond to the percentiles of rows of X. For example, the 25th, 50th, and 75th percentiles of the first row of X with elements (2, 3, 4, 5, 6) are 2.75, 4, and 5.25, respectively.

Input Arguments

X — Input data
vector | array

Input data, specified as a vector or array.

Data Types: double | single

p — Percentages
scalar | vector

Percentages for which to compute percentiles, specified as a scalar or vector of scalars from 0 to 100.

Example: 25

Example: [25, 50, 75]

Data Types: double | single

dim — Dimension
1 (default) | positive integer

Dimension along which the percentiles of X are required, specified as a positive integer. For example, for a matrix X, when dim = 1, prctile returns the percentile(s) of the columns of X, and when dim = 2, prctile returns the percentile(s) of the rows of X. For a multidimensional array X, the length of the dimth dimension of Y is equal to the length of p.

Data Types: double

Output Arguments

Y — Percentiles
scalar | array

Percentiles of a data vector or array, returned as a scalar or array for one or more percentage values.

  • If X is a vector, then Y is a scalar or a vector with the same length as the number of percentiles required (length(p)). Y(i) contains the p(i)th percentile.

  • If X is a matrix, then Y is a vector or a matrix with the length of the dimth dimension equal to the number of percentiles required (length(p)). When dim = 1, for example, the ith row of Y contains the p(i)th percentiles of the columns of X.

  • If X is an array of dimension d, then Y is an array with the length of the dimth dimension equal to the number of percentiles required (length(p)).

More About

Multidimensional Array

A multidimensional array is an array with more than two dimensions. For example, if X is a 1-by-3-by-4 array, then X is a 3-D array.

Nonsingleton Dimension

The first nonsingleton dimension is the first dimension of an array whose size is not equal to 1. For example, if X is a 1-by-2-by-3-by-4 array, then the second dimension is the first nonsingleton dimension of X.
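
One way to locate the first nonsingleton dimension, shown here only as an illustration and not as prctile's internal code, is to search the size vector for the first entry that differs from 1. The variable names are hypothetical.

```matlab
% A 1-by-2-by-3-by-4 array, matching the example in the text.
X = zeros(1, 2, 3, 4);

% Index of the first dimension whose size is not 1.
firstNonsingleton = find(size(X) ~= 1, 1)   % returns 2
```

For a scalar input every dimension has size 1, so this find call returns an empty result and a caller would need to handle that case separately.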

Linear Interpolation

Linear interpolation uses linear polynomials to find yi = f(xi), the values of the underlying function Y = f(X) at the points in the vector or array x. Given the data points (x1, y1) and (x2, y2), where y1 = f(x1) and y2 = f(x2), linear interpolation finds y = f(x) for a given x between x1 and x2 as follows:

$$y = f(x) = y_1 + \frac{x - x_1}{x_2 - x_1}\,(y_2 - y_1).$$

Similarly, if the 100(1.5/n)th percentile is $y_{1.5/n}$ and the 100(2.5/n)th percentile is $y_{2.5/n}$, then linear interpolation finds the 100(2.3/n)th percentile, $y_{2.3/n}$, as:

$$y_{2.3/n} = y_{1.5/n} + \frac{2.3/n - 1.5/n}{2.5/n - 1.5/n}\,(y_{2.5/n} - y_{1.5/n}).$$
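
To make the interpolation concrete, the sketch below plugs in numbers for n = 5, assuming sorted data values 1, 2, 3, 6, 10 so that the 30th percentile is 2 and the 50th percentile is 3. The data values and variable names are illustrative assumptions.

```matlab
n  = 5;               % number of data points (assumed for illustration)
p1 = 100 * 1.5 / n;   % 30th percentile, a known point
p2 = 100 * 2.5 / n;   % 50th percentile, a known point
y1 = 2;               % sorted value at the 30th percentile
y2 = 3;               % sorted value at the 50th percentile

p  = 100 * 2.3 / n;   % 46th percentile, the one to interpolate

% Linear interpolation between (p1, y1) and (p2, y2).
y  = y1 + (p - p1) / (p2 - p1) * (y2 - y1)   % approximately 2.8
```

Because the factor 100/n cancels in the ratio, the same result follows from using 1.5, 2.5, and 2.3 directly.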

Algorithms

For an n-element vector X, prctile returns percentiles as follows:

  1. The sorted values in X are taken as the 100(0.5/n)th, 100(1.5/n)th, ..., 100([n – 0.5]/n)th percentiles. For example:

    • For a data vector of five elements such as {6, 3, 2, 10, 1}, the sorted elements {1, 2, 3, 6, 10} respectively correspond to the 10th, 30th, 50th, 70th, and 90th percentiles.

    • For a data vector of six elements such as {6, 3, 2, 10, 8, 1}, the sorted elements {1, 2, 3, 6, 8, 10} respectively correspond to the (50/6)th, (150/6)th, (250/6)th, (350/6)th, (450/6)th, and (550/6)th percentiles.

  2. prctile uses linear interpolation to compute percentiles for percentages between 100(0.5/n) and 100([n – 0.5]/n).

  3. prctile assigns the minimum or maximum values in X to the percentiles corresponding to the percentages outside that range.

prctile treats NaNs as missing values and removes them.
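
The steps above, including the removal of NaNs, can be sketched with base MATLAB functions as follows. This is a minimal illustration of the described procedure for a vector, not the actual source of prctile. The variable names are hypothetical, and the sketch assumes at least two non-NaN values so that interp1 has two sample points.

```matlab
x = [6 3 2 10 8 1 NaN];      % six-element example vector plus one NaN
p = [0 25 50 75 100];        % requested percentages

x  = x(~isnan(x));           % treat NaNs as missing and drop them
xs = sort(x);                % step 1: sorted values ...
n  = numel(xs);
q  = 100 * ((1:n) - 0.5) / n;   % ... sit at the 100((k - 0.5)/n)th percentiles

% Step 3: percentages outside [q(1), q(end)] map to the minimum or maximum,
% which is the same as clamping them to that range before interpolating.
pc = min(max(p, q(1)), q(end));

% Step 2: linear interpolation between the known percentile points.
y = interp1(q, xs, pc)       % approximately [1 2 4.5 8 10]
```

Clamping the percentages before the interp1 call avoids the NaN that interp1 returns by default for query points outside the sample range.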

References

[1] Langford, E. "Quartiles in Elementary Statistics", Journal of Statistics Education. Vol. 14, No. 3, 2006.
