Statistics Toolbox™
boxplot(X)
boxplot(X,G)
boxplot(axes,X,...)
boxplot(...,param1,val1,param2,val2,...)
boxplot(X) produces a box plot of the data in X. If X is a matrix, there is one box per column; if X is a vector, there is just one box. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are plotted individually.
boxplot(X,G) specifies one or more grouping variables G, producing a separate box for each set of X values sharing the same G value or values (see Grouped Data). Grouping variables must have one row per element of X, or one row per column of X. Specify a single grouping variable in G using a vector, a character array, a cell array of strings, or a vector categorical array; specify multiple grouping variables in G using a cell array of these variable types, such as {G1 G2 G3}, or by using a matrix. If multiple grouping variables are used, they must all be the same length. Groups that contain a NaN value or an empty string in a grouping variable are omitted, and are not counted in the number of groups considered by other parameters.
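As an illustration of multiple grouping variables, the following sketch passes two grouping variables in a cell array. The data and the variable names x, g1, and g2 are made up for this example and are not part of boxplot.

```matlab
% Hypothetical data: 40 observations with two grouping variables.
x  = randn(40,1);                 % one value per observation
g1 = repmat({'a';'b'},20,1);      % first grouping variable (cell array of strings)
g2 = repmat([1;1;2;2],10,1);      % second grouping variable (numeric vector)

% One box is drawn for each combination of g1 and g2 values present in the data.
% Here the combinations are (a,1), (b,1), (a,2), and (b,2).
boxplot(x,{g1,g2})
```

Both grouping variables have one row per element of x, as the description above requires.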
By default, character and string grouping variables are sorted in the order they initially appear in the data, categorical grouping variables are sorted by the order of their levels, and numeric grouping variables are sorted in numeric order. To control the order of groups, do one of the following:
Use categorical variables in G and specify the order of their levels.
Use the 'grouporder' parameter described below.
Pre-sort your data.
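A minimal sketch of the second option, using the 'grouporder' parameter, might look like the following. The data and group names are hypothetical, chosen so that the order of first appearance differs from the desired order.

```matlab
% Hypothetical data: three named groups that first appear in the order mid, high, low.
x = [randn(20,1)+1; randn(20,1)+2; randn(20,1)];
g = [repmat({'mid'},20,1); repmat({'high'},20,1); repmat({'low'},20,1)];

% Without 'grouporder', string groups follow the order of first appearance.
subplot(2,1,1)
boxplot(x,g)

% With 'grouporder', the boxes are drawn in the order listed.
subplot(2,1,2)
boxplot(x,g,'grouporder',{'low','mid','high'})
```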
boxplot(axes,X,...) creates the plot in the axes with handle axes.
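For example, the axes-handle form can be used to draw into specific subplots rather than the current axes. This is a sketch with made-up data; the names ax1 and ax2 are illustrative.

```matlab
% Hypothetical data: two columns, so each plot has two boxes.
X = randn(50,2);

% Create two axes and keep their handles.
ax1 = subplot(1,2,1);
ax2 = subplot(1,2,2);

% Draw into each axes explicitly rather than relying on the current axes.
boxplot(ax1,X)
boxplot(ax2,X,'notch','on')
```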
boxplot(...,param1,val1,param2,val2,...) specifies optional parameter name/value pairs, as described in the following table.
| Parameter | Values |
|---|---|
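| 'plotstyle' | 'traditional' (the default) uses the traditional box style. 'compact' uses a box style designed for plots with many groups, and changes the defaults of some other parameters, as described in the table that follows this one. |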
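| 'boxstyle' | 'outline' (the default) draws an unfilled box with dashed whiskers. 'filled' draws a narrow filled box with lines for whiskers. |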
| 'colorgroup' | One or more grouping variables, of the same type as permitted for G, specifying that the box color should change when the specified variables change. The default is [] for no box color change. |
| 'colors' | Colors for boxes, specified as a single color (such as 'r' or [1 0 0]) or multiple colors (such as 'rgbm' or a three-column matrix of RGB values). The sequence is replicated or truncated as required, so for example 'rb' gives boxes that alternate in color. The default when no 'colorgroup' is specified is to use the same color scheme for all boxes. The default when 'colorgroup' is specified is a modified hsv colormap. |
| 'datalim' | A two-element vector containing lower and upper limits, used by 'extrememode' to determine which points are extreme. The default is [-Inf Inf]. |
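| 'extrememode' | 'clip' (the default) moves data outside the 'datalim' limits to the limit. 'compress' distributes data outside the 'datalim' limits evenly in a region just outside the limit, retaining the relative order of the points. A dotted line marks the limit if any points are outside it, and two gray lines mark the compression region if any points are compressed. Values at +/–Inf can be clipped or compressed, but NaN values still do not appear on the plot. Box notches are drawn to scale and may extend beyond the bounds if the median is inside the limit; they are not drawn if the median is outside the limits. |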
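| 'factordirection' | 'data' (the default) arranges factors with the first value next to the origin. 'list' arranges factors left-to-right if on the x axis or top-to-bottom if on the y axis. 'auto' uses 'data' for numeric grouping variables and 'list' for strings. |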
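| 'fullfactors' | 'off' (the default) creates one group for each unique row of G. 'on' creates a group for each possible combination of grouping variable values, including combinations that do not appear in the data. |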
| 'factorseparator' | Specifies which factors should have their values separated by a grid line. The value may be 'auto' or a vector of grouping variable numbers. For example, [1 2] adds a separator line when the first or second grouping variable changes value. 'auto' is [] for one grouping variable and [1] for two or more grouping variables. The default is []. |
| 'factorgap' | Specifies an extra gap to leave between boxes when the corresponding grouping factor changes value, expressed as a percentage of the width of the plot. For example, with [3 1], the gap is 3% of the width of the plot between groups with different values of the first grouping variable, and 1% between groups with the same value of the first grouping variable but different values for the second. 'auto' specifies that boxplot should choose a gap automatically. The default is []. |
| 'grouporder' | Order of groups for plotting, specified as a cell array of strings. With multiple grouping variables, separate values within each string with a comma. Using categorical arrays as grouping variables is an easier way to control the order of the boxes. The default is [], which does not reorder the boxes. |
| 'jitter' | Maximum distance d to displace outliers along the factor axis by a uniform random amount, in order to make duplicate points visible. A d of 1 makes the jitter regions just touch between the closest adjacent groups. The default is 0. |
| 'labels' | A character array, cell array of strings, or numeric vector of box labels. There may be one label per group or one label per X value. Multiple label variables may be specified via a numeric matrix or a cell array containing any of these types. |
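| 'labelorientation' | 'inline' rotates the labels to be vertical. 'horizontal' (the default) leaves the labels horizontal. When the labels are on the y axis, both settings leave the labels horizontal. |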
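| 'labelverbosity' | 'all' (the default) displays every label. 'minor' displays a label for a factor only when that factor has a different value from the previous group. 'majorminor' displays a label for a factor when that factor or any factor major to it has a different value from the previous group. |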
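| 'medianstyle' | 'line' (the default) draws a line for the median. 'target' draws a black dot inside a white circle for the median. |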
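| 'notch' | 'on' draws comparison intervals using notches when 'plotstyle' is 'traditional', or triangular markers when 'plotstyle' is 'compact'. 'marker' draws comparison intervals using triangular markers. 'off' (the default) omits notches. Two medians are significantly different at the 5% significance level if their intervals do not overlap. Interval endpoints are the extremes of the notches or the centers of the triangular markers. When the sample size is small, notches may extend beyond the end of the box. |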
| 'orientation' | 'vertical' (the default) plots X on the y axis. 'horizontal' plots X on the x axis. |
| 'outliersize' | Size of the marker used for outliers, in points. The default is 6 (6/72 inch). |
| 'positions' | Box positions specified as a numeric vector with one entry per group or X value. The default is 1:numGroups, where numGroups is the number of groups. |
| 'symbol' | Symbol and color to use for outliers, using the same values as the LineSpec parameter in plot. The default is 'r+'. If the symbol is omitted then the outliers are invisible; if the color is omitted then the outliers have the same color as their corresponding box. |
| 'whisker' | Maximum whisker length w. The default is a w of 1.5. Points are drawn as outliers if they are larger than q3 + w(q3 – q1) or smaller than q1 – w(q3 – q1), where q1 and q3 are the 25th and 75th percentiles, respectively. The default of 1.5 corresponds to approximately +/–2.7σ and 99.3% coverage if the data are normally distributed. The plotted whisker extends to the adjacent value, which is the most extreme data value that is not an outlier. Set 'whisker' to 0 to give no whiskers and to make every point outside of q1 and q3 an outlier. |
| 'widths' | A scalar or vector of box widths for when 'boxstyle' is 'outline'. The default is half of the minimum separation between boxes, which is 0.5 when the 'positions' argument takes its default value. The list of values is replicated or truncated as necessary. |
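
The outlier rule described for the 'whisker' parameter can be worked through numerically. The first part of this sketch checks the stated +/–2.7σ and 99.3% figures for a standard normal distribution; the second part applies the same rule to made-up sample data. It assumes that prctile gives the same q1 and q3 that boxplot uses, which this page does not confirm, and the variable names are illustrative.

```matlab
w = 1.5;                            % default maximum whisker length

% Part 1: outlier limit for a standard normal distribution.
q3       = norminv(0.75);           % 75th percentile; q1 = -q3 by symmetry
upper    = q3 + w*(2*q3);           % q3 + w*(q3 - q1), in units of sigma
coverage = normcdf(upper) - normcdf(-upper);
fprintf('limit = %.2f sigma, coverage = %.1f%%\n', upper, 100*coverage)
% prints: limit = 2.70 sigma, coverage = 99.3%

% Part 2: the same rule applied to a sample (hypothetical data).
x  = [randn(100,1); 8];             % 100 normal values plus one extreme value
q  = prctile(x,[25 75]);            % sample q1 and q3 (assumed to match boxplot)
lo = q(1) - w*(q(2) - q(1));        % lower outlier limit
hi = q(2) + w*(q(2) - q(1));        % upper outlier limit
flagged  = x < lo | x > hi;         % points that would be drawn individually
adjacent = [min(x(~flagged)) max(x(~flagged))];   % where the whiskers would end
```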
When the 'plotstyle' parameter takes the value 'compact', the default values for other parameters are as listed in the following table.
| Parameter | Default when 'plotstyle' is 'compact' |
|---|---|
| 'boxstyle' | 'filled' |
| 'factorseparator' | 'auto' |
| 'factorgap' | 'auto' |
| 'jitter' | 0.5 |
| 'labelorientation' | 'inline' |
| 'labelverbosity' | 'majorminor' |
| 'medianstyle' | 'target' |
| 'outliersize' | 4 |
| 'symbol' | 'o' |
You can see data values and group names using the data cursor in the figure window. The cursor shows the original values of any points affected by the 'datalim' parameter. You can label the group to which an outlier belongs using the gname function.
To modify graphics properties of a box plot component, use findobj with the 'Tag' property to find the component's handle. 'Tag' values for box plot components depend on parameter settings, and are listed in the table below.
| Parameter Settings | 'Tag' Values |
|---|---|
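| All settings | 'Box', 'Outliers' |
| When 'plotstyle' is 'traditional' | 'Median', 'Upper Whisker', 'Lower Whisker', 'Upper Adjacent Value', 'Lower Adjacent Value' |
| When 'plotstyle' is 'compact' | 'Whisker', 'MedianOuter', 'MedianInner' |
| When 'notch' is 'marker' | 'NotchLo', 'NotchHi' |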
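
The findobj approach can be sketched as follows. This example assumes that 'Box' and 'Outliers' are among the 'Tag' values available for all settings; treat those two strings as assumptions to check against your release. The data is made up.

```matlab
X = randn(50,3);                            % hypothetical data, three boxes
boxplot(X)

% Find the box outlines by tag and restyle them.
boxes = findobj(gca,'Tag','Box');           % 'Box' tag is an assumption
set(boxes,'LineWidth',2,'Color','k')

% Outlier markers can be found the same way.
outliers = findobj(gca,'Tag','Outliers');   % 'Outliers' tag is an assumption
set(outliers,'Marker','x')
```

Because findobj returns a vector of handles, a single call to set changes every matching component at once.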
Create a box plot of car mileage, grouped by country:
load carsmall
boxplot(MPG,Origin)

Create notched box plots for two groups of sample data:
x1 = normrnd(5,1,100,1);
x2 = normrnd(6,1,100,1);
boxplot([x1,x2],'notch','on')

The difference between the medians of the two groups is approximately 1. Since the notches in the box plot do not overlap, you can conclude, with 95% confidence, that the true medians do differ.
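The notch comparison can also be checked numerically. The sketch below uses the interval median ± 1.57(q3 – q1)/sqrt(n) from McGill, Tukey, and Larsen [1]; that boxplot draws exactly these endpoints is an assumption here, since this page does not state the formula. The helper halfWidth is hypothetical.

```matlab
x1 = normrnd(5,1,100,1);
x2 = normrnd(6,1,100,1);

% Notch half-width: 1.57*IQR/sqrt(n) (formula assumed, see lead-in).
halfWidth = @(x) 1.57*diff(prctile(x,[25 75]))/sqrt(numel(x));

ci1 = median(x1) + [-1 1]*halfWidth(x1);    % comparison interval for group 1
ci2 = median(x2) + [-1 1]*halfWidth(x2);    % comparison interval for group 2

% The intervals overlap only if each one starts before the other ends.
overlap = ci1(2) >= ci2(1) && ci2(2) >= ci1(1);
```

If overlap is false, the medians differ at the 5% significance level under this rule.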
The following figure shows the box plot for the same data with the length of the whiskers specified as 1.0 times the interquartile range. Points beyond the whiskers are displayed using +.
boxplot([x1,x2],'notch','on','whisker',1)

A 'plotstyle' of 'compact' is useful for large numbers of groups:
X = randn(100,25);
subplot(2,1,1)
boxplot(X)
subplot(2,1,2)
boxplot(X,'plotstyle','compact')

[1] McGill, R., J. W. Tukey, and W. A. Larsen. "Variations of Boxplots." The American Statistician. Vol. 32, No. 1, 1978, pp. 12–16.
[2] Velleman, P.F., and D.C. Hoaglin. Applications, Basics, and Computing of Exploratory Data Analysis. Pacific Grove, CA: Duxbury Press, 1981.
[3] Nelson, L. S. "Evaluating Overlapping Confidence Intervals." Journal of Quality Technology. Vol. 21, 1989, pp. 140–141.
anova1, kruskalwallis, multcompare
© 1984-2008 The MathWorks, Inc.