Box chart (box plot)
boxchart(ydata) creates a box chart, or box plot, for each column of the matrix ydata. If ydata is a vector, then boxchart creates a single box chart.
Each box chart displays the following information: the median, the lower and upper quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers. For more information, see Box Chart (Box Plot).
boxchart(xgroupdata,ydata) groups the data in the vector ydata according to the unique values in xgroupdata and plots each group of data as a separate box chart. xgroupdata determines the position of each box chart along the x-axis. ydata must be a vector, and xgroupdata must have the same length as ydata.
boxchart(___,'GroupByColor',cgroupdata) uses color to differentiate between box charts. The software groups the data in the vector ydata according to the unique value combinations in xgroupdata (if specified) and cgroupdata, and plots each group of data as a separate box chart. The vector cgroupdata then determines the color of each box chart. ydata must be a vector, and cgroupdata must have the same length as ydata. Specify the 'GroupByColor' name-value pair argument after any of the input argument combinations in the previous syntaxes.
boxchart(___,Name,Value) specifies additional chart options using one or more name-value pair arguments. For example, you can compare sample medians using notches by specifying 'Notch','on'. Specify the name-value pair arguments after all other input arguments. For a list of properties, see BoxChart Properties.
b = boxchart(___) returns BoxChart objects. If you do not specify cgroupdata, then b contains one object. If you do specify it, then b contains a vector of objects, one for each unique value in cgroupdata. Use b to set properties of the box charts after creating them. For a list of properties, see BoxChart Properties.
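The relationship between cgroupdata and the returned vector can be checked with a small sketch. The sample data and group labels below are made up for illustration; only the boxchart call and the BoxFaceColor property come from this page.

```matlab
rng(0)                                    % make the made-up sample repeatable
ydata = rand(12,1);                       % hypothetical sample data
cgroupdata = repmat(["a";"b";"c"],4,1);   % three unique color groups

b = boxchart(ydata,'GroupByColor',cgroupdata);

numGroups  = numel(unique(cgroupdata));   % number of unique color groups
numObjects = numel(b);                    % expected to match numGroups

% Use the returned vector to restyle one of the box charts after creation.
b(2).BoxFaceColor = [0 0.5 0.5];
```

Indexing into b in this way is what allows a single color group to be restyled without affecting the others.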
Create a single box chart from a vector of ages. Use the box chart to visualize the distribution of ages.
Load the patients data set. The Age variable contains the ages of 100 patients. Create a box chart to visualize the distribution of ages.
load patients
boxchart(Age)
ylabel('Age (years)')

The median patient age of 39 years is shown as the line inside the box. The lower and upper quartiles of 32 and 44 years are shown as the bottom and top edges of the box, respectively. The whiskers, or lines that extend below and above the box, have endpoints that correspond to the youngest and oldest patients. The youngest patient is 25 years old, and the oldest is 50 years old. The data set contains no outliers, which would be represented by small circles.
You can use data tips to get a summary of the data statistics. Hover over the box chart to see the data tip.
Use box charts to compare the distribution of values along the columns and the rows of a magic square.
Create a magic square with 10 rows and 10 columns.
Y = magic(10)
Y = 10×10
92 99 1 8 15 67 74 51 58 40
98 80 7 14 16 73 55 57 64 41
4 81 88 20 22 54 56 63 70 47
85 87 19 21 3 60 62 69 71 28
86 93 25 2 9 61 68 75 52 34
17 24 76 83 90 42 49 26 33 65
23 5 82 89 91 48 30 32 39 66
79 6 13 95 97 29 31 38 45 72
10 12 94 96 78 35 37 44 46 53
11 18 100 77 84 36 43 50 27 59
Create a box chart for each column of the magic square. Each column has a similar median value (around 50). However, the first five columns of Y have greater interquartile ranges than the last five columns of Y. The interquartile range is the distance between the upper quartile (top edge of the box) and the lower quartile (bottom edge of the box).
boxchart(Y)
xlabel('Column')
ylabel('Value')

Create a box chart for each row of the magic square. Each row has a similar interquartile range, but the median values differ across the rows.
boxchart(Y')
xlabel('Row')
ylabel('Value')

Plot the magnitudes of earthquakes according to the month in which they occurred. Use a vector of earthquake magnitudes and a grouping variable indicating the month of each earthquake. For each group of data, create a box chart and place it in the specified position along the x-axis.
Read a set of tsunami data into the workspace as a table. The data set includes information on earthquakes as well as other causes of tsunamis. Display the first eight rows, showing the month, cause, and earthquake magnitude columns of the table.
tsunamis = readtable('tsunamis.xlsx');
tsunamis(1:8,["Month","Cause","EarthquakeMagnitude"])
ans=8×3 table
Month Cause EarthquakeMagnitude
_____ __________________ ___________________
10 {'Earthquake' } 7.6
8 {'Earthquake' } 6.9
12 {'Volcano' } NaN
3 {'Earthquake' } 8.1
3 {'Earthquake' } 4.5
5 {'Meteorological'} NaN
11 {'Earthquake' } 9
3 {'Earthquake' } 5.8
Create the table earthquakes, which contains data for the tsunamis caused by earthquakes.
unique(tsunamis.Cause)
ans = 8x1 cell
{0x0 char }
{'Earthquake' }
{'Earthquake and Landslide'}
{'Landslide' }
{'Meteorological' }
{'Unknown Cause' }
{'Volcano' }
{'Volcano and Landslide' }
idx = contains(tsunamis.Cause,'Earthquake');
earthquakes = tsunamis(idx,:);
Group the earthquake magnitudes based on the month in which the corresponding tsunamis occurred. For each month, display a separate box chart. For example, boxchart uses the fourth, fifth, and eighth earthquake magnitudes, as well as others, to create the third box chart, which corresponds to the third month.
boxchart(earthquakes.Month,earthquakes.EarthquakeMagnitude)
xlabel('Month')
ylabel('Earthquake Magnitude')

Notice that because the month values are numeric, the x-axis ruler is also numeric.
For more descriptive month names, convert the earthquakes.Month column to a categorical variable.
monthOrder = ["Jan","Feb","Mar","Apr","May","Jun","Jul", ...
    "Aug","Sep","Oct","Nov","Dec"];
namedMonths = categorical(earthquakes.Month,1:12,monthOrder);
Create the same box charts as before, but use the categorical variable namedMonths instead of the numeric month values. The x-axis ruler is now categorical, and the order of the categories in namedMonths determines the order of the box charts.
boxchart(namedMonths,earthquakes.EarthquakeMagnitude)
xlabel('Month')
ylabel('Earthquake Magnitude')

Group medical patients based on their age, and for each age group, create a box chart of diastolic blood pressure values.
Load the patients data set. The Age and Diastolic variables contain the ages and diastolic blood pressure levels of 100 patients.
load patients
Group the patients into five age bins. Find the minimum and maximum ages, and then divide the range between them into five-year bins. Bin the values in the Age variable by using the discretize function. Use the bin names in bins. The resulting groupAge variable is a categorical variable.
min(Age)
ans = 25
max(Age)
ans = 50
binEdges = 25:5:50;
bins = {'late 20s','early 30s','late 30s','early 40s','late 40s+'};
groupAge = discretize(Age,binEdges,'categorical',bins);
Create a box chart for each age group. Each box chart shows the diastolic blood pressure values of the patients in that group.
boxchart(groupAge,Diastolic)
xlabel('Age Group')
ylabel('Diastolic Blood Pressure')

Use two grouping variables to group data and to position and color the resulting box charts.
Load the sample file TemperatureData.csv, which contains average daily temperatures from January 2015 through July 2016. Read the file into a table.
tbl = readtable('TemperatureData.csv');
Convert the tbl.Month variable to a categorical variable. Specify the order of the categories.
monthOrder = {'January','February','March','April','May','June','July', ...
    'August','September','October','November','December'};
tbl.Month = categorical(tbl.Month,monthOrder);
Create box charts showing the distribution of temperatures during each month of each year. Specify tbl.Month as the positional grouping variable. Specify tbl.Year as the color grouping variable by using the 'GroupByColor' name-value pair argument. Notice that tbl does not contain data for some months of 2016.
boxchart(tbl.Month,tbl.TemperatureF,'GroupByColor',tbl.Year)
ylabel('Temperature (F)')
legend

In this figure, you can easily compare the distribution of temperatures for one particular month across multiple years. For example, you can see that February temperatures varied much more in 2016 than in 2015.
Create box charts, and plot the mean values over the box charts by using hold on.
Load the patients data set. Convert SelfAssessedHealthStatus to an ordinal categorical variable because the categories Poor, Fair, Good, and Excellent have a natural order.
load patients
healthOrder = {'Poor','Fair','Good','Excellent'};
SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus, ...
    healthOrder,'Ordinal',true);
Group the patients according to their self-assessed health status, and find the mean patient weight for each group.
meanWeight = groupsummary(Weight,SelfAssessedHealthStatus,'mean');
Compare the weights for each group of patients by using box charts. Plot the mean weights over the box charts.
boxchart(SelfAssessedHealthStatus,Weight)
hold on
plot(meanWeight,'-o')
hold off
legend(["Weight Data","Weight Mean"])

Use notches to determine whether median values are significantly different from each other.
Load the patients data set. Split the patients according to their location. For each group of patients, create a box chart of their weights. Specify 'Notch','on' so that each box includes a tapered, shaded region called a notch. Box charts whose notches do not overlap have different medians at the 5% significance level.
load patients
boxchart(categorical(Location),Weight,'Notch','on')
ylabel('Weight (lbs)')

In this example, the three notches overlap, showing that the three weight medians are not significantly different.
Display a side-by-side pair of box charts using the tiledlayout and nexttile functions.
Load the patients data set. Convert Smoker to a categorical variable with the descriptive category names Smoker and Nonsmoker rather than 1 and 0.
load patients
Smoker = categorical(Smoker,logical([1 0]),{'Smoker','Nonsmoker'});
Create a 1-by-2 tiled chart layout using the tiledlayout function. Create the first set of axes ax1 within it by calling the nexttile function. In the first set of axes, display two box charts of systolic blood pressure values, one for smokers and the other for nonsmokers. Create the second set of axes ax2 within the tiled chart layout by calling the nexttile function. In the second set of axes, do the same for diastolic blood pressure.
tiledlayout(1,2)

% Left axes
ax1 = nexttile;
boxchart(ax1,Systolic,'GroupByColor',Smoker)
ylabel(ax1,'Systolic Blood Pressure')
legend

% Right axes
ax2 = nexttile;
boxchart(ax2,Diastolic,'GroupByColor',Smoker)
ylabel(ax2,'Diastolic Blood Pressure')
legend

Create a set of color-coded box charts, returned as a vector of BoxChart objects. Use the vector to change the color of one box chart.
Load the patients data set. Convert Gender and Smoker to categorical variables. Specify the descriptive category names Smoker and Nonsmoker rather than 1 and 0.
load patients
Gender = categorical(Gender);
Smoker = categorical(Smoker,logical([1 0]),{'Smoker','Nonsmoker'});
Combine the Gender and Smoker variables into one grouping variable cgroupdata. Create box charts showing the distribution of diastolic blood pressure levels for each pairing of gender and smoking status. b is a vector of BoxChart objects, one for each group of data.
cgroupdata = Gender.*Smoker;
b = boxchart(Diastolic,'GroupByColor',cgroupdata)
b = 4x1 BoxChart array:
  BoxChart
  BoxChart
  BoxChart
  BoxChart
legend('Location','southeast')

Update the color of the third box chart by using the SeriesIndex property. Updating the SeriesIndex property changes both the box face color and the outlier marker color.
b(3).SeriesIndex = 6;

Create a box chart from power outage data with many outliers, and make it easier to distinguish them visually by changing the properties of the BoxChart object. Find the indices for the outlier entries.
Read power outage data into the workspace as a table. Display the first few rows of the table.
outages = readtable('outages.csv');
head(outages)
ans=8×6 table
Region OutageTime Loss Customers RestorationTime Cause
_____________ ________________ ______ __________ ________________ ___________________
{'SouthWest'} 2002-02-01 12:18 458.98 1.8202e+06 2002-02-07 16:50 {'winter storm' }
{'SouthEast'} 2003-01-23 00:49 530.14 2.1204e+05 NaT {'winter storm' }
{'SouthEast'} 2003-02-07 21:15 289.4 1.4294e+05 2003-02-17 08:14 {'winter storm' }
{'West' } 2004-04-06 05:44 434.81 3.4037e+05 2004-04-06 06:10 {'equipment fault'}
{'MidWest' } 2002-03-16 06:18 186.44 2.1275e+05 2002-03-18 23:23 {'severe storm' }
{'West' } 2003-06-18 02:49 0 0 2003-06-18 10:54 {'attack' }
{'West' } 2004-06-20 14:39 231.29 NaN 2004-06-20 19:16 {'equipment fault'}
{'West' } 2002-06-06 19:28 311.86 NaN 2002-06-07 00:51 {'equipment fault'}
Create a BoxChart object b from the outages.Customers values, which indicate how many customers were affected by each power outage. boxchart discards entries with NaN values.
b = boxchart(outages.Customers);
ylabel('Number of Customers')
The plot contains many outliers. To better see them, jitter the outliers and change the outlier marker style. When you set the JitterOutliers property of the BoxChart object to 'on', the software randomly displaces the outlier markers horizontally so that they are unlikely to overlap perfectly. The values and vertical positions of the outliers are unchanged.
b.JitterOutliers = 'on';
b.MarkerStyle = '.';

You can now more easily see the distribution of outliers.
To find the outlier indices, use the isoutlier function. Specify the 'quartiles' method of computing outliers to match the boxchart outlier definition. Use the indices to create the outliers table, which contains a subset of the outages data. Notice that isoutlier identifies 96 outliers.
idx = isoutlier(outages.Customers,'quartiles');
outliers = outages(idx,:);
size(outliers,1)
ans = 96
Because of all the outliers, the quartiles of the box chart are hard to see. To inspect them, change the y-axis limits.
ylim([0 4e5])

ydata — Sample data
Sample data, specified as a numeric vector or matrix.
If ydata is a matrix, then boxchart
creates a box chart for each column of ydata.
If ydata is a vector and you do not specify
xgroupdata or cgroupdata, then
boxchart creates a single box chart.
If ydata is a vector and you do specify
xgroupdata or cgroupdata, then
boxchart creates a box chart for each unique value
combination in xgroupdata and
cgroupdata.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
xgroupdata — Positional grouping variable
Positional grouping variable, specified as a numeric or categorical vector.
xgroupdata must have the same length as the vector
ydata; you cannot specify xgroupdata when
ydata is a matrix.
boxchart groups the data in ydata
according to the unique value combinations in xgroupdata and
cgroupdata. The function creates a box chart for each group of
data and positions each box chart at the corresponding xgroupdata
value. By default, boxchart vertically orients the box charts and
displays the xgroupdata values along the
x-axis. You can change the box chart orientation by using the
Orientation property.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | categorical
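Because xgroupdata requires ydata to be a vector, one possible way to place the columns of a matrix at chosen x positions is to stack the columns into a single vector and build a matching grouping vector. The following is a minimal sketch of that idea; the matrix Y and the positions in xpos are made up for illustration.

```matlab
Y = magic(6);                  % illustrative 6-by-6 sample matrix
xpos = [1 2 3 7 8 9];          % hypothetical x position for each column of Y

ydata = Y(:);                               % stack the columns into one vector
xgroupdata = repelem(xpos(:),size(Y,1));    % repeat each position once per row of Y

% Both inputs are now vectors of the same length, as boxchart requires.
boxchart(xgroupdata,ydata)
xlabel('Position')
ylabel('Value')
```

Each column of Y becomes one group, so the chart shows six boxes located at the values listed in xpos.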
cgroupdata — Color grouping variable
Color grouping variable, specified as a numeric vector, categorical vector, logical
vector, string array, character vector, or cell array of character vectors.
cgroupdata must have the same length as the vector
ydata; you cannot specify cgroupdata when
ydata is a matrix.
boxchart groups the data in ydata
according to the unique value combinations in xgroupdata and
cgroupdata. The function creates a box chart for each group of
data and assigns the same color to groups with the same cgroupdata
value.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | categorical | logical | string | char | cell
ax — Target axes
Axes object
Target axes, specified as an Axes object. If you do not specify the
axes, then boxchart uses the current axes
(gca).
Specify optional
comma-separated pairs of Name,Value arguments. Name is
the argument name and Value is the corresponding value.
Name must appear inside quotes. You can specify several name and value
pair arguments in any order as
Name1,Value1,...,NameN,ValueN.
Example: boxchart([rand(10,4); 4*rand(1,4)],'BoxFaceColor',[0 0.5 0],'MarkerColor',[0 0.5 0]) creates box charts with green boxes and green outliers, if applicable.
The BoxChart properties listed here are only a subset. For a complete list, see BoxChart Properties.
'BoxFaceColor' — Box color
Box color, specified as an RGB triplet, hexadecimal color code, color name, or short name.
For a custom color, specify an RGB triplet or a hexadecimal color code.
An RGB triplet is a three-element row vector whose elements specify the intensities of the red, green, and blue components of the color. The intensities must be in the range [0,1]; for example, [0.4 0.6 0.7].
A hexadecimal color code is a character vector or a string scalar that starts with a hash symbol (#) followed by three or six hexadecimal digits, which can range from 0 to F. The values are not case sensitive. Thus, the color codes '#FF8800', '#ff8800', '#F80', and '#f80' are equivalent.
Alternatively, you can specify some common colors by name. This table lists the named color options, the equivalent RGB triplets, and hexadecimal color codes.
| Color Name | Short Name | RGB Triplet | Hexadecimal Color Code |
|---|---|---|---|
| 'red' | 'r' | [1 0 0] | '#FF0000' |
| 'green' | 'g' | [0 1 0] | '#00FF00' |
| 'blue' | 'b' | [0 0 1] | '#0000FF' |
| 'cyan' | 'c' | [0 1 1] | '#00FFFF' |
| 'magenta' | 'm' | [1 0 1] | '#FF00FF' |
| 'yellow' | 'y' | [1 1 0] | '#FFFF00' |
| 'black' | 'k' | [0 0 0] | '#000000' |
| 'white' | 'w' | [1 1 1] | '#FFFFFF' |
| 'none' | Not applicable | Not applicable | Not applicable |

The value 'none' specifies no color.
Here are the RGB triplets and hexadecimal color codes for the default colors MATLAB® uses in many types of plots.
| RGB Triplet | Hexadecimal Color Code |
|---|---|
| [0 0.4470 0.7410] | '#0072BD' |
| [0.8500 0.3250 0.0980] | '#D95319' |
| [0.9290 0.6940 0.1250] | '#EDB120' |
| [0.4940 0.1840 0.5560] | '#7E2F8E' |
| [0.4660 0.6740 0.1880] | '#77AC30' |
| [0.3010 0.7450 0.9330] | '#4DBEEE' |
| [0.6350 0.0780 0.1840] | '#A2142F' |
Example: b = boxchart(rand(10,1),'BoxFaceColor','red')
Example: b.BoxFaceColor = [0 0.5 0.5];
Example: b.BoxFaceColor = '#EDB120';
'MarkerStyle' — Outlier style
'o' (default) | '+' | '*' | '.' | 'x' | ...
Outlier style, specified as one of the options listed in this table.
| Value | Description |
|---|---|
| 'o' | Circle |
| '+' | Plus sign |
| '*' | Asterisk |
| '.' | Point |
| 'x' | Cross |
| '_' | Horizontal line |
| '\|' | Vertical line |
| 'square' or 's' | Square |
| 'diamond' or 'd' | Diamond |
| '^' | Upward-pointing triangle |
| 'v' | Downward-pointing triangle |
| '>' | Right-pointing triangle |
| '<' | Left-pointing triangle |
| 'pentagram' or 'p' | Five-pointed star (pentagram) |
| 'hexagram' or 'h' | Six-pointed star (hexagram) |
| 'none' | No markers |
Example: b = boxchart([rand(10,1);2],'MarkerStyle','x')
Example: b.MarkerStyle = 'x';
'JitterOutliers' — Outlier marker displacement
'off' (default) | on/off logical value
Outlier marker displacement, specified as 'on' or 'off', or as numeric or logical 1 (true) or 0 (false). A value of 'on' is equivalent to true, and 'off' is equivalent to false. Thus, you can use the value of this property as a logical value. The value is stored as an on/off logical value of type matlab.lang.OnOffSwitchState.
If you set the JitterOutliers property to
'on', then boxchart randomly displaces the
outlier markers along the XData direction to help you distinguish
between outliers that have similar ydata values. For an example,
see Visualize and Find Outliers.
Example: b = boxchart([rand(20,1);2;2;2],'JitterOutliers','on')
Example: b.JitterOutliers = 'on';
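The on/off logical behavior described above can be illustrated with a short sketch. The sample data is made up, and the variable names jitterState and message are hypothetical.

```matlab
rng(0)                                  % make the made-up sample repeatable
b = boxchart([rand(20,1); 2; 2; 2]);    % hypothetical data with repeated large values

b.JitterOutliers = true;                % logical true is accepted in place of 'on'

% The stored value reads back as an on/off state.
jitterState = char(b.JitterOutliers);

% The property value can also be used directly as a logical condition.
if b.JitterOutliers
    message = 'Outlier jitter is enabled.';
else
    message = 'Outlier jitter is disabled.';
end
disp(message)
```

The same pattern should apply to other on/off properties on this page, such as Notch.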
'Notch' — Median comparison display
'off' (default) | on/off logical value
Median comparison display, specified as 'on' or 'off', or as numeric or logical 1 (true) or 0 (false). A value of 'on' is equivalent to true, and 'off' is equivalent to false. Thus, you can use the value of this property as a logical value. The value is stored as an on/off logical value of type matlab.lang.OnOffSwitchState.
If you set the Notch property to 'on', then
boxchart creates a tapered, shaded region around each median.
Box charts whose notches do not overlap have different medians at the 5% significance
level. For more information, see Box Chart (Box Plot).
Notches can extend beyond the lower and upper quartiles.
Example: b = boxchart(rand(10,2),'Notch','on')
Example: b.Notch = 'on';
'Orientation' — Orientation of box charts
'vertical' (default) | 'horizontal'
Orientation of box charts, specified as 'vertical' or 'horizontal'. By default, the box charts are vertically oriented, so that the ydata statistics are aligned with the
y-axis. Regardless of the orientation,
boxchart stores the ydata values in the
YData property of the BoxChart
object.
Example: b = boxchart(rand(10,1),'Orientation','horizontal')
Example: b.Orientation = 'horizontal';
b — Box charts
BoxChart objects
Box charts, returned as a vector of BoxChart objects.
b contains one BoxChart object for each unique
value in cgroupdata. For more information, see BoxChart Properties.
A box chart, or box plot, provides a visual representation of summary statistics for a data sample. Given numeric data, the corresponding box chart displays the following information: the median, the lower and upper quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers.
The line inside of each box is the sample median. You can compute the value of the
median using the median function.
The top and bottom edges of each box are the upper and lower quartiles, respectively. The distance between the top and bottom edges is the interquartile range (IQR).
For more information on how the quartiles are computed, see quantile Algorithms (Statistics and Machine Learning Toolbox), where the upper quartile
corresponds to the 0.75 quantile and the lower quartile corresponds to the 0.25
quantile. To use the quantile function, you must have a
Statistics and Machine Learning Toolbox™ license.
Outliers are values that are more than 1.5 · IQR away from the top or bottom of the box. By default,
boxchart displays each outlier using an 'o'
symbol. The outlier computation is comparable to that of the isoutlier function with the 'quartiles'
method.
The whiskers are lines that extend above and below each box. One whisker connects the upper quartile to the nonoutlier maximum (the maximum value that is not an outlier), and the other connects the lower quartile to the nonoutlier minimum (the minimum value that is not an outlier).
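As a rough illustration of these definitions, the following sketch flags outliers with isoutlier and takes the whisker endpoints from the remaining values. The sample vector is made up, and because the two outlier computations are described only as comparable, treat the result as an approximation of what boxchart draws rather than a guaranteed match.

```matlab
ydata = [1 2 3 4 5 6 7 8 9 100];          % made-up sample with one extreme value

% Flag values more than 1.5 interquartile ranges outside the quartiles.
isOut = isoutlier(ydata,'quartiles');
outlierValues = ydata(isOut);

% Whisker endpoints: the smallest and largest values that are not outliers.
nonoutlierMin = min(ydata(~isOut));
nonoutlierMax = max(ydata(~isOut));
```

In this sample, only the value 100 lies far enough from the box to be flagged, so the whiskers would end at the smallest and largest of the remaining values.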
Notches help you compare sample medians across multiple box charts. When you
specify 'Notch','on', the boxchart function
creates a tapered, shaded region around each median. Box charts whose notches do not
overlap have different medians at the 5% significance level. The significance level is
based on a normal distribution assumption, but the median comparison is reasonably
robust for other distributions.
The top and bottom edges of the notch region correspond to m + (1.57 · IQR)/√n and m − (1.57 · IQR)/√n, respectively, where m is the median, IQR is the interquartile range, and n is the number of data points, excluding NaN values.
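The notch arithmetic can be worked through by hand. This is a minimal sketch that assumes the notch half-width is 1.57 · IQR/√n and that the quartiles follow the usual quantile convention, in which the k-th sorted value is treated as the (k − 0.5)/n quantile with linear interpolation in between. The sample data and all variable names are made up, and the quartiles are computed with interp1 so that no toolbox function is needed.

```matlab
ydata = [(1:10)'; NaN];               % made-up sample containing one NaN value

ys = sort(ydata(~isnan(ydata)));      % discard NaN values, then sort
n  = numel(ys);                       % number of data points, excluding NaN
m  = median(ys);                      % sample median

% Quartiles by linear interpolation between the (k-0.5)/n sample quantiles.
q = interp1(((1:n)' - 0.5)/n, ys, [0.25 0.75]);
iqrValue = q(2) - q(1);               % interquartile range

halfWidth   = 1.57*iqrValue/sqrt(n);  % assumed notch half-width
notchTop    = m + halfWidth;
notchBottom = m - halfWidth;
```

For this sample, the median is 5.5 and the interquartile range is 5, so the notch spans roughly 3.02 to 7.98.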

You can add two types of data tips to a BoxChart object: one for
each box chart and one for each outlier. A general data tip appears at the nonoutlier
maximum value, regardless of where you click on the box chart.

Note
The displayed Num Points value includes NaN
values in the corresponding ydata, but
boxchart discards the NaN values before
computing the box chart statistics.
You can use the datatip
function to add more data tips to a BoxChart object, but the indexing
of data tips differs from other charts. boxchart first assigns
indices to the box charts and then assigns indices to the outliers. For example, if a
BoxChart object b displays two box charts and one
outlier, datatip(b,'DataIndex',3) creates a data tip at the outlier
point.
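The indexing rule can be sketched with a small made-up data set that produces two box charts and a single outlier. The data values are hypothetical; the datatip call is the one quoted above.

```matlab
% Two groups of ten values each; only the first group contains an extreme value.
xgroupdata = [ones(10,1); 2*ones(10,1)];
ydata = [(1:9)'; 100; (1:10)'];

b = boxchart(xgroupdata,ydata);       % one BoxChart object showing two boxes

% Indices 1 and 2 refer to the two box charts, so index 3 is the first outlier.
dt = datatip(b,'DataIndex',3);
```

With more box charts or more outliers, the outlier indices would start after the last box chart index in the same way.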