Multiple comparison test


  • c = multcompare(stats)
  • c = multcompare(stats,Name,Value)
  • [c,m] = multcompare(___)
  • [c,m,h] = multcompare(___)
  • [c,m,h,gnames] = multcompare(___)



c = multcompare(stats) returns a matrix c of the pairwise comparison results from a multiple comparison test using the information contained in the stats structure. multcompare also displays an interactive graph of the estimates and comparison intervals. Each group mean is represented by a symbol, and the interval is represented by a line extending out from the symbol. Two group means are significantly different if their intervals are disjoint; they are not significantly different if their intervals overlap. If you use your mouse to select any group, then the graph will highlight all other groups that are significantly different, if any.

c = multcompare(stats,Name,Value) returns a matrix of pairwise comparison results, c, using additional options specified by one or more Name,Value pair arguments. For example, you can specify the significance level, or the type of critical value to use in the multiple comparison.

[c,m] = multcompare(___) also returns a matrix, m, which contains estimated values of the means (or whatever statistics are being compared) for each group and the corresponding standard errors. You can use any of the previous syntaxes.

[c,m,h] = multcompare(___) also returns a handle, h, to the comparison graph.


[c,m,h,gnames] = multcompare(___) also returns a cell array, gnames, which contains the names of the groups.



Multiple Comparison of Group Means

Load the sample data.

load carsmall

Perform a one-way analysis of variance (ANOVA) to see if there is any difference between the mileage of the cars by origin.

[p,t,stats] = anova1(MPG,Origin,'off');

Perform a multiple comparison of the group means.

[c,m,h,nms] = multcompare(stats);

multcompare displays the estimates with comparison intervals around them. Click the estimate for any country to highlight the countries whose mean mileages differ significantly from it.

Now display the mean estimates and the standard errors with the corresponding group names.

[nms num2cell(m)]
ans = 

    'USA'        [21.1328]    [0.8814]
    'Japan'      [31.8000]    [1.8206]
    'Germany'    [28.4444]    [2.3504]
    'France'     [23.6667]    [4.0711]
    'Sweden'     [22.5000]    [4.9860]
    'Italy'      [     28]    [7.0513]

Multiple Comparison of Material Strength

Input the data.

strength = [82 86 79 83 84 85 86 87 74 82 ...
            78 75 76 77 79 79 77 78 82 79];
alloy = {'st','st','st','st','st','st','st','st',...
         'al1','al1','al1','al1','al1','al1',...
         'al2','al2','al2','al2','al2','al2'};
The data are from a study of the strength of structural beams in Hogg (1987). The vector strength measures deflections of beams in thousandths of an inch under 3,000 pounds of force. The vector alloy identifies each beam as steel ('st'), alloy 1 ('al1'), or alloy 2 ('al2'). Although alloy is sorted in this example, grouping variables do not need to be sorted.

First perform one-way ANOVA.

[p,a,s] = anova1(strength,alloy);

The small p-value suggests that the mean strength is not the same for all three groups of beams.

Now, perform a multiple comparison of the mean strength of the beams.

[c,m,h,nms] = multcompare(s);

Display the comparison results with the corresponding group names.

[nms(c(:,1)), nms(c(:,2)), num2cell(c(:,3:6))]
ans = 

    'st'     'al1'    [ 3.6064]    [ 7]    [10.3936]    [1.6831e-04]
    'st'     'al2'    [ 1.6064]    [ 5]    [ 8.3936]    [    0.0040]
    'al1'    'al2'    [-5.6280]    [-2]    [ 1.6280]    [    0.3560]

The third row of the output matrix shows that the difference in strength between the two alloys is not significant. A 95% confidence interval for the difference is [-5.6, 1.6], so you cannot reject the hypothesis that the true difference is zero. This is also confirmed by the corresponding p-value of 0.3560 in the sixth column.

The first two rows show that both comparisons involving the first group (steel) have confidence intervals that do not include zero, and the corresponding p-values (1.6831e-04 and 0.0040, respectively) are small. In other words, those differences are significant. The graph shows the same information.
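To make the comparison more conservative, you can rerun the test with Bonferroni critical values and suppress the figure, using the 'CType' and 'Display' options described under Name-Value Pair Arguments below. For example:

```matlab
% Bonferroni-adjusted comparison of the same beam data, without the graph
cb = multcompare(s,'CType','bonferroni','Display','off');
```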

Input Arguments


stats — Test data
structure

Test data, specified as a structure. You can create a structure using one of the following functions:

  • anova1 — One-way analysis of variance.

  • anova2 — Two-way analysis of variance.

  • anovan — N-way analysis of variance.

  • aoctool — Interactive analysis of covariance tool.

  • friedman — Friedman's test.

  • kruskalwallis — Kruskal-Wallis test.

multcompare does not support multiple comparisons using anovan output for a model that includes random or nested effects. The calculations for a random effects model produce a warning that all effects are treated as fixed. Nested models are not accepted.

Data Types: struct

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Alpha',0.01,'CType','bonferroni','Display','off' computes the Bonferroni critical values, conducts the hypothesis tests at the 1% significance level, and omits the interactive display.

'Alpha' — Significance level
0.05 (default) | scalar value in the range (0,1)

Significance level of the multiple comparison test, specified as the comma-separated pair consisting of 'Alpha' and a scalar value in the range (0,1). The value specified for 'Alpha' determines the 100 × (1 – α)% confidence level of the intervals returned in the matrix c and in the figure.

Example: 'Alpha',0.01

Data Types: single | double

'CType' — Type of critical value
'tukey-kramer' (default) | 'hsd' | 'lsd' | 'bonferroni' | 'dunn-sidak' | 'scheffe'

Type of critical value to use for the multiple comparison, specified as the comma-separated pair consisting of 'CType' and one of the following.

'tukey-kramer' or 'hsd'

Tukey's honestly significant difference criterion. This is based on the Studentized range distribution. It is optimal for balanced one-way ANOVA and similar procedures with equal sample sizes. It has been proven to be conservative for one-way ANOVA with different sample sizes. According to the unproven Tukey-Kramer conjecture, it is also accurate for problems where the quantities being compared are correlated, as in analysis of covariance with unbalanced covariate values.


'bonferroni'

Critical values from the t distribution, after a Bonferroni adjustment to compensate for multiple comparisons. This procedure is conservative, but usually less so than the Scheffé procedure.


'dunn-sidak'

Critical values from the t distribution, after an adjustment for multiple comparisons that was proposed by Dunn and proved accurate by Sidák. This procedure is similar to, but less conservative than, the Bonferroni procedure.


'lsd'

Tukey's least significant difference procedure. This procedure is a simple t-test. It is reasonable if the preliminary test (say, the one-way ANOVA F statistic) shows a significant difference. If it is used unconditionally, it provides no protection against multiple comparisons.


'scheffe'

Critical values from Scheffé's S procedure, derived from the F distribution. This procedure provides a simultaneous confidence level for comparisons of all linear combinations of the means. It is conservative for comparisons of simple differences of pairs.

Example: 'CType','bonferroni'

'Display' — Display toggle
'on' (default) | 'off'

Display toggle, specified as the comma-separated pair consisting of 'Display' and either 'on' or 'off'. If you specify 'on', then multcompare displays a graph of the estimates and their comparison intervals. If you specify 'off', then multcompare omits the graph.

Example: 'Display','off'

'Dimension' — Dimension over which to calculate marginal means
1 (default) | positive integer value | vector of positive integer values

Dimension or dimensions over which to calculate the population marginal means, specified as the comma-separated pair consisting of 'Dimension' and a positive integer value or a vector of such values. Use the 'Dimension' name-value pair only if you create the input structure stats using the function anovan.

For example, if you specify 'Dimension' as 1, then multcompare compares the means for each value of the first grouping variable, adjusted by removing effects of the other grouping variables as if the design were balanced. If you specify 'Dimension' as [1,3], then multcompare computes the population marginal means for each combination of the first and third grouping variables, removing effects of the second grouping variable. If you fit a singular model, some cell means may not be estimable, and any population marginal means that depend on those cell means have the value NaN.

Population marginal means are described by Milliken and Johnson (1992) and by Searle, Speed, and Milliken (1980). The idea behind population marginal means is to remove any effect of an unbalanced design by fixing the values of the factors specified by 'Dimension', and averaging out the effects of other factors as if each factor combination occurred the same number of times. The definition of population marginal means does not depend on the number of observations at each factor combination. For designed experiments where the number of observations at each factor combination has no meaning, population marginal means can be easier to interpret than simple means ignoring other factors. For surveys and other studies where the number of observations at each combination does have meaning, population marginal means may be harder to interpret.

Example: 'Dimension',[1,3]

Data Types: single | double
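As a sketch of how 'Dimension' works with anovan output — this example assumes the carbig sample data set, whose org and when variables group the cars by origin and by manufacturing era:

```matlab
load carbig
% Two-way ANOVA of mileage by origin and era, including the interaction
[~,~,stats] = anovan(MPG,{org when},'model','interaction',...
                     'varnames',{'Origin','Era'},'display','off');
% Compare population marginal means over each combination of both factors
c = multcompare(stats,'Dimension',[1 2]);
```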

'Estimate' — Estimates to be compared
'column' (default) | 'row' | 'slope' | 'intercept' | 'pmm'

Estimates to be compared, specified as the comma-separated pair consisting of 'Estimate' and an allowable value. The allowable values for 'Estimate' depend on the function used to generate the input structure stats, according to the following table.


anova1

None. This name-value pair is ignored, and multcompare always compares the group means.

anova2

Either 'column' to compare column means, or 'row' to compare row means.

anovan

None. This name-value pair is ignored, and multcompare always compares the population marginal means as specified by the 'Dimension' name-value pair argument.

aoctool

Either 'slope', 'intercept', or 'pmm' to compare slopes, intercepts, or population marginal means, respectively. If the analysis of covariance model did not include separate slopes, then 'slope' is not allowed. If it did not include separate intercepts, then no comparisons are possible.

friedman

None. This name-value pair is ignored, and multcompare always compares the average column ranks.

kruskalwallis

None. This name-value pair is ignored, and multcompare always compares the average group ranks.

Example: 'Estimate','row'

Output Arguments


c — Matrix of multiple comparison results
matrix of scalar values

Matrix of multiple comparison results, returned as a p-by-6 matrix of scalar values, where p is the number of pairs of groups. Each row of the matrix contains the result of one paired comparison test. Columns 1 and 2 contain the indices of the two samples being compared. Column 3 contains the lower limit of the confidence interval for the difference, column 4 contains the estimated difference, and column 5 contains the upper limit of the confidence interval. Column 6 contains the p-value for the hypothesis test that the corresponding mean difference is equal to 0.

For example, suppose one row contains the following entries.

2.0000  5.0000  1.9442  8.2206  14.4971 0.0432

These numbers indicate that the mean of group 2 minus the mean of group 5 is estimated to be 8.2206, and a 95% confidence interval for the true difference of the means is [1.9442, 14.4971]. The p-value for the corresponding test that this difference is zero is 0.0432.

In this example the confidence interval does not contain 0, so the difference is significant at the 5% significance level. If the confidence interval did contain 0, the difference would not be significant. The p-value of 0.0432, being less than 0.05, also indicates that the difference of the means of groups 2 and 5 is significant.
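Because column 6 holds the p-values and columns 1 and 2 hold the group indices, you can, for example, list only the pairs whose difference is significant at the 5% level:

```matlab
% Indices (columns 1-2) of the group pairs with p-values below 0.05
sigPairs = c(c(:,6) < 0.05, 1:2);
```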

m — Matrix of estimates
matrix of scalar values

Matrix of the estimates, returned as a matrix of scalar values. The first column of m contains the estimated values of the means (or whatever statistics are being compared) for each group, and the second column contains their standard errors.

h — Handle to the figure
handle

Handle to the figure containing the interactive graph, returned as a handle. The title of this graph contains instructions for interacting with the graph, and the x-axis label contains information about which means are significantly different from the selected mean. If you plan to use this graph for presentation, you may want to omit the title and the x-axis label. You can remove them using interactive features of the graph window, or you can use the following commands.
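For example, the following commands clear the title and the x-axis label of the current figure:

```matlab
title('')     % remove the instructional title
xlabel('')    % remove the x-axis label
```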


gnames — Group names
cell array of strings

Group names, returned as a cell array of strings. Each row of gnames contains the name of a group.

More About


Multiple Comparison Tests

A one-way analysis of variance compares the means of several groups to test the hypothesis that they are all equal, against the general alternative that they are not all equal. Sometimes this alternative may be too general. You may need information about which pairs of means are significantly different, and which are not. A multiple comparison test can provide this information.

When you perform a simple t-test of one group mean against another, you specify a significance level that determines the cutoff value of the t-statistic. For example, you can specify the value alpha = 0.05 to ensure that when there is no real difference, you will incorrectly find a significant difference no more than 5% of the time. When there are many group means, there are also many pairs to compare. If you applied an ordinary t-test in this situation, the alpha value would apply to each comparison, so the chance of incorrectly finding a significant difference would increase with the number of comparisons. Multiple comparison procedures are designed to provide an upper bound on the probability that any comparison will be incorrectly found significant.
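For example, with six group means there are 15 pairwise comparisons, and uncorrected t-tests at alpha = 0.05 give a much larger chance of at least one false positive (treating the comparisons as independent for the sake of the arithmetic):

```matlab
alpha = 0.05;
k = nchoosek(6,2);          % 15 pairwise comparisons among 6 groups
fwer = 1 - (1 - alpha)^k    % chance of at least one false positive, about 0.54
```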


References

[1] Hochberg, Y., and A. C. Tamhane. Multiple Comparison Procedures. Hoboken, NJ: John Wiley & Sons, 1987.

[2] Hogg, R. V., and J. Ledolter. Engineering Statistics. New York: MacMillan, 1987.

[3] Milliken, G. A., and D. E. Johnson. Analysis of Messy Data, Volume 1: Designed Experiments. Boca Raton, FL: Chapman & Hall/CRC Press, 1992.

[4] Searle, S. R., F. M. Speed, and G. A. Milliken. "Population marginal means in the linear model: an alternative to least-squares means." American Statistician. 1980, pp. 216–221.
