MATLAB Examples

Perform One-Way ANOVA

This example shows how to perform one-way ANOVA to determine whether data from several groups have a common mean.

Load and display the sample data.

load hogg
hogg
hogg =

    24    14    11     7    19
    15     7     9     7    24
    21    12     7     4    19
    27    17    13     7    15
    33    14    12    12    10
    23    16    18    18    20

The data comes from a Hogg and Ledolter (1987) study on bacteria counts in shipments of milk. The columns of the matrix hogg represent different shipments. The rows are bacteria counts from cartons of milk chosen randomly from each shipment.

Test whether some shipments have higher counts than others. By default, anova1 displays two figures: the standard ANOVA table and box plots of the data by group.

[p,tbl,stats] = anova1(hogg);
p
p =

   1.1971e-04

The small p-value of about 0.0001 indicates that the mean bacteria counts from the different shipments are not all the same.

You can get some graphical assurance that the means are different by looking at the box plots. The notches, however, compare the medians, not the means. For more information on this display, see boxplot.
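Because the notches compare medians rather than means, it can help to look at both summaries side by side. The following is a minimal sketch that uses only base MATLAB functions; the matrix is retyped from the display above so the snippet runs on its own, and the variable names shipmentMeans and shipmentMedians are illustrative.

```matlab
% Bacteria counts copied from the hogg matrix displayed above
% (rows = cartons, columns = shipments).
hogg = [24 14 11  7 19
        15  7  9  7 24
        21 12  7  4 19
        27 17 13  7 15
        33 14 12 12 10
        23 16 18 18 20];

shipmentMeans   = mean(hogg,1);     % quantity tested by anova1
shipmentMedians = median(hogg,1);   % quantity compared by the box plot notches

disp([shipmentMeans; shipmentMedians])
```

The first row of the displayed matrix holds the means and the second row the medians, one column per shipment.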

View the standard ANOVA table. anova1 saves the standard ANOVA table as a cell array in the output argument tbl.

tbl
tbl =

  4x6 cell array

  Columns 1 through 5

    {'Source' }    {'SS'        }    {'df'}    {'MS'      }    {'F'       }
    {'Columns'}    {[  803.0000]}    {[ 4]}    {[200.7500]}    {[  9.0076]}
    {'Error'  }    {[  557.1667]}    {[25]}    {[ 22.2867]}    {0x0 double}
    {'Total'  }    {[1.3602e+03]}    {[29]}    {0x0 double}    {0x0 double}

  Column 6

    {'Prob>F'    }
    {[1.1971e-04]}
    {0x0 double  }
    {0x0 double  }

Save the F-statistic value in the variable Fstat.

Fstat = tbl{2,5}
Fstat =

    9.0076
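To see where the entries of the ANOVA table come from, the sums of squares, mean squares, and F-statistic can be recomputed by hand. This is a minimal sketch, not how anova1 is implemented internally. It assumes a balanced design (the same number of cartons in every shipment, as in this data set), and the variable names are illustrative. The matrix is retyped from the display above so the snippet runs on its own.

```matlab
% Bacteria counts copied from the hogg matrix displayed above
% (rows = cartons, columns = shipments).
hogg = [24 14 11  7 19
        15  7  9  7 24
        21 12  7  4 19
        27 17 13  7 15
        33 14 12 12 10
        23 16 18 18 20];

[n,k] = size(hogg);          % n cartons per shipment, k shipments
N = n*k;                     % total number of observations

groupMeans = mean(hogg,1);   % one mean per shipment (column)
grandMean  = mean(hogg(:));  % mean of all observations

% Between-group ("Columns") and within-group ("Error") sums of squares.
% Both formulas assume every column has the same number of rows.
ssColumns = n*sum((groupMeans - grandMean).^2);
ssError   = sum((n-1)*var(hogg,0,1));

dfColumns = k - 1;           % degrees of freedom between groups
dfError   = N - k;           % degrees of freedom within groups

msColumns = ssColumns/dfColumns;
msError   = ssError/dfError;
Fmanual   = msColumns/msError;

% Upper-tail probability of the F distribution. fcdf requires
% Statistics and Machine Learning Toolbox, as anova1 does.
pManual = fcdf(Fmanual,dfColumns,dfError,'upper');
```

Up to rounding, ssColumns, ssError, Fmanual, and pManual should agree with the 'SS', 'F', and 'Prob>F' entries of tbl shown above.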

View the statistics necessary to make a multiple pairwise comparison of group means. anova1 saves these statistics in the structure stats.

stats
stats = 

  struct with fields:

    gnames: [5x1 char]
         n: [6 6 6 6 6]
    source: 'anova1'
     means: [23.8333 13.3333 11.6667 9.1667 17.8333]
        df: 25
         s: 4.7209
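The fields of stats relate directly to the data and to the ANOVA table: means holds the column means, df is the error degrees of freedom, and s is the square root of the error mean square. The following sketch checks those relationships. It retypes the data so that it runs on its own, and it passes 'off' as the third argument of anova1 to suppress the figures. The variable name rmse is illustrative.

```matlab
% Bacteria counts copied from the hogg matrix displayed above
% (rows = cartons, columns = shipments).
hogg = [24 14 11  7 19
        15  7  9  7 24
        21 12  7  4 19
        27 17 13  7 15
        33 14 12 12 10
        23 16 18 18 20];

% Empty group argument: each column is treated as a group.
% 'off' suppresses the ANOVA table figure and the box plots.
[~,tbl,stats] = anova1(hogg,[],'off');

rmse = sqrt(tbl{3,4});   % row 3 is 'Error', column 4 is 'MS'

fprintf('stats.s = %.4f, sqrt(MS error) = %.4f\n',stats.s,rmse);
fprintf('stats.df = %d, error df in tbl = %d\n',stats.df,tbl{3,3});
```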

ANOVA rejects the null hypothesis that all group means are equal, so you can use multiple comparisons to determine which group means differ from the others. To conduct multiple comparison tests, use the function multcompare, which accepts stats as an input argument. In this example, anova1 rejects the null hypothesis that the mean bacteria counts from all five shipments are equal to each other, i.e., $H_{0}: \mu_{1} = \mu_{2} = \mu_{3} = \mu_{4} = \mu_{5}$.

Perform a multiple comparison test to determine which shipments differ from the others in terms of mean bacteria counts.

multcompare(stats)
ans =

    1.0000    2.0000    2.4953   10.5000   18.5047    0.0059
    1.0000    3.0000    4.1619   12.1667   20.1714    0.0013
    1.0000    4.0000    6.6619   14.6667   22.6714    0.0001
    1.0000    5.0000   -2.0047    6.0000   14.0047    0.2119
    2.0000    3.0000   -6.3381    1.6667    9.6714    0.9719
    2.0000    4.0000   -3.8381    4.1667   12.1714    0.5544
    2.0000    5.0000  -12.5047   -4.5000    3.5047    0.4806
    3.0000    4.0000   -5.5047    2.5000   10.5047    0.8876
    3.0000    5.0000  -14.1714   -6.1667    1.8381    0.1905
    4.0000    5.0000  -16.6714   -8.6667   -0.6619    0.0292

The first two columns show which group means are compared with each other. For example, the first row compares the means for groups 1 and 2. The fourth column shows the estimated difference between the two group means, and the third and fifth columns show the lower and upper limits of the 95% confidence interval for that difference. The last column shows the p-values for the tests. The p-values 0.0059, 0.0013, and 0.0001 indicate that the mean bacteria count in the milk from the first shipment is different from the mean counts in the second, third, and fourth shipments. The p-value of 0.0292 indicates that the mean bacteria count in the milk from the fourth shipment is different from the mean count in the fifth. The procedure fails to reject the hypotheses that the other pairs of group means are equal.
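Rather than reading the significant pairs off the matrix by eye, you can filter the rows on the p-value column. This is a minimal sketch. It retypes the data so that it runs on its own, suppresses the figures with 'off' in anova1 and the 'Display','off' option of multcompare, and uses a 0.05 significance level as an assumption. The names alpha and sigPairs are illustrative.

```matlab
% Bacteria counts copied from the hogg matrix displayed above
% (rows = cartons, columns = shipments).
hogg = [24 14 11  7 19
        15  7  9  7 24
        21 12  7  4 19
        27 17 13  7 15
        33 14 12 12 10
        23 16 18 18 20];

[~,~,stats] = anova1(hogg,[],'off');      % no figures
c = multcompare(stats,'Display','off');   % no interactive plot

alpha = 0.05;                             % assumed significance level

% Keep the group indices (columns 1 and 2) and the p-value (column 6)
% for every pair whose p-value is below alpha.
sigPairs = c(c(:,6) < alpha,[1 2 6]);

disp(sigPairs)
```

With the p-values shown above, the rows that remain are the pairs (1,2), (1,3), (1,4), and (4,5).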

The figure that multcompare displays illustrates the same result. The blue bar shows the comparison interval for the first group mean, which does not overlap with the comparison intervals for the second, third, and fourth group means, shown in red. The comparison interval for the mean of the fifth group, shown in gray, overlaps with the comparison interval for the first group mean. Hence, the means of the first and fifth groups are not significantly different from each other.