# anova

Analysis of variance for between-subject effects in a repeated measures model

## Description

## Examples

### Analysis of Variance for Average Response

Load the sample data.

`load fisheriris`

The column vector `species` consists of iris flowers of three different species: setosa, versicolor, and virginica. The double matrix `meas` consists of four types of measurements on the flowers: the length and width of sepals and petals in centimeters, respectively.

Store the data in a table array.

`t = table(species,meas(:,1),meas(:,2),meas(:,3),meas(:,4),'VariableNames',{'species','meas1','meas2','meas3','meas4'});`

`Meas = table([1 2 3 4]','VariableNames',{'Measurements'});`

Fit a repeated measures model where the measurements are the responses and the species is the predictor variable.

`rm = fitrm(t,'meas1-meas4~species','WithinDesign',Meas);`

Perform analysis of variance.

`anova(rm)`

`ans =` *3×7 table*

Within | Between | SumSq | DF | MeanSq | F | pValue |
---|---|---|---|---|---|---|
Constant | constant | 7201.7 | 1 | 7201.7 | 19650 | 2.0735e-158 |
Constant | species | 309.61 | 2 | 154.8 | 422.39 | 1.1517e-61 |
Constant | Error | 53.875 | 147 | 0.36649 | | |

There are 150 observations and 3 species. The degrees of freedom for species is 3 - 1 = 2, and for error it is 150 - 3 = 147. The small *p*-value of 1.1517e-61 indicates that the average measurement differs significantly across species.
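As a quick check of the table arithmetic, the following sketch recomputes the mean squares, the F-statistic, and the *p*-value for species from the `SumSq` and `DF` entries above. The variable names are illustrative, and `fcdf` with the `'upper'` option is assumed to be available from Statistics and Machine Learning Toolbox.

```matlab
% Values copied from the species and Error rows of the table above.
ssSpecies = 309.61;   dfSpecies = 3 - 1;      % 3 species
ssError   = 53.875;   dfError   = 150 - 3;    % 150 observations

% Mean square = sum of squares divided by degrees of freedom.
msSpecies = ssSpecies/dfSpecies;
msError   = ssError/dfError;

% F-statistic and its upper-tail p-value.
F = msSpecies/msError;
p = fcdf(F,dfSpecies,dfError,'upper');

fprintf('MeanSq = %.5g, F = %.5g, p = %.4g\n',msSpecies,F,p)
```

Because the inputs are the rounded table entries, the results agree with the tabulated `MeanSq`, `F`, and `pValue` only up to rounding.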

### Panel Data

Load the sample panel data.

`load('panelData.mat');`

The dataset array, `panelData`, contains yearly observations on eight cities for 6 years. The first variable, `Growth`, measures economic growth (the response variable). The second and third variables are city and year indicators, respectively. The last variable, `Employ`, measures employment (the predictor variable). This is simulated data.

Store the data in a table array and define city as a nominal variable.

`t = table(panelData.Growth,panelData.City,panelData.Year,'VariableNames',{'Growth','City','Year'});`

Convert the data from long to wide format, with one row per city and one column per year, as repeated measures analysis requires.

`t = unstack(t,'Growth','Year','NewDataVariableNames',{'year1','year2','year3','year4','year5','year6'});`

Add the mean employment level over the years as a predictor variable to the table `t`.

`t(:,8) = table(grpstats(panelData.Employ,panelData.City));`

`t.Properties.VariableNames{'Var8'} = 'meanEmploy';`

Define the within-subjects variable.

`Year = [1 2 3 4 5 6]';`

Fit a repeated measures model, where the growth figures over the 6 years are the responses and the mean employment is the predictor variable.

`rm = fitrm(t,'year1-year6 ~ meanEmploy','WithinDesign',Year);`

Perform analysis of variance.

`anovatbl = anova(rm,'WithinModel',Year)`

`anovatbl =` *3×7 table*

Within | Between | SumSq | DF | MeanSq | F | pValue |
---|---|---|---|---|---|---|
Contrast1 | constant | 588.17 | 1 | 588.17 | 0.038495 | 0.85093 |
Contrast1 | meanEmploy | 3.7064e+05 | 1 | 3.7064e+05 | 24.258 | 0.0026428 |
Contrast1 | Error | 91675 | 6 | 15279 | | |
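In this call the within-subject model is the 6-by-1 numeric vector `Year`, so it acts as a single contrast (*nc* = 1) among the six repeated measures, which is consistent with every row being labeled `Contrast1`. The sketch below shows the response that such a contrast defines. The matrix `Y` is a made-up stand-in for the 8-by-6 growth figures, not the values in `panelData`, and any rescaling that `anova` might apply to the contrast internally is not shown.

```matlab
% Hypothetical stand-in for the 8-by-6 matrix of growth responses
% (8 cities in rows, 6 years in columns); not the values in panelData.
rng(0);
Y = randn(8,6);

% Within-subject contrast from the example: one column, so one contrast.
Year = [1 2 3 4 5 6]';

% Each city's response is a weighted sum of its six yearly values.
contrastResponse = Y*Year;      % 8-by-1, one value per city

disp(size(contrastResponse))
```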

### Longitudinal Data

Load the sample data.

`load('longitudinalData.mat');`

The matrix `Y` contains response data for 16 individuals. The response is the blood level of a drug measured at five time points (time = 0, 2, 4, 6, and 8). Each row of `Y` corresponds to an individual, and each column corresponds to a time point. The first eight subjects are female, and the second eight subjects are male. This is simulated data.

Define a variable that stores gender information.

`Gender = ['F' 'F' 'F' 'F' 'F' 'F' 'F' 'F' 'M' 'M' 'M' 'M' 'M' 'M' 'M' 'M']';`

Store the data in a table array, with one row per individual and one column per time point, as repeated measures analysis requires.

`t = table(Gender,Y(:,1),Y(:,2),Y(:,3),Y(:,4),Y(:,5),'VariableNames',{'Gender','t0','t2','t4','t6','t8'});`

Define the within-subjects variable.

`Time = [0 2 4 6 8]';`

Fit a repeated measures model, where blood levels are the responses and gender is the predictor variable.

`rm = fitrm(t,'t0-t8 ~ Gender','WithinDesign',Time);`

Perform analysis of variance.

`anovatbl = anova(rm)`

`anovatbl =` *3×7 table*

Within | Between | SumSq | DF | MeanSq | F | pValue |
---|---|---|---|---|---|---|
Constant | constant | 54702 | 1 | 54702 | 1079.2 | 1.1897e-14 |
Constant | Gender | 2251.7 | 1 | 2251.7 | 44.425 | 1.0693e-05 |
Constant | Error | 709.6 | 14 | 50.685 | | |

There are 2 genders and 16 observations, so the degrees of freedom for gender is (2 - 1) = 1 and for error it is (16 - 2) × (2 - 1) = 14. The small *p*-value of 1.0693e-05 indicates that there is a significant effect of gender on the blood level of the drug.

Repeat analysis of variance using orthogonal contrasts.

`anovatbl = anova(rm,'WithinModel','orthogonalcontrasts')`

`anovatbl =` *15×7 table*

Within | Between | SumSq | DF | MeanSq | F | pValue |
---|---|---|---|---|---|---|
Constant | constant | 54702 | 1 | 54702 | 1079.2 | 1.1897e-14 |
Constant | Gender | 2251.7 | 1 | 2251.7 | 44.425 | 1.0693e-05 |
Constant | Error | 709.6 | 14 | 50.685 | | |
Time | constant | 310.83 | 1 | 310.83 | 31.023 | 6.9065e-05 |
Time | Gender | 13.341 | 1 | 13.341 | 1.3315 | 0.26785 |
Time | Error | 140.27 | 14 | 10.019 | | |
Time^2 | constant | 565.42 | 1 | 565.42 | 98.901 | 1.0003e-07 |
Time^2 | Gender | 1.4076 | 1 | 1.4076 | 0.24621 | 0.62746 |
Time^2 | Error | 80.039 | 14 | 5.7171 | | |
Time^3 | constant | 2.6127 | 1 | 2.6127 | 1.4318 | 0.25134 |
Time^3 | Gender | 7.8853e-06 | 1 | 7.8853e-06 | 4.3214e-06 | 0.99837 |
Time^3 | Error | 25.546 | 14 | 1.8247 | | |
Time^4 | constant | 2.8404 | 1 | 2.8404 | 0.47924 | 0.50009 |
Time^4 | Gender | 2.9016 | 1 | 2.9016 | 0.48956 | 0.49559 |
Time^4 | Error | 82.977 | 14 | 5.9269 | | |

## Input Arguments

`rm`: Repeated measures model

`RepeatedMeasuresModel` object

Repeated measures model, specified as a `RepeatedMeasuresModel` object.

For properties and methods of this object, see `RepeatedMeasuresModel`.

`WM`: Within-subject model

`'separatemeans'` (default) | `'orthogonalcontrasts'` | character vector or string scalar defining a model specification | *r*-by-*nc* matrix specifying *nc* contrasts

Within-subject model, specified as one of the following:

- `'separatemeans'`: The response is the average of the repeated measures (average across the within-subject model).
- `'orthogonalcontrasts'`: This is valid when the within-subject model has a single numeric factor *T*. Responses are the average, the slope of centered *T*, and, in general, all orthogonal contrasts for a polynomial up to *T*^(*p* - 1), where *p* is the number of rows in the within-subject model. `anova` multiplies *Y*, the response you use in the repeated measures model `rm`, by the orthogonal contrasts, and uses the columns of the resulting product matrix as the responses. `anova` computes the orthogonal contrasts for *T* using the *Q* factor of a QR factorization of the Vandermonde matrix.
- A character vector or string scalar that defines a model specification in the within-subject factors. Responses are defined by the terms in that model. `anova` multiplies *Y*, the response matrix you use in the repeated measures model `rm`, by the terms of the model, and uses the columns of the result as the responses. For example, if there is a Time factor and `'Time'` is the model specification, then `anova` uses two terms, the constant and the uncentered Time term. The default is `'1'`, which performs the analysis on the average response.
- An *r*-by-*nc* matrix, *C*, specifying *nc* contrasts among the *r* repeated measures. If *Y* represents the matrix of repeated measures you use in the repeated measures model `rm`, then the output `tbl` contains a separate analysis of variance for each column of *Y*\**C*.

The `anova` table contains separate univariate analysis of variance results for each response.

**Example:** `'WithinModel','Time'`

**Example:** `'WithinModel','orthogonalcontrasts'`
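As an illustration of the matrix form, the sketch below passes a 4-by-2 contrast matrix to `anova` for the `fisheriris` model from the first example. The contrasts themselves (the sum of all four measurements, and sepal minus petal measurements) are chosen here for illustration and are not part of the original examples. The output is omitted because it has not been run for this page.

```matlab
load fisheriris
t = table(species,meas(:,1),meas(:,2),meas(:,3),meas(:,4), ...
    'VariableNames',{'species','meas1','meas2','meas3','meas4'});
Meas = table([1 2 3 4]','VariableNames',{'Measurements'});
rm = fitrm(t,'meas1-meas4~species','WithinDesign',Meas);

% Illustrative contrasts among the r = 4 repeated measures (nc = 2):
%   column 1: sum of all four measurements
%   column 2: sepal measurements minus petal measurements
C = [1  1;
     1  1;
     1 -1;
     1 -1];

% One univariate analysis of variance per column of Y*C.
tbl = anova(rm,'WithinModel',C)
```

Each contrast should contribute its own block of rows (constant, species, and Error) to the result, in the same way that the single contrast in the panel data example produces three rows.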

## Output Arguments

`anovatbl`: Results of analysis of variance

table

Results of analysis of variance for between-subject effects, returned as a table. The table includes all terms in the between-subjects model and the following columns.

Column Name | Definition |
---|---|
`Within` | Within-subject factors |
`Between` | Between-subject factors |
`SumSq` | Sum of squares |
`DF` | Degrees of freedom |
`MeanSq` | Mean square (`SumSq` divided by `DF`) |
`F` | F-statistic |
`pValue` | p-value corresponding to the F-statistic |

## More About

### Vandermonde Matrix

A Vandermonde matrix is a matrix whose columns are the powers of the vector *a*, that is, *V*(*i*,*j*) = *a*(*i*)^(*n* - *j*), where *n* is the length of *a*.
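For a concrete case, MATLAB's built-in `vander` function follows this definition. The sketch below builds the Vandermonde matrix for the time points of the longitudinal example and compares it with a matrix assembled directly from the formula. The variable names are illustrative.

```matlab
% Time points from the longitudinal example.
a = [0 2 4 6 8]';
n = numel(a);

% vander returns V with V(i,j) = a(i)^(n-j), so powers decrease left to right.
V = vander(a);

% Rebuild the same matrix directly from the definition.
[I,J] = ndgrid(1:n,1:n);
Vdef = a(I).^(n - J);

% Largest elementwise difference between the two constructions.
maxDiff = max(abs(V(:) - Vdef(:)));
disp(maxDiff)
```

The last column of `V` is all ones (power 0), and the second-to-last column is `a` itself (power 1).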

### QR Factorization

QR factorization of an *m*-by-*n* matrix *A* is the factorization of that matrix into the product *A* = *Q*\**R*, where *R* is an *m*-by-*n* upper triangular matrix and *Q* is an *m*-by-*m* unitary matrix.
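The following sketch ties the two definitions together for the time points of the longitudinal example, using MATLAB's `qr`. Flipping the Vandermonde columns so that the constant column comes first is an assumption made here, so that successive columns of *Q* correspond to increasing polynomial degree. The exact column order and signs that `anova` uses internally are not documented on this page.

```matlab
Time = [0 2 4 6 8]';

% Assumed ordering: columns are Time.^0, Time.^1, ..., Time.^4.
V = fliplr(vander(Time));

% Full QR factorization: Q is 5-by-5 and orthogonal, R is upper triangular.
[Q,R] = qr(V);

reconErr = norm(Q*R - V);           % how closely Q*R reproduces V
orthErr  = norm(Q'*Q - eye(5));     % how close Q is to orthogonal
lowerR   = norm(tril(R,-1));        % size of the part of R below the diagonal

% The second column of Q is, up to sign and scale, the centered Time vector,
% which corresponds to the "slope of centered T" response.
centered = Time - mean(Time);
alignment = abs(Q(:,2)'*centered)/norm(centered);   % 1 means parallel

fprintf('reconErr = %.3g, orthErr = %.3g, alignment = %.6f\n', ...
    reconErr,orthErr,alignment)
```

Because each column of `Q` is orthogonal to all earlier columns, column *k* carries only the degree *k* - 1 component that the lower-degree columns cannot explain, which is the property that makes these columns usable as orthogonal polynomial contrasts.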

## Version History

**Introduced in R2014a**
