Compute Measures of Dispersion

The measures of dispersion summarize how spread out (or scattered) the data values are on the number line. MuPAD® provides the following functions for calculating the measures of dispersion. These functions describe the deviation from the arithmetic average (mean) of a data sample:

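  • The stats::variance function calculates the variance

    $\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2$, where $\bar{x}$ is the arithmetic mean of the data sample $x_1, x_2, \ldots, x_n$.

  • The stats::stdev function calculates the standard deviation

    $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2}$, where $\bar{x}$ is the arithmetic mean of the data sample $x_1, x_2, \ldots, x_n$.

  • The stats::meandev function calculates the mean deviation

    $\frac{1}{n}\sum_{i=1}^{n}\left|x_i-\bar{x}\right|$, where $\bar{x}$ is the arithmetic mean of the data sample $x_1, x_2, \ldots, x_n$.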
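For readers who want to check the three definitions outside MuPAD, the following is a minimal Python sketch written directly from the formulas. It assumes the sample (n - 1) normalization for the variance and the standard deviation and the 1/n normalization for the mean deviation. The function names `sample_variance`, `sample_stdev`, and `mean_deviation` are hypothetical helpers chosen for this illustration, not part of MuPAD.

```python
import math


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed normalization)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_stdev(data):
    """Square root of the sample variance, in the same units as the data."""
    return math.sqrt(sample_variance(data))


def mean_deviation(data):
    """Average absolute deviation from the mean, divided by n."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n


# Illustrative data only; any list of two or more numbers works.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(sample), sample_stdev(sample), mean_deviation(sample))
```

Because the variance squares each deviation while the mean deviation only takes its absolute value, a single distant value contributes far more to the first two measures than to the third.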

The standard deviation and the variance are popular measures of dispersion. The standard deviation is the square root of the variance and has the desirable property of being in the same units as the data. For example, if the data is in meters, the standard deviation is also in meters. Both the standard deviation and the variance are sensitive to outliers. A single data value that lies far from the body of the data can increase the value of these statistics by an arbitrarily large amount. For example, compute the variance and the standard deviation of the list L that contains one outlier:

L := [1, 1, 1, 1, 1, 1, 1, 1, 100.0]:
variance = stats::variance(L);
stdev = stats::stdev(L)
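The values for L can be worked out by hand as a check. This is a sketch that assumes stats::variance and stats::stdev use the sample normalization with n - 1 in the denominator; MuPAD's floating-point results should agree with it up to rounding.

```latex
\bar{x} = \frac{8 \cdot 1 + 100}{9} = 12
\qquad
\sum_{i=1}^{9} (x_i - \bar{x})^2 = 8\,(1 - 12)^2 + (100 - 12)^2 = 968 + 7744 = 8712

\text{variance} = \frac{8712}{9 - 1} = 1089
\qquad
\text{standard deviation} = \sqrt{1089} = 33
```

The outlier alone contributes 7744 of the 8712 in the sum of squares, which is why it dominates both statistics.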

The mean deviation is also sensitive to outliers. Nevertheless, the large outlier in the list L affects the mean deviation less than it affects the variance and the standard deviation:

meandev = stats::meandev(L)
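As a hand check, assuming the mean deviation is the average absolute deviation with n in the denominator, and using the mean of L, which is 108/9 = 12:

```latex
\text{mean deviation}
  = \frac{8\,\lvert 1 - 12 \rvert + \lvert 100 - 12 \rvert}{9}
  = \frac{88 + 88}{9}
  = \frac{176}{9} \approx 19.56
```

Here the outlier accounts for only half of the total absolute deviation, because its deviation enters linearly rather than squared.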

Now, compute the variance, the standard deviation, and the mean deviation of the list S that contains one small outlier. Again, the mean deviation is less sensitive to the outlier than the other two measures:

S := [100, 100, 100, 100, 100, 100, 100, 100, 1.0]:
variance = stats::variance(S);
stdev = stats::stdev(S);
meandev = stats::meandev(S)
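A hand computation for S, under the same assumptions (n - 1 for the variance and the standard deviation, n for the mean deviation), shows that the deviations have the same magnitudes as a list with eight values 11 below the mean and one value 88 above it, only with the signs reversed:

```latex
\bar{x} = \frac{8 \cdot 100 + 1}{9} = 89
\qquad
x_i - \bar{x} = 11 \ \text{(eight times)}, \quad -88 \ \text{(once)}

\text{variance} = \frac{8 \cdot 11^2 + 88^2}{8} = 1089
\qquad
\text{standard deviation} = 33
\qquad
\text{mean deviation} = \frac{8 \cdot 11 + 88}{9} = \frac{176}{9} \approx 19.56
```

All three measures depend only on the sizes of the deviations from the mean, not on their direction, so a small outlier distorts them just as much as a large one at the same distance.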
