Compute Covariance and Correlation

If you have two or more data samples with an equal number of elements, you can estimate how similar these data samples are. The most common measures of similarity of two data samples are the covariance and the correlation. MuPAD® provides the following functions for computing the covariance and the correlation of two data samples:

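  • The stats::covariance function calculates the covariance

    $$\operatorname{cov}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

    Here $\bar{x}$ is the arithmetic average of the data sample $x_1, x_2, \ldots, x_n$, and $\bar{y}$ is the arithmetic average of the data sample $y_1, y_2, \ldots, y_n$.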

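  • The stats::correlation function calculates the linear (Bravais-Pearson) correlation coefficient

    $$r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

    Here $\bar{x}$ is the arithmetic average of the data sample $x_1, x_2, \ldots, x_n$, and $\bar{y}$ is the arithmetic average of the data sample $y_1, y_2, \ldots, y_n$.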

Create the lists x and y:

x := [1, 1, 0.1]:
y := [1, 2, 0.1]:

To estimate the similarity of these lists, compute their covariance. For completely uncorrelated (nonsimilar) data, the covariance is close to 0. A positive covariance indicates that the data change in the same direction (they increase or decrease together). A negative covariance indicates that the data change in opposite directions. There are two common definitions of the covariance. By default, the stats::covariance function uses the definition with the divisor n - 1. To switch to the alternative definition, which uses the divisor n, use the Population option:

stats::covariance(x, y),
stats::covariance(x, y, Population)
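
As a cross-check outside MuPAD, the two definitions can be sketched in plain Python. This is a minimal illustration of the formulas, not the stats::covariance implementation; the helper name covariance and its population flag are assumptions introduced here.

```python
def covariance(x, y, population=False):
    """Covariance of two equal-length samples (hypothetical helper).

    population=False divides by n - 1 (the default definition);
    population=True divides by n (the alternative definition).
    """
    if len(x) != len(y):
        raise ValueError("samples must have the same number of elements")
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of products of deviations from the two arithmetic averages.
    total = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return total / (n if population else n - 1)


x = [1, 1, 0.1]
y = [1, 2, 0.1]
sample_cov = covariance(x, y)                       # divisor n - 1
population_cov = covariance(x, y, population=True)  # divisor n
print(sample_cov, population_cov)  # roughly 0.42 and 0.28
```

For these lists the sum of deviation products is 0.84, so the two definitions differ only in whether that sum is divided by 2 or by 3.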

The covariance of a data sample with itself is the variance of that data sample:

stats::covariance(x, x) = stats::variance(x)
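
The same identity can be checked numerically in plain Python. This is a sketch that assumes the n - 1 divisor for both quantities; the helper name covariance is hypothetical, while statistics.variance is the standard library's sample variance.

```python
import math
import statistics


def covariance(x, y):
    """Covariance with divisor n - 1 (hypothetical helper)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)


x = [1, 1, 0.1]
cov_xx = covariance(x, x)
var_x = statistics.variance(x)  # sample variance, divisor n - 1
# Compare with a tolerance because both values are floating-point results.
print(math.isclose(cov_xx, var_x))
```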

The correlation of data samples indicates the degree of similarity of these data samples. For completely uncorrelated data, the correlation (like the covariance) is close to 0. For correlated data that change in the same direction, the correlation is close to 1. For correlated data that change in opposite directions, the correlation is close to -1. Compute the correlation of x and y, of x with itself, and of x with -x:

stats::correlation(x, y),
stats::correlation(x, x),
stats::correlation(x, -x)
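
The three cases above can be reproduced with a short Python sketch of the Bravais-Pearson formula. It is an illustration under the stated definition, not the stats::correlation implementation; the helper name correlation is an assumption.

```python
import math


def correlation(x, y):
    """Linear (Bravais-Pearson) correlation coefficient (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("samples must have the same number of elements")
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    # Undefined (division by zero) if either sample is constant.
    return sxy / math.sqrt(sxx * syy)


x = [1, 1, 0.1]
y = [1, 2, 0.1]
neg_x = [-xi for xi in x]

r_xy = correlation(x, y)         # roughly 0.85
r_xx = correlation(x, x)         # 1, up to rounding
r_xnegx = correlation(x, neg_x)  # -1, up to rounding
print(r_xy, r_xx, r_xnegx)
```

A sample correlates perfectly with itself and perfectly negatively with its negation, which is why the last two values sit at the ends of the range from -1 to 1.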
