corrplot

Plot variable correlations

Syntax

corrplot(X)
corrplot(X,Name,Value)
R = corrplot(___)
[R,PValue] = corrplot(___)

Description

corrplot(X) creates a matrix of plots showing correlations among pairs of variables in X. Histograms of the variables appear along the matrix diagonal; scatter plots of variable pairs appear off diagonal. The slopes of the least-squares reference lines in the scatter plots are equal to the displayed correlation coefficients.

corrplot(X,Name,Value) uses additional options specified by one or more Name,Value pair arguments.

R = corrplot(___) returns the correlation matrix of X displayed in the plots. You can use any of the previous input arguments.

[R,PValue] = corrplot(___) additionally returns the p-values corresponding to the elements of R, used to test the null hypothesis of no correlation against the alternative of a nonzero correlation.

Examples

Plot Pearson's Correlation Coefficients

Plot correlations between multiple time series.

Load data on Canadian inflation and interest rates.

load Data_Canada

Plot the Pearson's linear correlation coefficients between all pairs of variables.

corrplot(DataTable)

The correlation plot shows that the short-term, medium-term, and long-term interest rates are highly correlated.

To identify the timestamp of a data point, enter gname(dates) in the Command Window. The software displays an interactive crosshair over the plot; click a point with the crosshair to label it with its timestamp.

Plot and Test Kendall's Rank Correlation Coefficients

Plot Kendall's rank correlations between multiple time series. Conduct a hypothesis test to determine which correlations are significantly different from zero.

Load data on Canadian inflation and interest rates.

load Data_Canada

Plot the Kendall's rank correlation coefficients between all pairs of variables. Specify a hypothesis test to determine which correlations are significantly different from zero.

corrplot(DataTable,'type','Kendall','testR','on')

The correlation coefficients highlighted in red indicate which pairs of variables have correlations significantly different from zero. For these time series, all pairs of variables have correlations significantly different from zero.

Conduct Right-Tailed Correlation Tests

Test for correlations greater than zero between multiple time series.

Load data on Canadian inflation and interest rates.

load Data_Canada

Return the pairwise Pearson's correlations and corresponding p-values for testing the null hypothesis of no correlation against the right-tailed alternative that the correlations are greater than zero.

[R,PValue] = corrplot(DataTable,'tail','right');
PValue
PValue =

    1.0000    0.0000    0.0000    0.0000    0.0000
    0.0000    1.0000    0.0000    0.0000    0.0001
    0.0000    0.0000    1.0000    0.0000    0.0000
    0.0000    0.0000    0.0000    1.0000    0.0000
    0.0000    0.0001    0.0000    0.0000    1.0000

The output PValue has pairwise p-values all less than the default 0.05 significance level, indicating that all pairs of variables have correlation significantly greater than zero.

Input Arguments

X — Data series
numeric matrix | tabular array

Data series that corrplot uses to plot correlations, specified as a numObs-by-numVars numeric matrix or tabular array. X consists of numObs observations made on numVars variables; corrplot plots the correlations between the numVars variables.

If X is a tabular array, then the variables must be numeric.

Data Types: double | table
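
As a minimal sketch of the matrix form of X, the following builds a synthetic numObs-by-numVars matrix and passes it to corrplot. The data, the seed, and the sizes are illustrative assumptions and are not part of the examples above.

```matlab
rng(1);                         % fixed seed (illustrative choice)
numObs = 100;                   % number of observations (rows)
numVars = 3;                    % number of variables (columns)
X = randn(numObs,numVars);      % synthetic data, one column per variable
X(:,2) = X(:,1) + 0.5*X(:,2);   % make variables 1 and 2 correlated

R = corrplot(X);                % draw the plot matrix and return the correlations
size(R)                         % one row and one column per variable
```

Because X is a plain matrix here, the plots use the default variable names rather than names taken from a table.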

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'tail','right','alpha',0.1 specifies right-tailed tests at the 0.1 significance level.
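
The name-value syntax can be sketched as follows, combining several of the options documented below in one call. This assumes the Data_Canada data set used in the examples above is available; the particular combination of options is only an illustration.

```matlab
load Data_Canada

% Right-tailed tests at the 0.1 level, with significant correlations
% highlighted in the plot. The pairs can appear in any order.
[R,PValue] = corrplot(DataTable,'tail','right','alpha',0.1,'testR','on');
```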

'type' — Correlation coefficient
'Pearson' (default) | 'Kendall' | 'Spearman'

Correlation coefficient to compute, specified as the comma-separated pair consisting of 'type' and one of the following:

'Pearson'     Pearson's linear correlation coefficient
'Kendall'     Kendall's rank correlation coefficient (τ)
'Spearman'    Spearman's rank correlation coefficient (ρ)

Example: 'type','Kendall'

'rows' — Option for handling rows with NaN values
'pairwise' (default) | 'all' | 'complete'

Option for handling rows with NaN values, specified as the comma-separated pair consisting of 'rows' and one of the following:

'all'         Use all rows, regardless of NaNs.
'complete'    Use only rows with no NaNs.
'pairwise'    Use rows with no NaNs in column i or j to compute R(i,j).

Example: 'rows','complete'
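
To see how the 'rows' option changes which observations are used, the sketch below places two NaN values in a synthetic matrix and computes the correlations both ways. The data and the positions of the NaN values are assumptions made only for illustration.

```matlab
rng(2);                  % fixed seed (illustrative choice)
X = randn(60,3);         % synthetic data: 60 observations, 3 variables
X(5,1) = NaN;            % missing value in variable 1
X(10,2) = NaN;           % missing value in variable 2

% Pairwise: R(1,3) drops only row 5, and R(2,3) drops only row 10.
Rpair = corrplot(X,'rows','pairwise');

figure                   % open a separate window for the second plot
% Complete: every entry of R drops both row 5 and row 10.
Rcomp = corrplot(X,'rows','complete');
```

The two matrices generally differ slightly, because 'complete' discards rows that 'pairwise' can still use for some pairs.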

'tail' — Alternative hypothesis
'both' (default) | 'right' | 'left'

Alternative hypothesis (Ha) used to compute the p-values, specified as the comma-separated pair consisting of 'tail' and one of the following:

'both'     Ha: Correlation is not zero.
'right'    Ha: Correlation is greater than zero.
'left'     Ha: Correlation is less than zero.

Example: 'tail','left'

'varNames' — Variable names
cell array of strings

Variable names to be used in the plots, specified as the comma-separated pair consisting of 'varNames' and a cell array of strings with numVars names. All variable names are truncated to the first five characters.

  • If X is a matrix, then the default variable names are {'var1','var2',...}.

  • If X is a tabular array, then the default variable names are X.Properties.VariableNames.

Example: 'varNames',{'CPF','AGE','BBD'}
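
A short sketch of supplying variable names for a matrix input follows. The data are synthetic and illustrative; the names are the ones from the example line above.

```matlab
rng(3);                          % fixed seed (illustrative choice)
X = randn(80,3);                 % synthetic data with three variables
names = {'CPF','AGE','BBD'};     % one name per column of X

corrplot(X,'varNames',names)     % label the plots with the given names
```

The cell array needs one entry per column of X, and longer names would be truncated to five characters as described above.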

'testR' — Significance tests indicator
'off' (default) | 'on'

Indicator for whether to test for significant correlations, specified as the comma-separated pair consisting of 'testR' and one of 'off' or 'on'. If you specify the value 'on', significant correlations are highlighted in red in the correlation matrix plot.

Example: 'testR','on'

'alpha' — Significance level
0.05 (default) | scalar between 0 and 1

Significance level for tests of correlation, specified as the comma-separated pair consisting of 'alpha' and a scalar between 0 and 1.

Example: 'alpha',0.01

Output Arguments

R — Correlations
matrix

Correlations between pairs of variables in X that are displayed in the plots, returned as a numVars-by-numVars matrix.

PValue — p-values
matrix

p-values corresponding to significance tests on the elements of R, returned as a numVars-by-numVars matrix. The p-values are used to test the null hypothesis of no correlation against the alternative specified by 'tail' (nonzero correlation by default).
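
One way to use the two outputs together is to list the variable pairs whose p-values fall below a chosen significance level. This is a sketch, not part of corrplot itself: the table is first converted to a numeric matrix so that the outputs are plain matrices, and the names alpha, isSig, and sigPairs are introduced here for illustration.

```matlab
load Data_Canada
X = DataTable{:,:};                        % numeric matrix of the table's variables
[R,PValue] = corrplot(X,'tail','right');   % right-tailed tests, as in the example above

alpha = 0.05;                              % significance level (the default value)
isSig = PValue < alpha;                    % true where the null is rejected
[i,j] = find(triu(isSig,1));               % upper triangle only: each pair once
sigPairs = [i j]                           % row k holds the column indices of pair k
```

Using triu(isSig,1) skips the diagonal and avoids reporting each pair twice, since both R and PValue are symmetric.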

More About

Tips

  • The option 'rows','pairwise', which is the default, can return a correlation matrix that is not positive definite. The 'complete' option always returns a positive-definite matrix, but in general the estimates are based on fewer observations.

  • Use gname to identify points in the plots.
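
The first tip can be checked numerically. The sketch below uses corr from Statistics and Machine Learning Toolbox directly, on the assumption that its 'rows' option behaves the same way as the corrplot option of the same name; the data and the number of missing entries are illustrative.

```matlab
rng(4);                                  % fixed seed (illustrative choice)
X = randn(30,4);                         % synthetic data: 30 observations, 4 variables
X(randperm(numel(X),20)) = NaN;          % remove 20 entries at random

Rpair = corr(X,'rows','pairwise');       % each entry uses its own set of rows
Rcomp = corr(X,'rows','complete');       % all entries use the same complete rows

% A negative smallest eigenvalue means the matrix is not positive definite.
% Symmetrizing first guards against round-off asymmetry.
minEigPair = min(eig((Rpair + Rpair')/2))
minEigComp = min(eig((Rcomp + Rcomp')/2))

nComplete = sum(all(~isnan(X),2))        % rows that survive under 'complete'
```

Whether the pairwise matrix actually fails the check depends on the data; the complete-rows matrix passes it here, at the cost of using only nComplete of the 30 observations.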

Algorithms

The software computes:

  • p-values for Pearson's correlation by transforming the correlation to create a t-statistic with numObs – 2 degrees of freedom. The transformation is exact when X is normal.

  • p-values for Kendall's and Spearman's rank correlations using either the exact permutation distributions (for small sample sizes) or large-sample approximations.

  • p-values for two-tailed tests by doubling the more significant of the two one-tailed p-values.
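
The Pearson case can be sketched by hand as follows. This illustrates the transformation described above and is not the corrplot source code; it assumes tcdf and corr from Statistics and Machine Learning Toolbox, and the data are synthetic.

```matlab
rng(0);                              % fixed seed (illustrative choice)
n = 50;                              % number of observations (numObs)
x = randn(n,1);
y = 0.5*x + randn(n,1);              % synthetic series correlated with x

r = corr(x,y);                       % Pearson's correlation coefficient
t = r*sqrt((n - 2)/(1 - r^2));       % t-statistic with n - 2 degrees of freedom

pRight = tcdf(-t,n - 2);             % Ha: correlation is greater than zero
pLeft  = tcdf(t,n - 2);              % Ha: correlation is less than zero
pBoth  = 2*min(pLeft,pRight);        % double the more significant one-tailed value

[~,pRef] = corr(x,y);                % two-tailed p-value from corr, for comparison
abs(pBoth - pRef)                    % difference between the two values
```

The two one-tailed p-values sum to 1, so doubling the smaller of them gives the two-tailed value.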
