Syntax
err = cvshrink(obj)
[err,gamma] = cvshrink(obj)
[err,gamma,delta] = cvshrink(obj)
[err,gamma,delta,numpred] = cvshrink(obj)
[err,...] = cvshrink(obj,Name,Value)
Description
err = cvshrink(obj) returns a vector of cross-validated classification error values for differing values of the regularization parameter Gamma.
[err,gamma] = cvshrink(obj) also returns the vector of Gamma values.
[err,gamma,delta] = cvshrink(obj) also returns the vector or matrix of Delta values.
[err,gamma,delta,numpred] = cvshrink(obj) also returns the number of nonzero predictors for each setting of the parameters Gamma and Delta.
[err,...] = cvshrink(obj,Name,Value) cross-validates with additional options specified by one or more Name,Value pair arguments.
Tips
Examine the err and numpred outputs
to see the tradeoff between cross-validated error and number of predictors.
When you find a satisfactory point, set the corresponding Gamma and Delta properties
in the model using dot notation. For example, if (i,j) is
the location of the satisfactory point, set
obj.Gamma = gamma(i);
obj.Delta = delta(i,j);
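One way to locate a satisfactory point can be sketched as follows: among the grid points that use at most a chosen number of predictors, take the one with the lowest cross-validated error. The fisheriris data set, the grid sizes, and the cap maxPred are illustrative assumptions, not part of the original tip.

```matlab
% Illustrative sketch: the data set, grid sizes, and maxPred are assumptions.
load fisheriris                          % provides meas (predictors) and species (labels)
obj = fitcdiscr(meas,species);           % linear discriminant classifier
rng('default')                           % reproducible cross-validation partition
[err,gamma,delta,numpred] = cvshrink(obj,'NumGamma',4,'NumDelta',4);

maxPred = 3;                             % hypothetical cap on the number of predictors
candidateErr = err;
candidateErr(numpred > maxPred) = Inf;   % exclude points that use too many predictors
[~,idx] = min(candidateErr(:));          % linear index of the best remaining point
[i,j] = ind2sub(size(err),idx);          % convert to the (i,j) grid location

obj.Gamma = gamma(i);                    % apply the selected regularization
obj.Delta = delta(i,j);
```

Other selection rules, such as accepting any error within a tolerance of the minimum and then minimizing numpred, fit the same pattern.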
Input Arguments
obj
Discriminant analysis classifier, produced using fitcdiscr.

Name-Value Pair Arguments
Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument
name and Value is the corresponding
value. Name must appear
inside single quotes (' ').
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN.
'delta' 
Scalar delta: cvshrink uses this value of delta with every value of gamma for regularization.
Row vector delta: For each i and j, cvshrink uses delta(j) with gamma(i) for regularization.
Matrix delta: The number of rows of delta must equal the number of elements in gamma. For each i and j, cvshrink uses delta(i,j) with gamma(i) for regularization.
Default: 0 
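The three forms of delta can be compared with a short sketch. The fisheriris data set and the specific gamma and delta values are assumptions chosen only for illustration.

```matlab
% Illustrative sketch: the data set and the gamma/delta values are assumptions.
load fisheriris
obj = fitcdiscr(meas,species);
g = [0 0.5 1];                                          % three Gamma values

rng('default')
errScalar = cvshrink(obj,'gamma',g,'delta',0.1);        % one delta used with every gamma(i)

rng('default')
errRow = cvshrink(obj,'gamma',g,'delta',[0 0.1 0.2]);   % delta(j) used with every gamma(i)

rng('default')
dMat = [0 0.1; 0 0.2; 0 0.3];                           % one row per element of g
errMat = cvshrink(obj,'gamma',g,'delta',dMat);          % delta(i,j) used with gamma(i)

% Inspect how the shape of err follows the shape of delta
disp(size(errScalar))
disp(size(errRow))
disp(size(errMat))
```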
'gamma' 
Vector of Gamma values for cross-validation.
Default: 0:0.1:1 
'NumDelta' 
Number of Delta intervals for cross-validation. For every value
of Gamma, cvshrink cross-validates the discriminant
using NumDelta + 1 values
of Delta, uniformly spaced from zero to the maximal Delta at which
all predictors are eliminated for this value of Gamma. If you set delta, cvshrink ignores NumDelta.
Default: 0 
'NumGamma' 
Number of Gamma intervals for cross-validation. cvshrink cross-validates
the discriminant using NumGamma + 1 values
of Gamma, uniformly spaced from MinGamma to 1.
If you set gamma, cvshrink ignores NumGamma.
Default: 10 
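As a rough check of how NumGamma and NumDelta determine the search grid, the following sketch requests a small grid and inspects the returned values. The fisheriris data set and the grid sizes are assumptions for illustration.

```matlab
% Illustrative sketch: the data set and grid sizes are assumptions.
load fisheriris
obj = fitcdiscr(meas,species);
rng('default')                         % reproducible cross-validation partition
[err,gamma,delta] = cvshrink(obj,'NumGamma',4,'NumDelta',3);

disp(numel(gamma))                     % per the description, NumGamma + 1 values
disp(size(delta))                      % one row of Delta values per Gamma value
disp([min(gamma) obj.MinGamma])        % the Gamma grid should start at MinGamma
disp(delta(:,1)')                      % each row of Delta should start at zero
```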
'verbose' 
Verbosity level, an integer from 0 to 2.
Higher values give more progress messages.
Default: 0 
Output Arguments
err 
Numeric vector or matrix of errors. err is
the misclassification error rate, meaning the average fraction of
misclassified data over all folds.
If delta is a scalar (default), err(i) is the misclassification error rate for obj regularized with gamma(i).
If delta is a vector, err(i,j) is the misclassification error rate for obj regularized with gamma(i) and delta(j).
If delta is a matrix, err(i,j) is the misclassification error rate for obj regularized with gamma(i) and delta(i,j).

gamma 
Vector of Gamma values used for regularization. See Gamma and Delta.

delta 
Vector or matrix of Delta values used for regularization. See Gamma and Delta.
If you give a scalar for the delta name-value pair, the output delta is a row vector the same size as gamma, with entries equal to the input scalar.
If you give a row vector for the delta name-value pair, the output delta is a matrix with the same number of columns as the row vector, and with the number of rows equal to the number of elements of gamma. The output delta(i,j) is equal to the input delta(j).
If you give a matrix for the delta name-value pair, the output delta is the same as the input matrix. The number of rows of delta must equal the number of elements in gamma.

numpred 
Numeric vector or matrix containing the number of predictors
in the model at various regularizations. numpred has
the same size as err.
If delta is a scalar (default), numpred(i) is the number of predictors for obj regularized with gamma(i) and delta.
If delta is a vector, numpred(i,j) is the number of predictors for obj regularized with gamma(i) and delta(j).
If delta is a matrix, numpred(i,j) is the number of predictors for obj regularized with gamma(i) and delta(i,j).

Definitions
Gamma and Delta
Regularization is the process of finding a small set of predictors
that yield an effective predictive model. For linear discriminant
analysis, there are two parameters, γ and δ,
that control regularization as follows. cvshrink helps
you select appropriate values of the parameters.
Let Σ represent the covariance matrix of the data X,
and let $\hat{X}$ be the centered data (the data X minus
the mean by class). Define
$D = \operatorname{diag}(\hat{X}^T \hat{X})$.
The regularized covariance matrix $\tilde{\Sigma}$ is
$\tilde{\Sigma} = (1 - \gamma)\Sigma + \gamma D$.
Whenever γ ≥ MinGamma, $\tilde{\Sigma}$ is nonsingular.
Let μ_{k} be the
mean vector for those elements of X in class k,
and let μ_{0} be the
global mean vector (the mean of the rows of X).
Let C be the correlation matrix of the data X,
and let $\tilde{C}$ be the regularized correlation matrix:
$\tilde{C} = (1 - \gamma)C + \gamma I$,
where I is the identity matrix.
The linear term in the regularized discriminant analysis classifier
for a data point x is
$(x - \mu_0)^T \tilde{\Sigma}^{-1} (\mu_k - \mu_0) = [(x - \mu_0)^T D^{-1/2}] [\tilde{C}^{-1} D^{-1/2} (\mu_k - \mu_0)]$.
The parameter δ enters into this equation
as a threshold on the final term in square brackets. Each component
of the vector
$[\tilde{C}^{-1} D^{-1/2} (\mu_k - \mu_0)]$
is set to zero
if it is smaller in magnitude than the threshold δ.
Therefore, for class k, if component j is
thresholded to zero, component j of x does
not enter into the evaluation of the posterior probability.
The DeltaPredictor property is a vector related
to this threshold. When δ ≥ DeltaPredictor(i), all classes k have
$|[\tilde{C}^{-1} D^{-1/2} (\mu_k - \mu_0)]_i| \le \delta$.
Therefore, when δ ≥ DeltaPredictor(i), the regularized
classifier does not use predictor i.
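The role of γ and δ can be illustrated numerically. The sketch below is not the toolbox's internal code: it assumes the fisheriris data set, hypothetical values gammaVal and deltaVal, and takes the diagonal scaling matrix to be the diagonal of the pooled covariance estimate (the centered cross-product divided by n - K) so that the two forms of the linear term agree exactly.

```matlab
% Illustrative sketch of the definitions above; normalization choices are assumptions.
load fisheriris
X = meas;
g = grp2idx(species);                     % class index 1..K for each row of X
[n,p] = size(X);
K = max(g);

mu0 = mean(X,1);                          % global mean vector
mu = zeros(K,p);
for k = 1:K
    mu(k,:) = mean(X(g == k,:),1);        % class mean vectors
end

Xhat = X - mu(g,:);                       % centered data (X minus the mean by class)
Sigma = (Xhat' * Xhat) / (n - K);         % pooled covariance estimate (assumed scaling)
D = diag(diag(Sigma));                    % diagonal part of the covariance
Dinvhalf = diag(1 ./ sqrt(diag(Sigma)));  % D^(-1/2)
C = Dinvhalf * Sigma * Dinvhalf;          % correlation matrix

gammaVal = 0.5;                           % hypothetical regularization values
deltaVal = 0.2;
SigmaTilde = (1 - gammaVal) * Sigma + gammaVal * D;   % regularized covariance
CTilde = (1 - gammaVal) * C + gammaVal * eye(p);      % regularized correlation

k = 1;                                    % class to evaluate
x = X(1,:);                               % one data point
lhs = (x - mu0) * (SigmaTilde \ (mu(k,:) - mu0)');
v = CTilde \ (Dinvhalf * (mu(k,:) - mu0)');           % final bracketed term
rhs = ((x - mu0) * Dinvhalf) * v;
disp(abs(lhs - rhs))                      % the two forms agree up to rounding

vThresh = v;
vThresh(abs(v) < deltaVal) = 0;           % components below the threshold are zeroed
disp(nnz(vThresh))                        % predictors still used for class k
```

A predictor drops out of the classifier only when its component is thresholded to zero for every class, which is the condition that DeltaPredictor summarizes.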
Examples
Regularize a discriminant analysis classifier,
and view the tradeoff between the number of predictors in the model
and the classification accuracy.
Create a linear discriminant analysis classifier for the ovariancancer data.
Set the SaveMemory and FillCoeffs options
to keep the resulting model reasonably small.
load ovariancancer
obj = fitcdiscr(obs,grp,...
'SaveMemory','on','FillCoeffs','off');
Use 10 levels of Gamma and 10 levels
of Delta to search for good parameters. This search
is time-consuming. Set Verbose to 1 to
view the progress.
rng('default') % for reproducibility
[err,gamma,delta,numpred] = cvshrink(obj,...
'NumGamma',9,'NumDelta',9,'Verbose',1);
Done building crossvalidated model.
Processing Gamma step 1 out of 10.
Processing Gamma step 2 out of 10.
Processing Gamma step 3 out of 10.
Processing Gamma step 4 out of 10.
Processing Gamma step 5 out of 10.
Processing Gamma step 6 out of 10.
Processing Gamma step 7 out of 10.
Processing Gamma step 8 out of 10.
Processing Gamma step 9 out of 10.
Processing Gamma step 10 out of 10.
Plot the classification error rate against the number
of predictors.
plot(err,numpred,'k.')
xlabel('Error rate');
ylabel('Number of predictors');
See Also
ClassificationDiscriminant  fitcdiscr