
# ClassificationDiscriminant.fit

Fit discriminant analysis classifier (to be removed)

`ClassificationDiscriminant.fit` will be removed in a future release. Use `fitcdiscr` instead.

## Syntax

`obj = ClassificationDiscriminant.fit(x,y)`

`obj = ClassificationDiscriminant.fit(x,y,Name,Value)`

## Description

`obj = ClassificationDiscriminant.fit(x,y)` returns a discriminant analysis classifier based on the input variables (also known as predictors, features, or attributes) `x` and output (response) `y`.

`obj = ClassificationDiscriminant.fit(x,y,Name,Value)` fits a classifier with additional options specified by one or more `Name,Value` pair arguments. If you use one of the following five options, `obj` is of class `ClassificationPartitionedModel`: `'CrossVal'`, `'KFold'`, `'Holdout'`, `'Leaveout'`, or `'CVPartition'`. Otherwise, `obj` is of class `ClassificationDiscriminant`.
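A minimal sketch of both call forms, using the `fisheriris` sample data set that ships with the product:

```matlab
% Sketch of both call forms on the fisheriris sample data.
load fisheriris

% Plain fit: returns a ClassificationDiscriminant object.
obj = ClassificationDiscriminant.fit(meas, species);

% Fit with a cross-validation option: returns a
% ClassificationPartitionedModel instead.
cvObj = ClassificationDiscriminant.fit(meas, species, 'KFold', 5);
```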

## Input Arguments


`x` — Predictor values, specified as a matrix of numeric values. Each column of `x` represents one variable, and each row represents one observation.

`ClassificationDiscriminant.fit` considers `NaN` values in `x` as missing values. `ClassificationDiscriminant.fit` does not use observations with missing values for `x` in the fit.

Data Types: `single` | `double`

`y` — Classification values, specified as a numeric vector, categorical vector (nominal or ordinal), logical vector, character array, or cell array of character vectors. Each row of `y` represents the classification of the corresponding row of `x`.

`ClassificationDiscriminant.fit` considers `NaN` values in `y` to be missing values. `ClassificationDiscriminant.fit` does not use observations with missing values for `y` in the fit.

Data Types: `single` | `double` | `logical` | `char` | `cell`

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside single quotes (`' '`). You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.


Class names, specified as the comma-separated pair consisting of `'ClassNames'` and an array. Use the data type that exists in `y`. The default is the class names that exist in `y`. Use `ClassNames` to order the classes or to select a subset of classes for training.

Data Types: `single` | `double` | `logical` | `char`

Cost of misclassification, specified as the comma-separated pair consisting of `'Cost'` and a square matrix, where `Cost(i,j)` is the cost of classifying a point into class `j` if its true class is `i`. Alternatively, `Cost` can be a structure `S` having two fields: `S.ClassNames` containing the group names as a variable of the same type as `y`, and `S.ClassificationCosts` containing the cost matrix.

The default is `Cost(i,j)=1` if `i~=j`, and `Cost(i,j)=0` if `i=j`.

Data Types: `single` | `double` | `struct`
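As a sketch, both the matrix and structure forms of `Cost` might look like this on the `fisheriris` data (the asymmetric costs here are purely illustrative):

```matlab
load fisheriris

% Hypothetical asymmetric cost matrix for the 3-class iris problem:
% misclassifying a true setosa costs twice as much as other errors.
C = ones(3) - eye(3);
C(1,2) = 2;  C(1,3) = 2;
obj = ClassificationDiscriminant.fit(meas, species, 'Cost', C);

% Equivalent structure form, pairing class names with the cost matrix.
S.ClassNames = {'setosa'; 'versicolor'; 'virginica'};
S.ClassificationCosts = C;
obj = ClassificationDiscriminant.fit(meas, species, 'Cost', S);
```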

Cross-validation flag, specified as the comma-separated pair consisting of `'CrossVal'` and `'on'` or `'off'`.

If you specify `'on'`, then the software implements 10-fold cross-validation.

To override this cross-validation setting, use one of these name-value pair arguments: `CVPartition`, `Holdout`, `KFold`, or `Leaveout`. To create a cross-validated model, you can use only one cross-validation name-value pair argument at a time.

Alternatively, cross-validate later by passing `obj` to `crossval`.

Example: `'CrossVal','on'`

Cross-validated model partition, specified as the comma-separated pair consisting of `'CVPartition'` and an object created using `cvpartition`.

To create a cross-validated model, you can use one of these four name-value pair arguments only: `CVPartition`, `Holdout`, `KFold`, or `Leaveout`.

Linear coefficient threshold, specified as the comma-separated pair consisting of `'Delta'` and a nonnegative scalar value. If a coefficient of `obj` has magnitude smaller than `Delta`, the software sets this coefficient to `0`, and you can eliminate the corresponding predictor from the model. Set `Delta` to a higher value to eliminate more predictors.

`Delta` must be `0` for quadratic discriminant models.

Data Types: `single` | `double`
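A minimal sketch of using `Delta` as a rough form of predictor elimination (the threshold value here is arbitrary):

```matlab
load fisheriris
% Linear coefficients with magnitude below 0.2 are set to exactly zero.
obj = ClassificationDiscriminant.fit(meas, species, 'Delta', 0.2);
% Inspect obj.Coeffs(i,j).Linear to see which predictors dropped out.
```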

Discriminant type, specified as the comma-separated pair consisting of `'DiscrimType'` and a character vector in this table.

| Value | Description | Predictor Covariance Treatment |
| --- | --- | --- |
| `'linear'` | Regularized linear discriminant analysis (LDA) | All classes have the same covariance matrix $\hat{\Sigma}_\gamma = (1-\gamma)\hat{\Sigma} + \gamma\,\mathrm{diag}(\hat{\Sigma})$, where $\hat{\Sigma}$ is the empirical, pooled covariance matrix and $\gamma$ is the amount of regularization. |
| `'diaglinear'` | LDA | All classes have the same, diagonal covariance matrix. |
| `'pseudolinear'` | LDA | All classes have the same covariance matrix. The software inverts the covariance matrix using the pseudoinverse. |
| `'quadratic'` | Quadratic discriminant analysis (QDA) | The covariance matrices can vary among classes. |
| `'diagquadratic'` | QDA | The covariance matrices are diagonal and can vary among classes. |
| `'pseudoquadratic'` | QDA | The covariance matrices can vary among classes. The software inverts the covariance matrix using the pseudoinverse. |

Note: To use regularization, you must specify `'linear'` for `DiscrimType`. To specify the amount of regularization, use the `Gamma` name-value pair argument.

Example: `'DiscrimType','quadratic'`
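A short sketch of fitting a quadratic discriminant on the `fisheriris` sample data:

```matlab
load fisheriris
% Quadratic discriminant: each class gets its own covariance matrix.
qObj = ClassificationDiscriminant.fit(meas, species, ...
    'DiscrimType', 'quadratic');
```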

`Coeffs` property flag, specified as the comma-separated pair consisting of `'FillCoeffs'` and `'on'` or `'off'`. Setting the flag to `'on'` populates the `Coeffs` property in the classifier object. This can be computationally intensive, especially when cross-validating. The default is `'on'`, unless you specify a cross-validation name-value pair argument, in which case the flag is set to `'off'` by default.

Example: `'FillCoeffs','off'`

Amount of regularization to apply when estimating the covariance matrix of the predictors, specified as the comma-separated pair consisting of `'Gamma'` and a scalar value in the interval [0,1]. `Gamma` provides finer control over the covariance matrix structure than `DiscrimType`.

• If you specify `0`, then the software does not use regularization to adjust the covariance matrix. That is, the software estimates and uses the unrestricted, empirical covariance matrix.

• For linear discriminant analysis, if the empirical covariance matrix is singular, then the software automatically applies the minimal regularization required to invert the covariance matrix. You can display the chosen regularization amount by entering `obj.Gamma` at the command line.

• For quadratic discriminant analysis, if at least one class has an empirical covariance matrix that is singular, then the software throws an error.

• If you specify a value in the interval (0,1), then you must perform linear discriminant analysis; otherwise, the software throws an error. Accordingly, the software sets `DiscrimType` to `'linear'`.

• If you specify `1`, then the software uses maximum regularization for covariance matrix estimation. That is, the software restricts the covariance matrix to be diagonal. Alternatively, you can set `DiscrimType` to `'diagLinear'` or `'diagQuadratic'` for diagonal covariance matrices.

Example: `'Gamma',1`

Data Types: `single` | `double`
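A minimal sketch of maximum regularization on the `fisheriris` sample data:

```matlab
load fisheriris
% Gamma = 1: the pooled covariance estimate is restricted to its
% diagonal (valid for linear discriminant types only).
obj = ClassificationDiscriminant.fit(meas, species, 'Gamma', 1);
obj.Gamma   % displays the regularization amount actually used
```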

Fraction of data used for holdout validation, specified as the comma-separated pair consisting of `'Holdout'` and a scalar value in the range (0,1). If you specify `'Holdout',p`, then the software:

1. Randomly reserves `p*100`% of the data as validation data, and trains the model using the rest of the data

2. Stores the compact, trained model in the `Trained` property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: `CVPartition`, `Holdout`, `KFold`, or `Leaveout`.

Example: `'Holdout',0.1`

Data Types: `double` | `single`
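As a sketch, holding out 10% of the `fisheriris` observations for validation:

```matlab
load fisheriris
% Reserve a random 10% of the data for validation.
cvObj = ClassificationDiscriminant.fit(meas, species, 'Holdout', 0.1);
% The compact model trained on the remaining 90% is in cvObj.Trained{1}.
```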

Number of folds to use in a cross-validated classifier, specified as the comma-separated pair consisting of `'KFold'` and a positive integer value greater than 1. If you specify `'KFold',k`, then the software:

1. Randomly partitions the data into `k` sets.

2. For each set, reserves the set as validation data, and trains the model using the other `k` – 1 sets.

3. Stores the `k` compact, trained models in the cells of a `k`-by-1 cell vector in the `Trained` property of the cross-validated model.

To create a cross-validated model, you can use one of these four options only: `CVPartition`, `Holdout`, `KFold`, or `Leaveout`.

Example: `'KFold',8`

Data Types: `single` | `double`
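A minimal sketch of 8-fold cross-validation on the `fisheriris` sample data, including an error estimate via `kfoldLoss`:

```matlab
load fisheriris
cvObj = ClassificationDiscriminant.fit(meas, species, 'KFold', 8);
% Estimate the generalization error from the 8 held-out folds.
err = kfoldLoss(cvObj);
```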

Leave-one-out cross-validation flag, specified as the comma-separated pair consisting of `'Leaveout'` and `'on'` or `'off'`. If you specify `'Leaveout','on'`, then, for each of the n observations, where n is `size(obj.X,1)`, the software:

1. Reserves the observation as validation data, and trains the model using the other n – 1 observations.

2. Stores the n compact, trained models in the cells of an n-by-1 cell vector in the `Trained` property of the cross-validated model.

To create a cross-validated model, you can use one of these four options only: `CVPartition`, `Holdout`, `KFold`, or `Leaveout`.

Example: `'Leaveout','on'`

Data Types: `char`

Predictor variable names, specified as the comma-separated pair consisting of `'PredictorNames'` and a cell array of character vectors containing the names for the predictor variables, in the order in which they appear in `X`.

Data Types: `cell`

Prior probabilities for each class, specified as the comma-separated pair consisting of `'Prior'` and one of the following.

• `'empirical'` determines class probabilities from class frequencies in `y`. If you pass observation weights, they are used to compute the class probabilities.

• `'uniform'` sets all class probabilities equal.

• A vector containing one scalar value for each class.

• A structure `S` with two fields:

• `S.ClassNames` containing the class names as a variable of the same type as `y`

• `S.ClassProbs` containing a vector of corresponding probabilities

Example: `'Prior','uniform'`

Data Types: `single` | `double` | `char` | `struct`
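As a sketch, the structure form of `Prior` on the `fisheriris` sample data (the probabilities here are purely illustrative):

```matlab
load fisheriris
% Structure form: class names paired with assumed prior probabilities.
S.ClassNames = {'setosa'; 'versicolor'; 'virginica'};
S.ClassProbs = [0.5; 0.25; 0.25];
obj = ClassificationDiscriminant.fit(meas, species, 'Prior', S);
```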

Response variable name, specified as the comma-separated pair consisting of `'ResponseName'` and a character vector containing the name of the response variable `y`.

Example: `'ResponseName','Response'`

Data Types: `char`

Flag to save covariance matrix, specified as the comma-separated pair consisting of `'SaveMemory'` and either `'on'` or `'off'`. If you specify `'on'`, then the software does not store the full covariance matrix, but instead stores enough information to compute the matrix. The `predict` method computes the full covariance matrix for prediction, and does not store the matrix. If you specify `'off'`, then the software computes and stores the full covariance matrix in `obj`.

Specify `SaveMemory` as `'on'` when the input matrix contains thousands of predictors.

Example: `'SaveMemory','on'`

Score transform function, specified as the comma-separated pair consisting of `'ScoreTransform'` and a function handle or value in this table.

| Value | Formula |
| --- | --- |
| `'doublelogit'` | 1/(1 + e^(–2x)) |
| `'invlogit'` | log(x / (1 – x)) |
| `'ismax'` | Sets the score for the class with the largest score to `1`, and the scores for all other classes to `0`. |
| `'logit'` | 1/(1 + e^(–x)) |
| `'none'` or `'identity'` | x (no transformation) |
| `'sign'` | –1 for x < 0; 0 for x = 0; 1 for x > 0 |
| `'symmetric'` | 2x – 1 |
| `'symmetriclogit'` | 2/(1 + e^(–x)) – 1 |
| `'symmetricismax'` | Sets the score for the class with the largest score to `1`, and the scores for all other classes to `-1`. |

For a MATLAB® function, or a function that you define, enter its function handle.

`obj.ScoreTransform = @function;`

`function` should accept a matrix (the original scores) and return a matrix of the same size (the transformed scores).

Example: `'ScoreTransform','logit'`

Data Types: `function_handle` | `char`
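A short sketch of both forms, using the `fisheriris` sample data; the custom handle here is an ordinary logistic squashing, chosen only for illustration:

```matlab
load fisheriris
obj = ClassificationDiscriminant.fit(meas, species, ...
    'ScoreTransform', 'logit');
% Or supply your own handle; it must map a score matrix to a
% same-sized matrix. Here, a plain logistic squashing:
obj.ScoreTransform = @(s) 1 ./ (1 + exp(-s));
```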

Observation weights, specified as the comma-separated pair consisting of `'Weights'` and a vector of scalar values. The length of `Weights` must equal the number of rows in `X`. The software normalizes the weights to sum to 1.

Data Types: `single` | `double`
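As a sketch, a hypothetical weighting scheme on the `fisheriris` sample data:

```matlab
load fisheriris
% Hypothetical weighting: count each virginica observation twice.
w = ones(size(meas,1), 1);
w(strcmp(species, 'virginica')) = 2;
obj = ClassificationDiscriminant.fit(meas, species, 'Weights', w);
```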

## Output Arguments


Discriminant analysis classifier, returned as a classifier object.

Note that using the `'CrossVal'`, `'KFold'`, `'Holdout'`, `'Leaveout'`, or `'CVPartition'` options results in a model of class `ClassificationPartitionedModel`. You cannot use a partitioned model for prediction directly, so this kind of object does not have a `predict` method.

Otherwise, `obj` is of class `ClassificationDiscriminant`, and you can use the `predict` method to predict the response of new data.
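A minimal sketch of fitting and predicting, using the `fisheriris` sample data (the new observation's values are illustrative):

```matlab
load fisheriris
obj = ClassificationDiscriminant.fit(meas, species);
% Predict the class of a new observation, with per-class scores.
[label, score] = predict(obj, [5.8 2.9 4.2 1.3]);
```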

## Definitions

### Discriminant Classification

The model for discriminant analysis is:

• Each class (`Y`) generates data (`X`) using a multivariate normal distribution. That is, the model assumes `X` has a Gaussian mixture distribution (`gmdistribution`).

• For linear discriminant analysis, the model has the same covariance matrix for each class, only the means vary.

• For quadratic discriminant analysis, both means and covariances of each class vary.

`predict` classifies so as to minimize the expected classification cost:

$$\hat{y} = \underset{y=1,\dots,K}{\arg\min}\; \sum_{k=1}^{K} \hat{P}(k \mid x)\, C(y \mid k),$$

where

• $\hat{y}$ is the predicted classification.

• K is the number of classes.

• $\hat{P}(k \mid x)$ is the posterior probability of class k for observation x.

• $C(y \mid k)$ is the cost of classifying an observation as y when its true class is k.

For details, see How the predict Method Classifies.
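The minimum-expected-cost rule above can be sketched in matrix form: for an n-by-K posterior matrix and a K-by-K cost matrix, the (i,y) entry of their product is the expected cost of predicting class y for observation i. The posterior values below are made up for illustration:

```matlab
% Minimal sketch of the minimum-expected-cost rule, assuming
% Phat(i,k) = posterior of class k for observation i, and
% C(k,y) = cost of predicting y when the true class is k.
Phat = [0.7 0.2 0.1; 0.1 0.3 0.6];   % 2 observations, 3 classes
C    = ones(3) - eye(3);             % 0-1 cost
expectedCost = Phat * C;             % (i,y) = expected cost of predicting y
[~, yhat] = min(expectedCost, [], 2); % yhat = [1; 3]
```

With 0-1 cost, the expected cost of a class is one minus its posterior, so this rule reduces to picking the class with the largest posterior probability.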

## Examples


Load Fisher's iris data set.

```matlab
load fisheriris
```

Construct a discriminant analysis classifier using the sample data.

```matlab
obj = ClassificationDiscriminant.fit(meas,species)
```

```
obj = 

  ClassificationDiscriminant
         PredictorNames: {'x1'  'x2'  'x3'  'x4'}
           ResponseName: 'Y'
             ClassNames: {'setosa'  'versicolor'  'virginica'}
         ScoreTransform: 'none'
        NumObservations: 150
            DiscrimType: 'linear'
                     Mu: [3x4 double]
                 Coeffs: [3x3 struct]
```

## Alternatives

The `classify` function also performs discriminant analysis. `classify` is usually more awkward to use:

• `classify` requires you to fit the classifier every time you make a new prediction.

• `classify` does not perform cross validation.

• `classify` requires you to fit the classifier when changing prior probabilities.