
fitcdiscr

Fit discriminant analysis classifier

Syntax

  • Mdl = fitcdiscr(Tbl,ResponseVarName)
  • Mdl = fitcdiscr(Tbl,formula)
  • Mdl = fitcdiscr(Tbl,Y)
  • Mdl = fitcdiscr(X,Y)
  • Mdl = fitcdiscr(___,Name,Value)

Description

Mdl = fitcdiscr(Tbl,ResponseVarName) returns a fitted discriminant analysis model based on the input variables (also known as predictors, features, or attributes) contained in the table Tbl and output (response or labels) contained in ResponseVarName.

Mdl = fitcdiscr(Tbl,formula) returns a fitted discriminant analysis model based on the input variables contained in the table Tbl. formula is an explanatory model of the response and a subset of predictor variables in Tbl used to fit Mdl.

Mdl = fitcdiscr(Tbl,Y) returns a fitted discriminant analysis model based on the input variables contained in the table Tbl and response Y.

Mdl = fitcdiscr(X,Y) returns a discriminant analysis classifier based on the input variables X and response Y.

Mdl = fitcdiscr(___,Name,Value) fits a classifier with additional options specified by one or more name-value pair arguments, using any of the previous syntaxes. For example, you can optimize hyperparameters to minimize the model's cross-validation loss, or specify the cost of misclassification, the prior probabilities for each class, or the observation weights.

Examples

Load Fisher's iris data set.

load fisheriris

Train a discriminant analysis model using the entire data set.

Mdl = fitcdiscr(meas,species)
Mdl = 

  ClassificationDiscriminant
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'setosa'  'versicolor'  'virginica'}
           ScoreTransform: 'none'
          NumObservations: 150
              DiscrimType: 'linear'
                       Mu: [3×4 double]
                   Coeffs: [3×3 struct]


Mdl is a ClassificationDiscriminant model. To access its properties, use dot notation. For example, display the group means for each predictor.

Mdl.Mu
ans =

    5.0060    3.4280    1.4620    0.2460
    5.9360    2.7700    4.2600    1.3260
    6.5880    2.9740    5.5520    2.0260

To predict labels for new observations, pass Mdl and predictor data to predict.
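For instance, continuing with the model trained above, a minimal sketch (reusing rows of meas as stand-ins for new observations):

```matlab
% Predict labels for the first three observations
labels = predict(Mdl, meas(1:3,:))
```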

This example shows how to optimize hyperparameters automatically using fitcdiscr. The example uses Fisher's iris data.

Load the data.

load fisheriris

Find hyperparameters that minimize five-fold cross-validation loss by using automatic hyperparameter optimization.

For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.

rng(1)
Mdl = fitcdiscr(meas,species,'OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',...
    struct('AcquisitionFunctionName','expected-improvement-plus'))
|=================================================================================================|
| Iter | Eval   | Objective  | Objective  | BestSoFar  | BestSoFar  |        Delta |        Gamma |
|      | result |            | runtime    | (observed) | (estim.)   |              |              |
|=================================================================================================|
|    1 | Best   |       0.04 |     6.9812 |       0.04 |       0.04 |   9.8475e-06 |      0.94133 |
|    2 | Accept |       0.04 |     2.4727 |       0.04 |       0.04 |      0.39336 |      0.94938 |
|    3 | Accept |   0.046667 |     1.5047 |       0.04 |   0.039998 |   0.00032175 |      0.53586 |
|    4 | Accept |       0.04 |     1.4135 |       0.04 |   0.041412 |       1.3006 |      0.46226 |
|    5 | Best   |   0.033333 |     1.5036 |   0.033333 |       0.04 |       1.4967 |      0.53372 |
|    6 | Accept |       0.04 |      1.344 |   0.033333 |       0.04 |   3.3864e-05 |      0.79533 |
|    7 | Accept |       0.06 |     1.4353 |   0.033333 |   0.042857 |       3.7209 |      0.84714 |
|    8 | Accept |   0.033333 |     1.3759 |   0.033333 |    0.03333 |       2.0873 |      0.53311 |
|    9 | Accept |    0.66667 |     1.1915 |   0.033333 |    0.11111 |       6.8972 |      0.55358 |
|   10 | Best   |   0.026667 |     1.3642 |   0.026667 |    0.10267 |    0.0020277 |      0.10686 |
|   11 | Accept |   0.046667 |      1.203 |   0.026667 |   0.097576 |     0.069484 |      0.64092 |
|   12 | Accept |   0.026667 |     1.5132 |   0.026667 |   0.091667 |       1.6561 |     0.032319 |
|   13 | Best   |       0.02 |     1.1429 |       0.02 |   0.086154 |      0.43465 |     0.019285 |
|   14 | Accept |   0.033333 |     1.2601 |       0.02 |   0.082381 |       3.6017 |      0.92783 |
|   15 | Accept |       0.04 |     1.7448 |       0.02 |   0.079556 |      0.56861 |      0.32796 |
|   16 | Accept |   0.046667 |     1.5929 |       0.02 |     0.0775 |     0.062482 |      0.45071 |
|   17 | Accept |   0.033333 |     1.4724 |       0.02 |   0.074902 |       1.0219 |      0.27807 |
|   18 | Accept |    0.66667 |      1.523 |       0.02 |   0.020264 |       85.872 |       0.2624 |
|   19 | Accept |   0.046667 |      1.314 |       0.02 |   0.020262 |   0.00084501 |      0.49636 |
|   20 | Accept |       0.04 |     1.2631 |       0.02 |   0.020256 |   0.00010171 |      0.94956 |
|=================================================================================================|
| Iter | Eval   | Objective  | Objective  | BestSoFar  | BestSoFar  |        Delta |        Gamma |
|      | result |            | runtime    | (observed) | (estim.)   |              |              |
|=================================================================================================|
|   21 | Accept |   0.046667 |     1.4348 |       0.02 |   0.020247 |    0.0054478 |      0.68698 |
|   22 | Accept |       0.02 |     1.6211 |       0.02 |   0.020008 |     0.016218 |    0.0095266 |
|   23 | Accept |       0.04 |     1.4904 |       0.02 |   0.020007 |   3.6005e-06 |      0.27716 |
|   24 | Accept |       0.04 |     1.5362 |       0.02 |   0.020007 |   1.2727e-06 |      0.95223 |
|   25 | Accept |    0.66667 |     1.3754 |       0.02 |   0.020064 |       997.38 |      0.44271 |
|   26 | Accept |       0.04 |     1.2032 |       0.02 |   0.020093 |     0.028087 |      0.99234 |
|   27 | Accept |       0.02 |     1.3256 |       0.02 |   0.020108 |      0.17018 |      0.04175 |
|   28 | Accept |   0.026667 |     1.5258 |       0.02 |   0.020122 |   1.8406e-05 |      0.13728 |
|   29 | Accept |       0.02 |     1.2007 |       0.02 |   0.020138 |     2.06e-06 |     0.015157 |
|   30 | Accept |       0.02 |     1.6354 |       0.02 |    0.02016 |   0.00018042 |     0.059692 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 276.698 seconds.
Total objective function evaluation time: 48.9646

Best observed feasible point:
     Delta      Gamma  
    _______    ________

    0.43465    0.019285

Observed objective function value = 0.02
Estimated objective function value = 0.02016
Function evaluation time = 1.1429

Best estimated feasible point (according to models):
     Delta       Gamma  
    ________    ________

    2.06e-06    0.015157

Estimated objective function value = 0.02016
Estimated function evaluation time = 1.5083


Mdl = 

  ClassificationDiscriminant
                         ResponseName: 'Y'
                CategoricalPredictors: []
                           ClassNames: {'setosa'  'versicolor'  'virginica'}
                       ScoreTransform: 'none'
                      NumObservations: 150
    HyperparameterOptimizationResults: [1×1 BayesianOptimization]
                          DiscrimType: 'linear'
                                   Mu: [3×4 double]
                               Coeffs: [3×3 struct]


The fit achieves about 2% loss for the default 5-fold cross-validation.

Input Arguments

Sample data used to train the model, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain one additional column for the response variable. Multi-column variables and cell arrays other than cell arrays of character vectors are not allowed.

If Tbl contains the response variable, and you want to use all remaining variables in Tbl as predictors, then specify the response variable using ResponseVarName.

If Tbl contains the response variable, and you want to use only a subset of the remaining variables in Tbl as predictors, then specify a formula using formula.

If Tbl does not contain the response variable, then specify a response variable using Y. The length of the response variable and the number of rows of Tbl must be equal.

Data Types: table

Response variable name, specified as the name of a variable in Tbl.

You must specify ResponseVarName as a character vector. For example, if the response variable Y is stored as Tbl.Y, then specify it as 'Y'. Otherwise, the software treats all columns of Tbl, including Y, as predictors when training the model.

The response variable must be a categorical or character array, logical or numeric vector, or cell array of character vectors. If Y is a character array, then each element must correspond to one row of the array.

It is good practice to specify the order of the classes using the ClassNames name-value pair argument.

Data Types: char

Explanatory model of the response and a subset of the predictor variables, specified as a character vector in the form of 'Y~X1+X2+X3'. In this form, Y represents the response variable, and X1, X2, and X3 represent the predictor variables. The variables must be variable names in Tbl (Tbl.Properties.VariableNames).

To specify a subset of variables in Tbl as predictors for training the model, use a formula. If you specify a formula, then the software does not use any variables in Tbl that do not appear in formula.

Data Types: char
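As a sketch of the formula syntax, using illustrative variable names (SL, SW, PL, PW, Species are assumptions introduced here, not part of the original data set):

```matlab
load fisheriris
Tbl = array2table(meas, 'VariableNames', {'SL','SW','PL','PW'});
Tbl.Species = species;
% Train using only two of the four predictor variables
Mdl = fitcdiscr(Tbl, 'Species ~ SL + SW');
```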

Class labels, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. Each row of Y represents the classification of the corresponding row of X.

The software considers NaN, '' (empty character vector), and <undefined> values in Y to be missing values. Consequently, the software does not train using observations with a missing response.

Data Types: single | double | logical | char | cell

Predictor values, specified as a numeric matrix. Each column of X represents one variable, and each row represents one observation.

fitcdiscr considers NaN values in X as missing values. fitcdiscr does not use observations with missing values for X in the fit.

Data Types: single | double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'DiscrimType','quadratic','SaveMemory','on' specifies a quadratic discriminant classifier and does not store the covariance matrix in the output object.

    Note:   You cannot use any cross-validation name-value pair along with OptimizeHyperparameters. You can modify the cross-validation for OptimizeHyperparameters only by using the HyperparameterOptimizationOptions name-value pair.

Model Parameters

Names of classes to use for training, specified as the comma-separated pair consisting of 'ClassNames' and a categorical or character array, logical or numeric vector, or cell array of character vectors. ClassNames must be the same data type as Y.

If ClassNames is a character array, then each element must correspond to one row of the array.

Use ClassNames to:

  • Order the classes during training.

  • Specify the order of any input or output argument dimension that corresponds to the class order. For example, use ClassNames to specify the order of the dimensions of Cost or the column order of classification scores returned by predict.

  • Select a subset of classes for training. For example, suppose that the set of all distinct class names in Y is {'a','b','c'}. To train the model using observations from classes 'a' and 'c' only, specify 'ClassNames',{'a','c'}.

The default is the set of all distinct class names in Y.

Example: 'ClassNames',{'b','g'}

Data Types: categorical | char | logical | single | double | cell
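For example, a sketch of training on a subset of the iris classes only:

```matlab
load fisheriris
% Only observations labeled 'setosa' or 'virginica' are used for training
Mdl = fitcdiscr(meas, species, 'ClassNames', {'setosa','virginica'});
```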

Cost of misclassification of a point, specified as the comma-separated pair consisting of 'Cost' and one of the following:

  • Square matrix, where Cost(i,j) is the cost of classifying a point into class j if its true class is i (i.e., the rows correspond to the true class and the columns correspond to the predicted class). To specify the class order for the corresponding rows and columns of Cost, additionally specify the ClassNames name-value pair argument.

  • Structure S having two fields: S.ClassNames containing the group names as a variable of the same type as Y, and S.ClassificationCosts containing the cost matrix.

The default is Cost(i,j)=1 if i~=j, and Cost(i,j)=0 if i=j.

Data Types: single | double | struct
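For instance, a sketch that makes misclassifying a true 'versicolor' observation five times as costly as the other errors (the cost values here are illustrative):

```matlab
load fisheriris
% Rows = true class, columns = predicted class, in ClassNames order
C = [0 1 1; 5 0 5; 1 1 0];
Mdl = fitcdiscr(meas, species, 'Cost', C, ...
    'ClassNames', {'setosa','versicolor','virginica'});
```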

Coeffs property flag, specified as the comma-separated pair consisting of 'FillCoeffs' and 'on' or 'off'. Setting the flag to 'on' populates the Coeffs property in the classifier object. This can be computationally intensive, especially when cross-validating. The default is 'on', unless you specify a cross-validation name-value pair argument, in which case the flag is set to 'off' by default.

Example: 'FillCoeffs','off'

Predictor variable names, specified as the comma-separated pair consisting of 'PredictorNames' and a cell array of unique character vectors. The functionality of 'PredictorNames' depends on the way you supply the training data.

  • If you supply X and Y, then you can use 'PredictorNames' to give the predictor variables in X names.

    • The order of the names in PredictorNames must correspond to the column order of X. That is, PredictorNames{1} is the name of X(:,1), PredictorNames{2} is the name of X(:,2), and so on. Also, size(X,2) and numel(PredictorNames) must be equal.

    • By default, PredictorNames is {'x1','x2',...}.

  • If you supply Tbl, then you can use 'PredictorNames' to choose which predictor variables to use in training. That is, fitcdiscr uses the predictor variables in PredictorNames and the response only in training.

    • PredictorNames must be a subset of Tbl.Properties.VariableNames and cannot include the name of the response variable.

    • By default, PredictorNames contains the names of all predictor variables.

    • It is good practice to specify the predictors for training using either 'PredictorNames' or formula, but not both.

Example: 'PredictorNames',{'SepalLength','SepalWidth','PetalLength','PetalWidth'}

Data Types: cell

Prior probabilities for each class, specified as the comma-separated pair consisting of 'Prior' and a value in this table.

  • 'empirical': The class prior probabilities are the class relative frequencies in Y.

  • 'uniform': All class prior probabilities are equal to 1/K, where K is the number of classes.

  • numeric vector: Each element is a class prior probability. Order the elements according to Mdl.ClassNames or specify the order using the ClassNames name-value pair argument. The software normalizes the elements such that they sum to 1.

  • structure: A structure S with two fields:

    • S.ClassNames contains the class names as a variable of the same type as Y.

    • S.ClassProbs contains a vector of corresponding prior probabilities. The software normalizes the elements such that they sum to 1.

If you set values for both Weights and Prior, the weights are renormalized to add up to the value of the prior probability in the respective class.

Example: 'Prior','uniform'

Data Types: single | double | struct
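For example, a sketch specifying the priors as a numeric vector (the probability values are illustrative; the order follows ClassNames):

```matlab
load fisheriris
Mdl = fitcdiscr(meas, species, 'Prior', [0.5 0.25 0.25], ...
    'ClassNames', {'setosa','versicolor','virginica'});
```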

Response variable name, specified as the comma-separated pair consisting of 'ResponseName' and a character vector.

  • If you supply Y, then you can use 'ResponseName' to specify a name for the response variable.

  • If you supply ResponseVarName or formula, then you cannot use 'ResponseName'.

Example: 'ResponseName','response'

Data Types: char

Flag to save covariance matrix, specified as the comma-separated pair consisting of 'SaveMemory' and either 'on' or 'off'. If you specify 'on', then fitcdiscr does not store the full covariance matrix, but instead stores enough information to compute the matrix. The predict method computes the full covariance matrix for prediction, and does not store the matrix. If you specify 'off', then fitcdiscr computes and stores the full covariance matrix in Mdl.

Specify SaveMemory as 'on' when the input matrix contains thousands of predictors.

Example: 'SaveMemory','on'

Score transform function, specified as the comma-separated pair consisting of 'ScoreTransform' and a function handle or value in this table.

  • 'doublelogit': 1/(1 + e^(–2x))

  • 'invlogit': log(x / (1 – x))

  • 'ismax': Sets the score for the class with the largest score to 1, and the scores for all other classes to 0.

  • 'logit': 1/(1 + e^(–x))

  • 'none' or 'identity': x (no transformation)

  • 'sign': –1 for x < 0, 0 for x = 0, 1 for x > 0

  • 'symmetric': 2x – 1

  • 'symmetriclogit': 2/(1 + e^(–x)) – 1

  • 'symmetricismax': Sets the score for the class with the largest score to 1, and the scores for all other classes to –1.

For a MATLAB® function, or a function that you define, enter its function handle.

Mdl.ScoreTransform = @function;

function should accept a matrix (the original scores) and return a matrix of the same size (the transformed scores).

Example: 'ScoreTransform','logit'

Data Types: function_handle | char
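For example, a sketch of setting a custom transform with an anonymous function (this particular handle is equivalent to the built-in 'logit'):

```matlab
load fisheriris
Mdl = fitcdiscr(meas, species);
% The handle must map a score matrix to a same-size matrix
Mdl.ScoreTransform = @(x) 1./(1 + exp(-x));
```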

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a numeric vector of positive values or name of a variable in Tbl. The software weighs the observations in each row of X or Tbl with the corresponding value in Weights. The size of Weights must equal the number of rows of X or Tbl.

If you specify the input data as a table Tbl, then Weights can be the name of a variable in Tbl that contains a numeric vector. In this case, you must specify Weights as a character vector. For example, if the weights vector W is stored as Tbl.W, then specify it as 'W'. Otherwise, the software treats all columns of Tbl, including W, as predictors or the response when training the model.

The software normalizes Weights to sum up to the value of the prior probability in the respective class.

By default, Weights is ones(n,1), where n is the number of observations in X or Tbl.

Data Types: double | single | char

Cross-Validation

Cross-validation flag, specified as the comma-separated pair consisting of 'Crossval' and 'on' or 'off'.

If you specify 'on', then the software implements 10-fold cross-validation.

To override this cross-validation setting, use one of these name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout. To create a cross-validated model, you can use one cross-validation name-value pair argument at a time only.

Alternatively, cross-validate later by passing Mdl to crossval.

Example: 'CrossVal','on'
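A sketch of both routes; kfoldLoss then returns the cross-validation misclassification rate:

```matlab
load fisheriris
CVMdl = fitcdiscr(meas, species, 'CrossVal', 'on');  % 10-fold by default
L = kfoldLoss(CVMdl);

% Equivalently, train first and cross-validate afterward
Mdl = fitcdiscr(meas, species);
CVMdl2 = crossval(Mdl);
```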

Cross-validation partition, specified as the comma-separated pair consisting of 'CVPartition' and a cvpartition partition object as created by cvpartition. The partition object specifies the type of cross-validation, and also the indexing for training and validation sets.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Fraction of data used for holdout validation, specified as the comma-separated pair consisting of 'Holdout' and a scalar value in the range (0,1). If you specify 'Holdout',p, then the software:

  1. Randomly reserves p*100% of the data as validation data, and trains the model using the rest of the data

  2. Stores the compact, trained model in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Holdout',0.1

Data Types: double | single

Number of folds to use in a cross-validated classifier, specified as the comma-separated pair consisting of 'KFold' and a positive integer value greater than 1. If you specify, e.g., 'KFold',k, then the software:

  1. Randomly partitions the data into k sets

  2. For each set, reserves the set as validation data, and trains the model using the other k – 1 sets

  3. Stores the k compact, trained models in the cells of a k-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'KFold',5

Data Types: single | double

Leave-one-out cross-validation flag, specified as the comma-separated pair consisting of 'Leaveout' and 'on' or 'off'. If you specify 'Leaveout','on', then, for each of the n observations, where n is size(Mdl.X,1), the software:

  1. Reserves the observation as validation data, and trains the model using the other n – 1 observations

  2. Stores the n compact, trained models in the cells of an n-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Leaveout','on'

Data Types: char

Hyperparameters

Linear coefficient threshold, specified as the comma-separated pair consisting of 'Delta' and a nonnegative scalar value. If a coefficient of Mdl has magnitude smaller than Delta, Mdl sets this coefficient to 0, and you can eliminate the corresponding predictor from the model. Set Delta to a higher value to eliminate more predictors.

Delta must be 0 for quadratic discriminant models.

Data Types: single | double
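For example, a sketch using an illustrative threshold value; coefficients with magnitude below Delta appear as exact zeros in the Coeffs property:

```matlab
load fisheriris
Mdl = fitcdiscr(meas, species, 'Delta', 0.5);
Mdl.Coeffs(1,2).Linear  % entries below the threshold are set to 0
```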

Discriminant type, specified as the comma-separated pair consisting of 'DiscrimType' and a character vector in this table.

  • 'linear': Regularized linear discriminant analysis (LDA).

    • All classes have the same covariance matrix.

    • Σ̂_γ = (1 – γ)Σ̂ + γ diag(Σ̂), where Σ̂ is the empirical, pooled covariance matrix and γ is the amount of regularization.

  • 'diaglinear': LDA. All classes have the same, diagonal covariance matrix.

  • 'pseudolinear': LDA. All classes have the same covariance matrix. The software inverts the covariance matrix using the pseudo inverse.

  • 'quadratic': Quadratic discriminant analysis (QDA). The covariance matrices can vary among classes.

  • 'diagquadratic': QDA. The covariance matrices are diagonal and can vary among classes.

  • 'pseudoquadratic': QDA. The covariance matrices can vary among classes. The software inverts the covariance matrix using the pseudo inverse.

    Note:   To use regularization, you must specify 'linear'. To specify the amount of regularization, use the Gamma name-value pair argument.

Example: 'DiscrimType','quadratic'

Amount of regularization to apply when estimating the covariance matrix of the predictors, specified as the comma-separated pair consisting of 'Gamma' and a scalar value in the interval [0,1]. Gamma provides finer control over the covariance matrix structure than DiscrimType.

  • If you specify 0, then the software does not use regularization to adjust the covariance matrix. That is, the software estimates and uses the unrestricted, empirical covariance matrix.

    • For linear discriminant analysis, if the empirical covariance matrix is singular, then the software automatically applies the minimal regularization required to invert the covariance matrix. You can display the chosen regularization amount by entering Mdl.Gamma at the command line.

    • For quadratic discriminant analysis, if at least one class has an empirical covariance matrix that is singular, then the software throws an error.

  • If you specify a value in the interval (0,1), then you must use linear discriminant analysis; otherwise, the software throws an error. In this case, the software sets DiscrimType to 'linear'.

  • If you specify 1, then the software uses maximum regularization for covariance matrix estimation. That is, the software restricts the covariance matrix to be diagonal. Alternatively, you can set DiscrimType to 'diagLinear' or 'diagQuadratic' for diagonal covariance matrices.

Example: 'Gamma',1

Data Types: single | double
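For example, a sketch comparing no regularization with maximum regularization (which restricts the estimate to a diagonal covariance matrix):

```matlab
load fisheriris
Mdl0 = fitcdiscr(meas, species, 'Gamma', 0);  % unrestricted empirical covariance
Mdl1 = fitcdiscr(meas, species, 'Gamma', 1);  % diagonal covariance estimate
```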

Hyperparameter Optimization

Parameters to optimize, specified as:

  • 'none' — Do not optimize.

  • 'auto' — Use {'Delta','Gamma'}

  • 'all' — Optimize all eligible parameters.

  • Cell array of eligible parameter names

  • Vector of optimizableVariable objects, typically the output of hyperparameters

The optimization attempts to minimize the cross-validation loss (error) for fitcdiscr by varying the parameters. For information about cross-validation loss (albeit in a different context), see Classification Loss. To control the cross-validation type and other aspects of the optimization, use the HyperparameterOptimizationOptions name-value pair.

The eligible parameters for fitcdiscr are:

  • Delta: fitcdiscr searches among positive values, by default log-scaled in the range [1e-6,1e3].

  • DiscrimType: fitcdiscr searches among 'linear', 'quadratic', 'diagLinear', 'diagQuadratic', 'pseudoLinear', and 'pseudoQuadratic'.

  • Gamma: fitcdiscr searches among real values in the range [0,1].

Set nondefault parameters by passing a vector of optimizableVariable objects that have nondefault values. For example,

load fisheriris
params = hyperparameters('fitcdiscr',meas,species);
params(1).Range = [1e-4,1e6];

Pass params as the value of OptimizeHyperparameters.
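A sketch of the completed call, repeating the setup so the snippet is self-contained:

```matlab
load fisheriris
params = hyperparameters('fitcdiscr', meas, species);
params(1).Range = [1e-4, 1e6];
% Optimize over the modified variable descriptions
Mdl = fitcdiscr(meas, species, 'OptimizeHyperparameters', params);
```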

By default, iterative display appears at the command line, and plots appear according to the number of hyperparameters in the optimization. For the optimization and plots, the objective function is log(1 + cross-validation loss) for regression, and the misclassification rate for classification. To control the iterative display, set the HyperparameterOptimizationOptions name-value pair, Verbose field. To control the plots, set the HyperparameterOptimizationOptions name-value pair, ShowPlots field.

For an example, see Optimize Discriminant Analysis Model.

Example: 'auto'

Data Types: char | cell

Options for optimization, specified as a structure. Modifies the effect of the OptimizeHyperparameters name-value pair. All fields in the structure are optional.

Field names, values, and defaults:

  • Optimizer (default: 'bayesopt'):

    • 'bayesopt': Use Bayesian optimization. Internally, this setting calls bayesopt.

    • 'gridsearch': Use grid search with NumGridDivisions values per dimension.

    • 'randomsearch': Search at random among MaxObjectiveEvaluations points.

    'gridsearch' searches in a random order, using uniform sampling without replacement from the grid. After optimization, you can get a table in grid order by using the command sortrows(Mdl.ParameterOptimizationResults).

  • AcquisitionFunctionName (default: 'expected-improvement-per-second-plus'):

    • 'expected-improvement-per-second-plus'

    • 'expected-improvement'

    • 'expected-improvement-plus'

    • 'expected-improvement-per-second'

    • 'lower-confidence-bound'

    • 'probability-of-improvement'

    For details, see the bayesopt AcquisitionFunctionName name-value pair, or Acquisition Function Types.

  • MaxObjectiveEvaluations: Maximum number of objective function evaluations. The default is 30 for 'bayesopt' or 'randomsearch', and the entire grid for 'gridsearch'.

  • NumGridDivisions: For 'gridsearch', the number of values in each dimension. The value can be a vector of positive integers giving the number of values for each dimension, or a scalar that applies to all dimensions. This field is ignored for categorical variables. The default is 10.

  • ShowPlots: Logical value indicating whether to show plots. If true, plots the best objective function value against iteration number. If there are one or two optimization parameters, and if Optimizer is 'bayesopt', then ShowPlots also plots a model of the objective function against the parameters. The default is true.

  • SaveIntermediateResults: Logical value indicating whether to save results when Optimizer is 'bayesopt'. If true, overwrites a workspace variable named 'BayesoptResults' at each iteration. The variable is a BayesianOptimization object. The default is false.

  • Verbose: Display to the command line (default: 1):

    • 0: No iterative display

    • 1: Iterative display

    • 2: Iterative display with extra information

    For details, see the bayesopt Verbose name-value pair.

  • Repartition: Logical value indicating whether to repartition the cross-validation at every iteration. If false, the optimizer uses a single partition for the optimization. true usually gives the most robust results because this setting takes partitioning noise into account. However, for good results, true requires at least twice as many function evaluations. The default is false.

  • Use no more than one of the following three field names. The default is Kfold = 5.

    • CVPartition: A cvpartition object, as created by cvpartition.

    • Holdout: A scalar in the range (0,1) representing the holdout fraction.

    • Kfold: An integer greater than 1.

Example: struct('MaxObjectiveEvaluations',60)

Data Types: struct
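For example, a sketch combining several of the optional fields (the field values chosen here are illustrative):

```matlab
load fisheriris
rng(1)  % for reproducibility
Mdl = fitcdiscr(meas, species, 'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', ...
    struct('MaxObjectiveEvaluations', 60, 'Kfold', 10, 'ShowPlots', false));
```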

Output Arguments

Trained discriminant analysis classification model, returned as a ClassificationDiscriminant model object or a ClassificationPartitionedModel cross-validated model object.

If you set any of the name-value pair arguments KFold, Holdout, CrossVal, or CVPartition, then Mdl is a ClassificationPartitionedModel cross-validated model object. Otherwise, Mdl is a ClassificationDiscriminant model object.

To reference properties of Mdl, use dot notation. For example, to display the estimated component means at the Command Window, enter Mdl.Mu.

Alternative Functionality

Functions

The classify function also performs discriminant analysis. classify is usually more awkward to use.

  • classify requires you to fit the classifier every time you make a new prediction.

  • classify does not perform cross validation or hyperparameter optimization.

  • classify requires you to fit the classifier when changing prior probabilities.

More About

Discriminant Classification

The model for discriminant analysis is:

  • Each class (Y) generates data (X) using a multivariate normal distribution. That is, the model assumes X has a Gaussian mixture distribution (gmdistribution).

    • For linear discriminant analysis, the model has the same covariance matrix for each class, only the means vary.

    • For quadratic discriminant analysis, both means and covariances of each class vary.

predict classifies so as to minimize the expected classification cost:

ŷ = argmin_{y=1,...,K} Σ_{k=1}^{K} P̂(k|x) C(y|k),

where

  • y^ is the predicted classification.

  • K is the number of classes.

  • P^(k|x) is the posterior probability of class k for observation x.

  • C(y|k) is the cost of classifying an observation as y when its true class is k.

For details, see How the predict Method Classifies.
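In practice you rarely evaluate this sum yourself; the third output of predict returns the expected misclassification cost of each class for each observation:

```matlab
load fisheriris
Mdl = fitcdiscr(meas, species);
% cost(i,k) is the expected cost of classifying observation i into class k
[label, score, cost] = predict(Mdl, meas(1:3,:));
```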

Tall Array Support

This function supports tall arrays for out-of-memory data with some limitations.

  • Supported name-value pairs are:

    • 'ClassNames'

    • 'Cost'

    • 'DiscrimType'

    • 'PredictorNames'

    • 'Prior'

    • 'ResponseName'

    • 'ScoreTransform'

    • 'Weights'

  • For tall arrays and tall tables, fitcdiscr returns a CompactClassificationDiscriminant object, which contains most of the same properties as a ClassificationDiscriminant object. The main difference is that the compact object is sensitive to memory requirements. The compact object does not include properties that include the data, or that include an array of the same size as the data. The compact object does not contain these ClassificationDiscriminant properties:

    • ModelParameters

    • NumObservations

    • ParameterOptimizationResults

    • RowsUsed

    • XCentered

    • W

    • X

    • Y

    Additionally, the compact object does not support these ClassificationDiscriminant methods:

    • compact

    • crossval

    • cvshrink

    • resubEdge

    • resubLoss

    • resubMargin

    • resubPredict

For more information, see Tall Arrays.

Introduced in R2014a
