
Discriminant analysis

`class = classify(sample,training,group)`

`class = classify(sample,training,group,'type')`

`class = classify(sample,training,group,'type',prior)`

`[class,err] = classify(...)`

`[class,err,POSTERIOR] = classify(...)`

`[class,err,POSTERIOR,logp] = classify(...)`

`[class,err,POSTERIOR,logp,coeff] = classify(...)`

`class = classify(sample,training,group)` classifies each row of the data in `sample` into one of the groups in `training`. `sample` and `training` must be matrices with the same number of columns. `group` is a grouping variable for `training`. Its unique values define groups; each element defines the group to which the corresponding row of `training` belongs. `group` can be a categorical variable, a numeric vector, a character array, a string array, or a cell array of character vectors. `training` and `group` must have the same number of rows. `classify` treats `<undefined>` values, `NaN`s, empty character vectors, empty strings, and `<missing>` string values in `group` as missing data values, and ignores the corresponding rows of `training`. The output `class` indicates the group to which each row of `sample` has been assigned, and is of the same type as `group`.
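For example, the basic syntax can be sketched as follows. This assumes the Statistics and Machine Learning Toolbox and its `fisheriris` sample data set are available; the odd/even split is purely illustrative.

```matlab
% Classify iris measurements into species.
load fisheriris                      % meas (150-by-4), species (150-by-1 cell)
training = meas(1:2:end,:);          % odd rows as training data
group    = species(1:2:end);         % grouping variable for training
sample   = meas(2:2:end,:);          % even rows to classify
class = classify(sample,training,group);
% class is a cell array of character vectors, the same type as group
```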

`class = classify(sample,training,group,'type')` allows you to specify the type of discriminant function. Specify `type` as one of the following:

`'linear'` — Fits a multivariate normal density to each group, with a pooled estimate of covariance. This is the default.

`'diaglinear'` — Similar to `'linear'`, but with a diagonal covariance matrix estimate (naive Bayes classifiers).

`'quadratic'` — Fits multivariate normal densities with covariance estimates stratified by group.

`'diagquadratic'` — Similar to `'quadratic'`, but with a diagonal covariance matrix estimate (naive Bayes classifiers).

`'mahalanobis'` — Uses Mahalanobis distances with stratified covariance estimates.
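A sketch of selecting a discriminant type, reusing the illustrative iris split from the example above (assumes `fisheriris` is available):

```matlab
load fisheriris
training = meas(1:2:end,:);  group = species(1:2:end);
sample   = meas(2:2:end,:);
% Per-group covariance estimates instead of the pooled default:
classQ = classify(sample,training,group,'quadratic');
% Diagonal covariance (naive Bayes assumption):
classD = classify(sample,training,group,'diaglinear');
```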

`class = classify(sample,training,group,'type',prior)` allows you to specify prior probabilities for the groups. `prior` can be one of the following:

A numeric vector the same length as the number of unique values in `group` (or the number of levels defined for `group`, if `group` is categorical). If `group` is numeric or categorical, the order of `prior` must correspond to the ordered values in `group`. Otherwise, the order of `prior` must correspond to the order of the first occurrence of the values in `group`.

A 1-by-1 structure with fields `prob` — a numeric vector — and `group` — of the same type as `group`, containing unique values indicating the groups to which the elements of `prob` correspond. As a structure, `prior` can contain groups that do not appear in `group`. This can be useful if `training` is a subset of a larger training set. `classify` ignores any groups that appear in the structure but not in the `group` array.

The character vector or string scalar `'empirical'`, indicating that group prior probabilities should be estimated from the group relative frequencies in `training`.

`prior` defaults to a numeric vector of equal probabilities, i.e., a uniform distribution.
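The two most common forms of `prior` can be sketched as follows (same illustrative iris split as above; the prior values are arbitrary):

```matlab
load fisheriris
training = meas(1:2:end,:);  group = species(1:2:end);
sample   = meas(2:2:end,:);
% Numeric vector, ordered by first occurrence of the values in group:
class1 = classify(sample,training,group,'linear',[0.5 0.25 0.25]);
% Estimate priors from the group relative frequencies in training:
class2 = classify(sample,training,group,'linear','empirical');
```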

`[class,err] = classify(...)` also returns an estimate `err` of the misclassification error rate based on the `training` data. `classify` returns the apparent error rate, i.e., the percentage of observations in `training` that are misclassified, weighted by the prior probabilities for the groups.

`[class,err,POSTERIOR] = classify(...)` also returns a matrix `POSTERIOR` of estimates of the posterior probabilities that the *j*th training group was the source of the *i*th sample observation, i.e., *Pr*(*group j*|*obs i*). `POSTERIOR` is not computed for Mahalanobis discrimination.

`[class,err,POSTERIOR,logp] = classify(...)` also returns a vector `logp` containing estimates of the logarithms of the unconditional predictive probability density of the sample observations, *p*(*obs i*) = ∑*p*(*obs i*|*group j*)*Pr*(*group j*), summed over all groups. `logp` is not computed for Mahalanobis discrimination.
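Retrieving these diagnostic outputs can be sketched as follows (same illustrative iris split; not available with `'mahalanobis'`):

```matlab
load fisheriris
training = meas(1:2:end,:);  group = species(1:2:end);
sample   = meas(2:2:end,:);
[class,err,POSTERIOR,logp] = classify(sample,training,group,'linear');
% err is the apparent (prior-weighted) training error rate.
% POSTERIOR(i,j) estimates Pr(group j | obs i); rows sum to 1.
% logp(i) estimates log p(obs i), marginalized over the groups.
```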

`[class,err,POSTERIOR,logp,coeff] = classify(...)` also returns a structure array `coeff` containing coefficients of the boundary curves between pairs of groups. Each element `coeff(I,J)` contains information for comparing group `I` to group `J` in the following fields:

`type` — Type of discriminant function, from the `type` input.

`name1` — Name of the first group.

`name2` — Name of the second group.

`const` — Constant term of the boundary equation (K).

`linear` — Linear coefficients of the boundary equation (L).

`quadratic` — Quadratic coefficient matrix of the boundary equation (Q).

For the `linear` and `diaglinear` types, the `quadratic` field is absent, and a row `x` from the `sample` array is classified into group `I` rather than group `J` if `0 < K+x*L`. For the other types, `x` is classified into group `I` if `0 < K+x*L+x*Q*x'`.
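Reading the boundary coefficients can be sketched as follows (same illustrative iris split; the group indices 1 and 2 are just one pair):

```matlab
load fisheriris
training = meas(1:2:end,:);  group = species(1:2:end);
sample   = meas(2:2:end,:);
[class,err,POSTERIOR,logp,coeff] = classify(sample,training,group,'linear');
K = coeff(1,2).const;    % constant term of the group 1 vs. group 2 boundary
L = coeff(1,2).linear;   % linear coefficients of that boundary
% A row x is assigned to group 1 rather than group 2 when 0 < K + x*L:
x = sample(1,:);
side = K + x*L;
```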

The `fitcdiscr` function also performs discriminant analysis. You can train a classifier by using the `fitcdiscr` function and predict labels of new data by using the `predict` function. `fitcdiscr` supports cross-validation and hyperparameter optimization, and does not require you to fit the classifier every time you make a new prediction or change prior probabilities.
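That workflow can be sketched as follows (assumes `fisheriris` is available; the new observation and prior values are illustrative):

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);            % train once
label = predict(Mdl,[5.0 3.5 1.4 0.2]);   % reuse the model for new data
% Prior probabilities can be changed without refitting:
Mdl.Prior = [0.5 0.25 0.25];
```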

[1] Krzanowski, W. J. *Principles
of Multivariate Analysis: A User's Perspective*. New York:
Oxford University Press, 1988.

[2] Seber, G. A. F. *Multivariate
Observations*. Hoboken, NJ: John Wiley & Sons, Inc.,
1984.