# predict

Predict class labels using a discriminant analysis classifier

## Syntax

```
label = predict(obj,X)
[label,score] = predict(obj,X)
[label,score,cost] = predict(obj,X)
```

## Description

label = predict(obj,X) returns a vector of predicted class labels for a matrix X, based on obj, a trained full or compact classifier.

[label,score] = predict(obj,X) also returns a matrix of scores (posterior probabilities), with one row per observation and one column per class.

[label,score,cost] = predict(obj,X) also returns a matrix of expected classification costs; label contains, for each row of cost, the class with the minimal expected cost.

## Input Arguments

• obj — Discriminant analysis classifier of class ClassificationDiscriminant or CompactClassificationDiscriminant, typically constructed with fitcdiscr.

• X — Matrix in which each row represents an observation and each column represents a predictor. The number of columns in X must equal the number of predictors in obj.

## Output Arguments

• label — Vector of class labels of the same type as the response data used in training obj. Each entry of label is the predicted class label for the corresponding row of X; see Predicted Class Label.

• score — Numeric matrix of size N-by-K, where N is the number of observations (rows) in X and K is the number of classes (in obj.ClassNames). score(i,j) is the posterior probability that row i of X is of class j; see Posterior Probability.

• cost — Matrix of expected classification costs of size N-by-K. cost(i,j) is the expected cost of classifying row i of X as class j; see Cost.

## Definitions

### Posterior Probability

The posterior probability that a point x belongs to class k is proportional to the product of the prior probability and the multivariate normal density. The density function of the multivariate normal with mean μk and covariance Σk at a point x is

$P\left(x|k\right)=\frac{1}{{\left({\left(2\pi \right)}^{d}|{\Sigma }_{k}|\right)}^{1/2}}\mathrm{exp}\left(-\frac{1}{2}{\left(x-{\mu }_{k}\right)}^{T}{\Sigma }_{k}^{-1}\left(x-{\mu }_{k}\right)\right),$

where d is the number of predictors, $|{\Sigma }_{k}|$ is the determinant of Σk, and ${\Sigma }_{k}^{-1}$ is the inverse matrix.

Let P(k) represent the prior probability of class k. Then the posterior probability that an observation x is of class k is

$\stackrel{^}{P}\left(k|x\right)=\frac{P\left(x|k\right)P\left(k\right)}{P\left(x\right)},$

where P(x) is a normalization constant, the sum over k of P(x|k)P(k).
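To make this concrete, here is a minimal sketch that recomputes the posterior by hand for one observation of the Fisher iris data (used again in the Examples section) and compares it with the score output of predict. It assumes obj is a linear discriminant, the fitcdiscr default, so obj.Sigma holds a single pooled covariance matrix; for a quadratic discriminant, obj.Sigma has one covariance per class and the mvnpdf call would need obj.Sigma(:,:,k).

```
% Minimal sketch: recover P(k|x) from the class means, covariance, and
% prior, then compare with the score that predict returns.
load fisheriris
obj = fitcdiscr(meas,species);       % linear discriminant: pooled Sigma
x = meas(100,:);                     % one observation
K = numel(obj.ClassNames);
lik = zeros(1,K);
for k = 1:K
    lik(k) = mvnpdf(x, obj.Mu(k,:), obj.Sigma);  % P(x|k)
end
post = lik .* obj.Prior(:)';         % P(x|k)P(k)
post = post / sum(post);             % divide by P(x) to normalize
[~,score] = predict(obj,x);
disp([post; score])                  % the two rows should agree to rounding
```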

### Prior Probability

The prior probability is one of three choices:

• 'uniform' — The prior probability of class k is one over the total number of classes.

• 'empirical' — The prior probability of class k is the number of training samples of class k divided by the total number of training samples.

• Custom — The prior probability of class k is the kth element of the prior vector. See fitcdiscr and the sketch after this list.
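As a hedged sketch of the three choices at training time, using the Fisher iris data from the Examples section ('empirical' is the fitcdiscr default):

```
% Sketch: three ways to set the prior when training with fitcdiscr.
load fisheriris
objU = fitcdiscr(meas,species,'Prior','uniform');   % 1/K per class
objE = fitcdiscr(meas,species,'Prior','empirical'); % training frequencies
objC = fitcdiscr(meas,species,'Prior',[2 1 1]/4);   % custom vector
```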

After creating a classifier obj, you can set the prior using dot notation:

`obj.Prior = v;`

where v is a vector of positive elements representing the frequency with which each class occurs. You do not need to retrain the classifier when you set a new prior.
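For example, the following sketch reweights the iris classifier from the Examples section and predicts again without refitting; the new prior vector here is illustrative only:

```
% Sketch: change the prior after training; predictions update immediately.
obj.Prior = [0.5 0.25 0.25];          % hypothetical class frequencies
label = predict(obj,meas(99:102,:));  % no retraining required
```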

### Cost

The matrix of expected costs per observation has size N-by-K. cost(i,j) is the expected cost of classifying row i of X as class j, averaged over the possible true classes using the posterior probabilities:

$\mathrm{cost}\left(i,j\right)=\sum _{k=1}^{K}\stackrel{^}{P}\left(k|{x}_{i}\right)C\left(j|k\right),$

where C(j|k) is the cost of classifying an observation as j when its true class is k (see Predicted Class Label).

### Predicted Class Label

predict classifies so as to minimize the expected classification cost:

$\stackrel{^}{y}=\underset{y=1,...,K}{\mathrm{arg}\mathrm{min}}\sum _{k=1}^{K}\stackrel{^}{P}\left(k|x\right)C\left(y|k\right),$

where

• $\stackrel{^}{y}$ is the predicted classification.

• K is the number of classes.

• $\stackrel{^}{P}\left(k|x\right)$ is the posterior probability of class k for observation x.

• $C\left(y|k\right)$ is the cost of classifying an observation as y when its true class is k.
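Assuming obj and X from the Examples section, the following sketch reproduces this rule from the predict outputs; obj.Cost(k,y) stores C(y|k), so the expected-cost matrix is the product score*obj.Cost.

```
% Sketch: recompute the cost output and the minimal-cost label by hand.
[label,score,cost] = predict(obj,X);
expCost = score * obj.Cost;         % expCost(i,y) = sum_k P(k|x_i)C(y|k)
max(abs(expCost(:) - cost(:)))      % should be near zero
[~,idx] = min(expCost,[],2);        % argmin over candidate classes y
labelByHand = obj.ClassNames(idx)   % matches label
```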

## Examples

Examine predictions for a few rows in the Fisher iris data:

```
load fisheriris
obj = fitcdiscr(meas,species);
X = meas(99:102,:); % take four rows
[label,score,cost] = predict(obj,X)

label =
    'versicolor'
    'versicolor'
    'virginica'
    'virginica'

score =
    0.0000    1.0000    0.0000
    0.0000    0.9999    0.0001
    0.0000    0.0000    1.0000
    0.0000    0.0011    0.9989

cost =
    1.0000    0.0000    1.0000
    1.0000    0.0001    0.9999
    1.0000    1.0000    0.0000
    1.0000    0.9989    0.0011
```