# resubPredict

Class: ClassificationSVM

Predict support vector machine classifier resubstitution responses

## Syntax

• ``label = resubPredict(SVMModel)``
• ``[label,Score] = resubPredict(SVMModel)``

## Description

`label = resubPredict(SVMModel)` returns a vector of predicted class labels (`label`) for the trained support vector machine (SVM) classifier `SVMModel` using the predictor data `SVMModel.X`.

`[label,Score] = resubPredict(SVMModel)` additionally returns class likelihood measures, that is, either scores or posterior probabilities.

## Input Arguments

`SVMModel` is a full, trained SVM classifier, specified as a `ClassificationSVM` model trained using `fitcsvm`.

## Output Arguments

Predicted class labels, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors.

`label`:

• Is the same data type as the observed class labels (`SVMModel.Y`)

• Has length equal to the number of rows of `SVMModel.X`

For one-class learning, every element of `label` is the one class represented in `SVMModel.Y`.

`Score` contains the predicted class scores or posterior probabilities, returned as a numeric column vector or numeric matrix.

• For one-class learning, `Score` is a column vector with the same number of rows as `SVMModel.X`. The elements are the positive class scores for the corresponding observations. You cannot obtain posterior probabilities for one-class learning.

• For two-class learning, `Score` is a two-column matrix with the same number of rows as `SVMModel.X`.

• If you fit the optimal score-to-posterior-probability transformation function using `fitPosterior` or `fitSVMPosterior`, then `Score` contains class posterior probabilities. That is, if the value of `SVMModel.ScoreTransform` is not `none`, then the first and second columns of `Score` contain the negative class (`SVMModel.ClassNames{1}`) and positive class (`SVMModel.ClassNames{2}`) posterior probabilities for the corresponding observations, respectively.

• Otherwise, the elements of the first column are the negative class scores and the elements of the second column are the positive class scores for the corresponding observations.

If `SVMModel.KernelParameters.Function` is `'linear'`, then the software estimates the classification score for the observation x using

`$f\left(x\right)={\left(x/s\right)}^{\prime }\beta +b.$`

`SVMModel` stores β, b, and s in the properties `Beta`, `Bias`, and `KernelParameters.Scale`, respectively.

Data Types: `double` | `single`
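The linear-kernel score formula given for `Score` can be checked numerically. The following is a minimal sketch, not a definitive recipe: it assumes the `ionosphere` data set used in the example on this page, trains without standardization so that `SVMModel.X` can be plugged into the formula directly, and uses an illustrative variable name `f` that does not appear in the software.

```matlab
load ionosphere

% Train a linear SVM without standardization so SVMModel.X is used as is.
SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'KernelFunction','linear');

% Positive class scores are in the second column of the score matrix.
[~,score] = resubPredict(SVMModel);

% Recompute the scores from the stored properties: f(x) = (x/s)'*beta + b.
s = SVMModel.KernelParameters.Scale;
f = (SVMModel.X/s)*SVMModel.Beta + SVMModel.Bias;

% Largest absolute discrepancy; expected to be at round-off level.
maxDiff = max(abs(f - score(:,2)))
```

If the model is trained with `'Standardize',true`, the predictors would first need to be standardized with the stored means and standard deviations before applying the formula.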

## Definitions

### Classification Score

The SVM classification score for classifying observation x is the signed distance from x to the decision boundary, ranging from -∞ to +∞. A positive score for a class indicates that x is predicted to be in that class; a negative score indicates otherwise.

The score for predicting x into the positive class, which is also the numerical predicted response for x, $f\left(x\right)$, is the trained SVM classification function

`$f\left(x\right)=\sum _{j=1}^{n}{\alpha }_{j}{y}_{j}G\left({x}_{j},x\right)+b,$`

where $\left({\alpha }_{1},...,{\alpha }_{n},b\right)$ are the estimated SVM parameters, $G\left({x}_{j},x\right)$ is the dot product in the predictor space between x and the support vectors, and the sum includes the training set observations. The score for predicting x into the negative class is –f(x).

If $G\left({x}_{j},x\right)={x}_{j}^{\prime }x$ (the linear kernel), then the score function reduces to

`$f\left(x\right)={\left(x/s\right)}^{\prime }\beta +b.$`

s is the kernel scale and β is the vector of fitted linear coefficients.
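The reduction can be sketched in a few steps. This derivation assumes that the kernel scale s divides both kernel arguments, so that the linear kernel is evaluated as $(x_j/s)^{\prime}(x/s)$; that scaling convention is an assumption made here for illustration and is not stated explicitly above.

`$\begin{array}{rl}f\left(x\right)&=\sum _{j=1}^{n}{\alpha }_{j}{y}_{j}{\left({x}_{j}/s\right)}^{\prime }\left(x/s\right)+b\\ &={\left(x/s\right)}^{\prime }\sum _{j=1}^{n}{\alpha }_{j}{y}_{j}\left({x}_{j}/s\right)+b\\ &={\left(x/s\right)}^{\prime }\beta +b,\phantom{\rule{1em}{0ex}}\text{where }\beta =\sum _{j=1}^{n}{\alpha }_{j}{y}_{j}\left({x}_{j}/s\right).\end{array}$`

The second line uses the symmetry of the dot product, $a^{\prime}c = c^{\prime}a$, to pull $(x/s)^{\prime}$ out of the sum.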

### Posterior Probability

The probability that an observation belongs in a particular class, given the data.

For SVM, the posterior probability is a function of the score, P(s), that observation j is in class k = {-1,1}.

• For separable classes, the posterior probability is the step function

`$P\left({s}_{j}\right)=\left\{\begin{array}{ll}0;& {s}_{j}<\underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}\\ \pi ;& \underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}\le {s}_{j}\le \underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}\\ 1;& {s}_{j}>\underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}\end{array}\right.,$`

where:

• $s_j$ is the score of observation j.

• +1 and –1 denote the positive and negative classes, respectively.

• π is the prior probability that an observation is in the positive class.

• For inseparable classes, the posterior probability is the sigmoid function

`$P\left({s}_{j}\right)=\frac{1}{1+\mathrm{exp}\left(A{s}_{j}+B\right)},$`

where A and B are the slope and intercept parameters, respectively.
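The two score-to-posterior mappings above can be evaluated directly. The sketch below uses made-up scores and labels, an assumed positive class prior `priorPos`, and hypothetical sigmoid parameters `A` and `B`; in practice the software estimates these quantities, and none of the variable names come from it.

```matlab
% Toy scores and labels, invented for illustration (separable by score).
s = [-2.0; -1.2; -0.4; 0.6; 1.5; 2.3];
y = [-1; -1; -1; 1; 1; 1];
priorPos = 0.5;               % assumed prior probability of the positive class

% Step function for separable classes.
lo = max(s(y == -1));         % largest negative class score
hi = min(s(y == +1));         % smallest positive class score
stepP = (s > hi) + priorPos*(s >= lo & s <= hi);

% Sigmoid for inseparable classes, with hypothetical slope and intercept.
A = -2;
B = 0.1;
sigP = 1./(1 + exp(A*s + B));
```

With a negative slope `A`, the sigmoid increases with the score, so larger positive class scores map to larger posterior probabilities.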

### Prior Probability

The prior probability of a class is the believed relative frequency with which observations from that class occur in the population.

## Examples

Load the `ionosphere` data set.

```load ionosphere ```

Train an SVM classifier. It is good practice to specify the class order and standardize the data.

```SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'Standardize',true); ```

`SVMModel` is a `ClassificationSVM` classifier. The positive class is `'g'`.

Predict the training sample labels and scores. Display the results for the first 10 observations.

```
[label,score] = resubPredict(SVMModel);
table(Y(1:10),label(1:10),score(1:10,2),'VariableNames',...
    {'TrueLabel','PredictedLabel','Score'})
```
```
ans = 

    TrueLabel    PredictedLabel     Score 
    _________    ______________    _______

    'g'          'g'                1.4861
    'b'          'b'               -1.0004
    'g'          'g'                1.8685
    'b'          'b'               -2.6458
    'g'          'g'                1.2805
    'b'          'b'               -1.4617
    'g'          'g'                2.1672
    'b'          'b'               -5.7085
    'g'          'g'                2.4797
    'b'          'b'               -2.7811
```
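To obtain posterior probabilities instead of scores, the output argument description says to fit the score-to-posterior-probability transformation with `fitPosterior` first. The following is a minimal sketch along those lines; the variable names `ScoreSVMModel` and `postProbs` and the seed passed to `rng` are illustrative choices, and no output is shown because the values depend on the cross-validation partition.

```matlab
load ionosphere
SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'Standardize',true);

% Fix the random seed so the cross-validation partition is reproducible.
rng(1);

% Fit the score-to-posterior-probability transformation.
ScoreSVMModel = fitPosterior(SVMModel);

% The second output now holds posterior probabilities, one column per class.
[label,postProbs] = resubPredict(ScoreSVMModel);
table(Y(1:10),label(1:10),postProbs(1:10,2),'VariableNames',...
    {'TrueLabel','PredictedLabel','PosClassPosterior'})
```

The second column of `postProbs` corresponds to the positive class, `'g'`, because it is listed second in `'ClassNames'`.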

## Algorithms

• By default, the software computes optimal posterior probabilities using Platt's method [1]:

1. Performing 10-fold cross-validation

2. Fitting the sigmoid function parameters to the scores returned from the cross-validation

3. Estimating the posterior probabilities by entering the cross-validation scores into the fitted sigmoid function

• The software incorporates prior probabilities in the SVM objective function during training.

• For SVM, `predict` classifies observations into the class yielding the largest score (i.e., the largest posterior probability). The software accounts for misclassification costs by applying the average-cost correction before training the classifier. That is, given the class prior vector P, misclassification cost matrix C, and observation weight vector w, the software defines a new vector of observation weights (W) such that

`${W}_{j}={w}_{j}{P}_{j}\sum _{k=1}^{K}{C}_{jk}.$`
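As a small worked instance of the weight formula above, the sketch below reads $P_j$ and $C_{jk}$ as the prior and the cost-matrix row of the class that observation j belongs to. That reading is an interpretation of the indexing, and all numbers and variable names (`P`, `C`, `w`, `cls`) are invented for illustration.

```matlab
P = [0.3; 0.7];        % assumed class priors for classes 1 and 2
C = [0 1; 2 0];        % C(i,k): cost of classifying a class i observation as class k
w = [1; 1; 1; 1];      % original observation weights
cls = [1; 2; 2; 1];    % class index of each observation

% W_j = w_j * P_(class of j) * sum over k of C(class of j, k)
W = w .* P(cls) .* sum(C(cls,:),2);
```

Here observations from class 2 receive the larger weight, 1.4 versus 0.3, because that class has both the larger prior and the larger total misclassification cost.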

## References

[1] Platt, J. "Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods." In Advances in Large Margin Classifiers. MIT Press, 1999, pp. 61–74.