
# resubPredict

Class: ClassificationSVM

Predict support vector machine classifier resubstitution responses

## Syntax

• `label = resubPredict(SVMModel)`
• `[label,Score] = resubPredict(SVMModel)`

## Description

`label = resubPredict(SVMModel)` returns a vector of predicted class labels (`label`) for the trained support vector machine (SVM) classifier `SVMModel` using the predictor data `SVMModel.X`.

`[label,Score] = resubPredict(SVMModel)` additionally returns class likelihood measures, that is, either scores or posterior probabilities.

## Input Arguments

`SVMModel` — Full, trained SVM classifier, specified as a `ClassificationSVM` model trained using `fitcsvm`.

## Output Arguments

`label` — Predicted class labels, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors.

`label`:

• Is the same data type as the observed class labels (`SVMModel.Y`)

• Has length equal to the number of rows of `SVMModel.X`

For one-class learning, the elements of `label` are the one class represented in `SVMModel.Y`.

`Score` — Predicted class scores or posterior probabilities, returned as a numeric column vector or numeric matrix.

• For one-class learning, `Score` is a column vector with the same number of rows as `SVMModel.X`. The elements are the positive class scores for the corresponding observations. You cannot obtain posterior probabilities for one-class learning.

• For two-class learning, `Score` is a two-column matrix with the same number of rows as `SVMModel.X`.

• If you fit the optimal score-to-posterior probability transformation function using `fitPosterior` or `fitSVMPosterior`, then `Score` contains class posterior probabilities. That is, if the value of `SVMModel.ScoreTransform` is not `'none'`, then the elements of the first and second columns of `Score` are the negative class (`SVMModel.ClassNames{1}`) and positive class (`SVMModel.ClassNames{2}`) posterior probabilities for the corresponding observations, respectively.

• Otherwise, the elements of the first column are the negative class scores and the elements of the second column are the positive class scores for the corresponding observations.

If `SVMModel.KernelParameters.Function` is `'linear'`, then the software estimates the classification score for the observation x using

`$f(x) = (x/s)'\beta + b.$`

`SVMModel` stores β, b, and s in the properties `Beta`, `Bias`, and `KernelParameters.Scale`, respectively.

Data Types: `double` | `single`
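
For a linear-kernel model, the scores can be reproduced by hand from these stored properties. A minimal sketch (assuming the model is trained without `'Standardize'`, so `X` can be used directly; with standardization, first center and scale `X` using `SVMModel.Mu` and `SVMModel.Sigma`):

```matlab
load ionosphere
SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'});  % default kernel is linear

s = SVMModel.KernelParameters.Scale;             % kernel scale
f = (X/s)*SVMModel.Beta + SVMModel.Bias;         % f(x) = (x/s)'*beta + b

[~,score] = resubPredict(SVMModel);
max(abs(f - score(:,2)))                         % should be near zero
```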

## Definitions

### Classification Score

The SVM classification score for classifying observation x is the signed distance from x to the decision boundary, ranging from -∞ to +∞. A positive score for a class indicates that x is predicted to be in that class; a negative score indicates otherwise.

The score for predicting x into the positive class, which is also the numerical predicted response for x, $f(x)$, is the trained SVM classification function

`$f(x) = \sum_{j=1}^{n} \alpha_j y_j G(x_j,x) + b,$`

where $(\alpha_1,\ldots,\alpha_n,b)$ are the estimated SVM parameters, $G(x_j,x)$ is the dot product in the predictor space between x and the support vectors, and the sum includes the training set observations. The score for predicting x into the negative class is $-f(x)$.

If $G(x_j,x) = x_j'x$ (the linear kernel), then the score function reduces to

`$f(x) = (x/s)'\beta + b.$`

s is the kernel scale and β is the vector of fitted linear coefficients.

### Posterior Probability

The probability that an observation belongs in a particular class, given the data.

For SVM, the posterior probability is a function of the score, P(s), that observation j is in class k ∈ {-1,1}.

• For separable classes, the posterior probability is the step function

`$P(s_j) = \begin{cases} 0; & s_j < \max\limits_{y_k=-1} s_k \\ \pi; & \max\limits_{y_k=-1} s_k \le s_j \le \min\limits_{y_k=+1} s_k \\ 1; & s_j > \min\limits_{y_k=+1} s_k \end{cases}$`

where:

• sj is the score of observation j.

• +1 and –1 denote the positive and negative classes, respectively.

• π is the prior probability that an observation is in the positive class.

• For inseparable classes, the posterior probability is the sigmoid function

`$P(s_j) = \frac{1}{1 + \exp(A s_j + B)},$`

where A and B are the slope and intercept parameters, respectively.
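
As an illustration, this sigmoid can be applied to raw scores directly. The slope and intercept values below are taken from the fitted `ScoreTransform` shown in the example later on this page; in practice `fitPosterior` estimates them for you:

```matlab
A = -0.9482;  B = -0.1219;   % fitted sigmoid parameters (from the example on this page)
s = [-2; 0; 2];              % example classification scores
P = 1./(1 + exp(A*s + B));   % positive-class posterior probabilities
```

Because A is negative, larger scores map to higher posterior probabilities.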

### Prior Probability

The prior probability of a class is the believed relative frequency with which observations from that class occur in the population.

## Examples

### Label Training Sample Observations

Load the `ionosphere` data set.

```matlab
load ionosphere
```

Train an SVM classifier. It is good practice to specify the class order and standardize the data.

```matlab
SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'Standardize',true);
```

`SVMModel` is a `ClassificationSVM` classifier. The positive class is `'g'`.

Predict the training sample labels and scores. Display the results for the first 10 observations.

```matlab
[label,score] = resubPredict(SVMModel);
table(Y(1:10),label(1:10),score(1:10,2),'VariableNames',...
    {'TrueLabel','PredictedLabel','Score'})
```
```
ans = 

    TrueLabel    PredictedLabel     Score  
    _________    ______________    ________

    'g'          'g'                 1.4861
    'b'          'b'                -1.0004
    'g'          'g'                 1.8685
    'b'          'b'                -2.6458
    'g'          'g'                 1.2805
    'b'          'b'                -1.4617
    'g'          'g'                 2.1672
    'b'          'b'                -5.7085
    'g'          'g'                 2.4797
    'b'          'b'                -2.7811
```

### Estimate In-Sample Posterior Probabilities

Load the `ionosphere` data set.

```matlab
load ionosphere
```

Train an SVM classifier. It is good practice to specify the class order and standardize the data.

```matlab
SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'Standardize',true);
```

`SVMModel` is a `ClassificationSVM` classifier. The positive class is `'g'`.

Fit the optimal score-to-posterior-probability transformation function.

```matlab
rng(1); % For reproducibility
ScoreSVMModel = fitPosterior(SVMModel)
```
```
ScoreSVMModel = 

  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: '@(S)sigmoid(S,-9.481802e-01,-1.218745e-01)'
          NumObservations: 351
                    Alpha: [90×1 double]
                     Bias: -0.1343
         KernelParameters: [1×1 struct]
                       Mu: [1×34 double]
                    Sigma: [1×34 double]
           BoxConstraints: [351×1 double]
          ConvergenceInfo: [1×1 struct]
          IsSupportVector: [351×1 logical]
                   Solver: 'SMO'
```

Since the classes are inseparable, the score transformation function (`ScoreSVMModel.ScoreTransform`) is the sigmoid function.

Estimate scores and positive class posterior probabilities for the training data. Display the results for the first 10 observations.

```matlab
[label,scores] = resubPredict(SVMModel);
[~,postProbs] = resubPredict(ScoreSVMModel);
table(Y(1:10),label(1:10),scores(1:10,2),postProbs(1:10,2),'VariableNames',...
    {'TrueLabel','PredictedLabel','Score','PosteriorProbability'})
```
```
ans = 

    TrueLabel    PredictedLabel     Score      PosteriorProbability
    _________    ______________    ________    ____________________

    'g'          'g'                 1.4861     0.82215  
    'b'          'b'                -1.0004     0.30436  
    'g'          'g'                 1.8685     0.86916  
    'b'          'b'                -2.6458     0.084183 
    'g'          'g'                 1.2805     0.79184  
    'b'          'b'                -1.4617     0.22028  
    'g'          'g'                 2.1672     0.89814  
    'b'          'b'                -5.7085     0.0050122
    'g'          'g'                 2.4797     0.92223  
    'b'          'b'                -2.7811     0.074805 
```

## Algorithms

• By default, the software computes optimal posterior probabilities using Platt's method [1]:

1. Performing 10-fold cross-validation

2. Fitting the sigmoid function parameters to the scores returned from the cross validation

3. Estimating the posterior probabilities by entering the cross-validation scores into the fitted sigmoid function

• The software incorporates prior probabilities in the SVM objective function during training.

• For SVM, `predict` classifies observations into the class yielding the largest score (i.e., the largest posterior probability). The software accounts for misclassification costs by applying the average-cost correction before training the classifier. That is, given the class prior vector P, misclassification cost matrix C, and observation weight vector w, the software defines a new vector of observation weights (W) such that

`$W_j = w_j P_j \sum_{k=1}^{K} C_{jk}.$`
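
A small sketch of this correction with hypothetical priors, costs, and weights (the names `P`, `C`, `w`, and `cls` below are illustrative, not properties of the model):

```matlab
P   = [0.6 0.4];          % hypothetical class priors
C   = [0 1; 2 0];         % hypothetical misclassification cost matrix
w   = [1; 1; 1];          % hypothetical observation weights
cls = [1; 2; 1];          % class index of each observation

W = w .* P(cls)' .* sum(C(cls,:),2);   % W_j = w_j * P_j * sum_k C_jk
% W = [0.6; 0.8; 0.6]
```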

## References

[1] Platt, J. "Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods." In Advances in Large Margin Classifiers. MIT Press, 1999, pp. 61–74.