**Class:** ClassificationSVM

Classification margins for support vector machine classifiers by resubstitution

`m = resubMargin(SVMModel)`

`m = resubMargin(SVMModel)` returns the resubstitution classification margins (`m`) for the support vector machine (SVM) classifier `SVMModel` using the training data stored in `SVMModel.X` and the corresponding class labels stored in `SVMModel.Y`.
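As a minimal sketch of the call, the following assumes the `ionosphere` sample data set that ships with Statistics and Machine Learning Toolbox and default `fitcsvm` settings. The data set and the variable `numMisclassified` are illustrative choices, not part of the syntax.

```matlab
% Load sample data: X is a numeric predictor matrix, Y holds the class labels.
load ionosphere

% Train a binary SVM classifier with default settings (an assumption here).
SVMModel = fitcsvm(X,Y);

% Resubstitution margins: one value per training observation.
m = resubMargin(SVMModel);

% A negative margin means the observation is misclassified on the training data.
numMisclassified = sum(m < 0);
```

Because the margins are computed on the data used for training, they tend to be optimistic compared with margins on held-out data.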

The *edge* is the weighted
mean of the *classification margins*.

The weights are the prior class probabilities. If you supply weights, then the software normalizes them to sum to the prior probabilities in the respective classes. The software uses the renormalized weights to compute the weighted mean.

One way to choose among multiple classifiers, e.g., to perform feature selection, is to choose the classifier that yields the highest edge.
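The weighted mean can be sketched in a few lines. The margins and weights below are hypothetical values chosen for illustration, and the weights are assumed to be normalized already; this is not the software's internal computation.

```matlab
% Hypothetical margins for four observations (illustrative values only).
m = [1.2; -0.4; 0.8; 2.0];

% Hypothetical observation weights, assumed to be normalized to sum to 1.
w = [0.1; 0.2; 0.3; 0.4];

% Edge as the weighted mean of the margins.
edgeValue = sum(w .* m) / sum(w);   % 1.08 for these values
```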

The *classification margins* for
binary classification are, for each observation, the difference between
the classification score for the true class and the classification
score for the false class.

The software defines the classification margin for binary classification as

$$m=2yf\left(x\right).$$

*x* is
an observation. If the true label of *x* is the positive
class, then *y* is 1, and –1 otherwise. *f*(*x*)
is the positive-class classification score for the observation *x*.
The literature commonly defines the margin as *m* = *y**f*(*x*).
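To see why the difference of scores equals 2*y*f*(*x*), consider the following sketch with hypothetical scores and labels. It relies on the negative-class score being the negative of the positive-class score.

```matlab
% Hypothetical positive-class scores f(x) for three observations.
f = [1.5; -0.3; -2.0];

% True labels coded as +1 (positive class) or -1 (negative class).
y = [1; 1; -1];

% Score of the true class and of the false class for each observation.
trueScore  = y .* f;
falseScore = -y .* f;

% Margin as a difference of scores, and via the closed form m = 2*y*f(x).
mDiff   = trueScore - falseScore;
mClosed = 2 * y .* f;               % [3; -0.6; 4]
```

The second observation has a negative margin because its score favors the negative class while its true label is positive.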

If the margins are on the same scale, then they serve as a classification confidence measure, i.e., among multiple classifiers, those that yield larger margins are better.
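One way to compare classifiers by their margins is sketched below. It assumes the `ionosphere` sample data set and a hypothetical predictor subset (the first half of the columns); both choices are for illustration only.

```matlab
load ionosphere

% Train one model on all predictors and one on a hypothetical subset
% (the first half of the columns, chosen only for illustration).
p = size(X,2);
FullModel = fitcsvm(X,Y);
PartModel = fitcsvm(X(:,1:floor(p/2)),Y);

% Resubstitution margins for each model.
fullMargins = resubMargin(FullModel);
partMargins = resubMargin(PartModel);

% Fraction of training observations for which the full model has the larger margin.
fracFullLarger = mean(fullMargins > partMargins);
```

A fraction well above 0.5 would suggest that the full model separates the training data more confidently. The comparison is meaningful only if both sets of margins are on the same scale.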

The SVM *classification score* for
classifying observation *x* is the signed distance
from *x* to the decision boundary, ranging from -∞
to +∞. A positive score for a class indicates that *x* is
predicted to be in that class; a negative score indicates otherwise.

The score for predicting *x* into the positive
class, which is also the numerical predicted response for *x*, $$f(x)$$, is the trained SVM classification
function

$$f(x)={\displaystyle \sum _{j=1}^{n}{\alpha}_{j}}{y}_{j}G({x}_{j},x)+b,$$

where $$({\alpha}_{1},\mathrm{...},{\alpha}_{n},b)$$ are
the estimated SVM parameters, $$G({x}_{j},x)$$ is
the dot product in the predictor space between *x* and
the support vectors, and the sum includes the training set observations.
The score for predicting *x* into the negative class
is –*f*(*x*).
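The score function can be evaluated by hand as in the following sketch. All values are hypothetical, and only support vectors are listed because training observations that are not support vectors have a zero coefficient and drop out of the sum.

```matlab
% Hypothetical support vectors (rows), their labels, and fitted parameters.
SV    = [1 2; -1 0; 0 -1];
ySV   = [1; -1; -1];
alpha = [0.5; 0.3; 0.2];
b     = 0.1;

% New observation to score.
x = [2 1];

% Values G(x_j, x) computed as plain dot products.
G = SV * x';                         % [4; -2; -1]

% Positive-class score f(x) and negative-class score -f(x).
f    = sum(alpha .* ySV .* G) + b;   % 2.9
fNeg = -f;
```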

If $$G({x}_{j},x)={x}_{j}\prime x$$ (the linear kernel), then the score function reduces to

$$f\left(x\right)=\left(x/s\right)\prime \beta +b.$$

*s* is
the kernel scale and *β* is the vector of fitted
linear coefficients.
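The following sketch checks numerically that the linear form agrees with the kernel sum. It rests on two assumptions that the text above does not state: that the kernel is applied to predictors divided by *s*, and that *β* is the sum of the scaled support vectors weighted by their coefficients and labels. All values are hypothetical.

```matlab
% Hypothetical support vectors (rows), labels, and fitted parameters.
SV    = [1 2; -1 0; 0 -1];
ySV   = [1; -1; -1];
alpha = [0.5; 0.3; 0.2];
b     = 0.1;
s     = 2;          % kernel scale (illustrative value)
x     = [2 1];      % observation to score

% Assumption: the linear kernel acts on predictors divided by s.
G = (SV / s) * (x / s)';
fKernel = sum(alpha .* ySV .* G) + b;

% Assumption: beta collects the scaled support vectors weighted by alpha_j*y_j.
beta = (SV / s)' * (alpha .* ySV);

% Linear form f(x) = (x/s)'*beta + b.
fLinear = (x / s) * beta + b;
```

Under these assumptions, `fKernel` and `fLinear` agree up to floating-point rounding.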

For binary classification, the software defines the margin for
observation *j*, $${m}_{j}$$, as

$${m}_{j}=2{y}_{j}f({x}_{j}),$$

where $${y}_{j}\in \{-1,1\}$$, and $$f({x}_{j})$$ is the positive-class classification score for observation *j*.

[1] Cristianini, N., and J. Shawe-Taylor. *An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods*. Cambridge, UK: Cambridge University Press, 2000.

`ClassificationSVM` | `CompactClassificationSVM` | `fitcsvm` | `margin` | `resubEdge` | `resubLoss`
