# ClassificationSVM class

Superclasses: `CompactClassificationSVM`

Support vector machine for binary classification

## Description

`ClassificationSVM` is a support vector machine classifier for one- or two-class learning. To train a `ClassificationSVM` classifier, use `fitcsvm`.

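Trained `ClassificationSVM` classifiers store the training data, parameter values, prior probabilities, support vectors, and algorithmic implementation information. You can use these classifiers to perform tasks such as estimating resubstitution predictions (see `resubPredict`) and predicting labels for new data (see `predict`).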

## Construction

`Mdl = fitcsvm(Tbl,ResponseVarName)` returns an SVM classifier (`Mdl`) trained using the sample data contained in the table `Tbl`. `ResponseVarName` is the name of the variable in `Tbl` that contains the class labels for one- or two-class classification. For details, see `fitcsvm`.

`Mdl = fitcsvm(Tbl,formula)` returns an SVM classifier trained using the predictor data and class labels in the table `Tbl`. `formula` is an explanatory model of the response and a subset of predictor variables in `Tbl` used for training. For details, see `fitcsvm`.

`Mdl = fitcsvm(Tbl,Y)` returns an SVM classifier trained using the predictor variables in table `Tbl` and class labels in vector `Y`. For details, see `fitcsvm`.

`Mdl = fitcsvm(X,Y)` returns an SVM classifier trained using the predictors in the matrix `X` and class labels in the vector `Y` for one- or two-class classification. For details, see `fitcsvm`.

`Mdl = fitcsvm(___,Name,Value)` returns a trained SVM classifier with additional options specified by one or more `Name,Value` pair arguments, using any of the previous syntaxes. For example, you can specify the type of cross-validation, the cost for misclassification, or the type of score transformation function. For name-value pair argument details, see `fitcsvm`.

If you set one of the following five options, then `Mdl` is a `ClassificationPartitionedModel` model: `'CrossVal'`, `'CVPartition'`, `'Holdout'`, `'KFold'`, or `'Leaveout'`. Otherwise, `Mdl` is a `ClassificationSVM` classifier.

### Input Arguments

`Tbl`

Sample data used to train the model, specified as a table. Each row of `Tbl` corresponds to one observation, and each column corresponds to one predictor variable. Optionally, `Tbl` can contain one additional column for the response variable. Multi-column variables and cell arrays other than cell arrays of character vectors are not allowed.

If `Tbl` contains the response variable, and you want to use all remaining variables in `Tbl` as predictors, then specify the response variable using `ResponseVarName`.

If `Tbl` contains the response variable, and you want to use only a subset of the remaining variables in `Tbl` as predictors, then specify a formula using `formula`.

If `Tbl` does not contain the response variable, then specify a response variable using `Y`. The length of response variable and the number of rows of `Tbl` must be equal.

Data Types: `table`

`ResponseVarName`

Response variable name, specified as the name of a variable in `Tbl`.

You must specify `ResponseVarName` as a character vector. For example, if the response variable `Y` is stored as `Tbl.Y`, then specify it as `'Y'`. Otherwise, the software treats all columns of `Tbl`, including `Y`, as predictors when training the model.

The response variable must be a categorical or character array, logical or numeric vector, or cell array of character vectors. If `Y` is a character array, then each element must correspond to one row of the array.

It is good practice to specify the order of the classes using the `ClassNames` name-value pair argument.

Data Types: `char`

`formula`

Explanatory model of the response and a subset of the predictor variables, specified as a character vector in the form of `'Y~X1+X2+X3'`. In this form, `Y` represents the response variable, and `X1`, `X2`, and `X3` represent the predictor variables. The variables must be variable names in `Tbl` (`Tbl.Properties.VariableNames`).

To specify a subset of variables in `Tbl` as predictors for training the model, use a formula. If you specify a formula, then the software does not use any variables in `Tbl` that do not appear in `formula`.

Data Types: `char`
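The table-based syntaxes described above can be compared side by side. The following is a minimal sketch, assuming the Fisher iris data set used in the Examples section; the variable names `SL`, `SW`, `PL`, `PW`, and `Species` are invented for illustration and are not part of the data set.

```matlab
load fisheriris
inds = ~strcmp(species,'setosa');             % keep two classes only
Tbl = array2table(meas(inds,:), ...
    'VariableNames',{'SL','SW','PL','PW'});   % hypothetical predictor names
Tbl.Species = species(inds);                  % response stored in the table

% ResponseVarName syntax: all remaining variables in Tbl are predictors
MdlAll = fitcsvm(Tbl,'Species');

% formula syntax: only PL and PW are predictors; SL and SW are not used
MdlSub = fitcsvm(Tbl,'Species~PL+PW');

% Tbl,Y syntax: the table holds predictors only, labels are passed separately
MdlY = fitcsvm(Tbl(:,{'PL','PW'}),Tbl.Species);

numAll = numel(MdlAll.PredictorNames)
numSub = numel(MdlSub.PredictorNames)
```

Comparing `MdlAll.PredictorNames` with `MdlSub.PredictorNames` is a quick way to confirm which table variables a given syntax actually used.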

`Y`

Class labels to which the SVM model is trained, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors.

• `Y` must contain at most two distinct classes. For multiclass learning, see `fitcecoc`.

• If `Y` is a character array, then each element must correspond to one row of the array.

• The length of `Y` and the number of rows of `Tbl` or `X` must be equal.

• It is good practice to specify the class order using the `ClassNames` name-value pair argument.

Data Types: `char` | `cell` | `categorical` | `logical` | `single` | `double`

`X`

Predictor data to which the SVM classifier is trained, specified as a matrix of numeric values.

Each row of `X` corresponds to one observation (also known as an instance or example), and each column corresponds to one predictor.

The length of `Y` and the number of rows of `X` must be equal.

To specify the names of the predictors in the order of their appearance in `X`, use the `PredictorNames` name-value pair argument.

Data Types: `double` | `single`

### Note

The software treats `NaN`, empty character vector (`''`), and `<undefined>` elements as missing values. If a row of `X` or an element of `Y` contains at least one `NaN`, then the software removes those rows and elements from both arguments. Such deletion decreases the effective training or cross-validation sample size.
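The effect of missing values on the training sample size can be checked directly. This is a small sketch, assuming the Fisher iris data; the injected `NaN` is artificial and only serves to illustrate the row removal described in the note.

```matlab
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
y = species(inds);

X(1,1) = NaN;            % inject one missing predictor value (illustration only)

Mdl = fitcsvm(X,y);

% Compare the number of observations supplied with the number actually used
fprintf('%d observations supplied, %d used for training\n', ...
    numel(y), Mdl.NumObservations);
```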

## Properties

`Alpha`

s-by-1 numeric vector of trained classifier coefficients from the dual problem, that is, the estimated Lagrange multipliers. s is the number of support vectors in the trained classifier, that is, `sum(Mdl.IsSupportVector)`.

If you specify removing duplicates using `RemoveDuplicates`, then for a given set of duplicate observations that are support vectors, `Alpha` contains one coefficient corresponding to the entire set. That is, MATLAB® attributes a nonzero coefficient to one observation from the set of duplicates and a coefficient of `0` to all other duplicate observations in the set.

`Beta`

Numeric vector of linear predictor coefficients. `Beta` has length equal to the number of predictors used to train the model.

If your predictor data contains categorical variables, then the software uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable. `Beta` stores one value for each predictor variable, including the dummy variables. For example, if there are three predictors, one of which is a categorical variable with three levels, then `Beta` is a numeric vector containing five values.

If `KernelParameters.Function` is `'linear'`, then the classification score for the observation x is consistent with

$$f(x) = (x/s)'\beta + b.$$

`Mdl` stores $\beta$, $b$, and $s$ in the properties `Beta`, `Bias`, and `KernelParameters.Scale`, respectively.

If `KernelParameters.Function` is not `'linear'`, then `Beta` is empty (`[]`).
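For a linear kernel, the score formula can be reproduced from the stored properties. The sketch below assumes the Fisher iris data and a model trained without standardization (with `'Standardize',true`, the predictors would have to be standardized with `Mu` and `Sigma` first).

```matlab
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
y = species(inds);

Mdl = fitcsvm(X,y,'KernelFunction','linear');   % no standardization

s = Mdl.KernelParameters.Scale;
f = (X/s)*Mdl.Beta + Mdl.Bias;    % f(x) = (x/s)'*beta + b, one row per observation

[~,score] = predict(Mdl,X);       % second column holds the positive-class score
maxAbsDiff = max(abs(f - score(:,2)))
```

`maxAbsDiff` should be zero up to floating-point rounding if the manual computation matches the scores returned by `predict`.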

`Bias`

Scalar corresponding to the trained classifier bias term.

`BoxConstraints`

n-by-1 numeric vector of box constraints. n is the number of observations in the training data (see the `NumObservations` property).

If you specify removing duplicates using `RemoveDuplicates`, then for a given set of duplicate observations, MATLAB sums the box constraints, and then attributes the sum to one observation and box constraints of `0` to all other observations in the set.

`CacheInfo`

Structure array containing:

• The cache size (in MB) that the software reserves to train the SVM classifier (`Mdl.CacheInfo.Size`). To set the cache size to `CacheSize` MB, set the `fitcsvm` name-value pair argument to `'CacheSize',CacheSize`.

• The caching algorithm that the software uses during optimization (`Mdl.CacheInfo.Algorithm`). Currently, the only available caching algorithm is `Queue`. You cannot set the caching algorithm.

`CategoricalPredictors`

Indices of categorical predictors, stored as a vector of positive integers. `CategoricalPredictors` contains index values corresponding to the columns of the predictor data that contain categorical predictors. If none of the predictors are categorical, then this property is empty (`[]`).

`ClassNames`

List of elements in `Y` with duplicates removed. `ClassNames` has the same data type as the data in the argument `Y`, and therefore can be a categorical or character array, logical or numeric vector, or cell array of character vectors.

`ConvergenceInfo`

Structure array containing convergence information.

| Field | Description |
| --- | --- |
| `Converged` | Logical flag indicating whether the algorithm converged (`1` indicates convergence) |
| `ReasonForConvergence` | Character vector indicating the criterion the software uses to detect convergence |
| `Gap` | Scalar feasibility gap between the dual and primal objective functions |
| `GapTolerance` | Scalar feasibility gap tolerance. Set this tolerance to, e.g., `gt`, using the name-value pair argument `'GapTolerance',gt` of `fitcsvm`. |
| `DeltaGradient` | Scalar attained gradient difference between upper and lower violators |
| `DeltaGradientTolerance` | Scalar tolerance for gradient difference between upper and lower violators. Set this tolerance to, e.g., `dgt`, using the name-value pair argument `'DeltaGradientTolerance',dgt` of `fitcsvm`. |
| `LargestKKTViolation` | Maximal, scalar Karush-Kuhn-Tucker (KKT) violation value |
| `KKTTolerance` | Scalar tolerance for the largest KKT violation. Set this tolerance to, e.g., `kktt`, using the name-value pair argument `'KKTTolerance',kktt` of `fitcsvm`. |
| `History` | Structure array containing convergence information at set optimization iterations. Its fields are `NumIterations` (numeric vector of iteration indices for which the software records convergence information), `Gap` (numeric vector of `Gap` values at the iterations), `DeltaGradient` (numeric vector of `DeltaGradient` values at the iterations), `LargestKKTViolation` (numeric vector of `LargestKKTViolation` values at the iterations), `NumSupportVectors` (numeric vector indicating the number of support vectors at the iterations), and `Objective` (numeric vector of `Objective` values at the iterations). |
| `Objective` | Scalar value of the dual objective function |

`Cost`

Square matrix, where `Cost(i,j)` is the cost of classifying a point into class `j` if its true class is `i`.

During training, the software updates the prior probabilities by incorporating the penalties described in the cost matrix. Therefore:

• For two-class learning, `Cost` always has this form: `Cost(i,j) = 1` if `i ~= j`, and `Cost(i,j) = 0` if `i = j` (i.e., the rows correspond to the true class and the columns correspond to the predicted class). The order of the rows and columns of `Cost` corresponds to the order of the classes in `ClassNames`.

• For one-class learning, `Cost = 0`.

This property is read-only. For more details, see Algorithms.

`ExpandedPredictorNames`

Expanded predictor names, stored as a cell array of character vectors.

If the model uses encoding for categorical variables, then `ExpandedPredictorNames` includes the names that describe the expanded variables. Otherwise, `ExpandedPredictorNames` is the same as `PredictorNames`.

`Gradient`

Numeric vector of training data gradient values. `Gradient` has length equal to the number of observations (i.e., `size(Mdl.X,1)`).

`HyperparameterOptimizationResults`

Description of the cross-validation optimization of hyperparameters, stored as a `BayesianOptimization` object or a table of hyperparameters and associated values. Nonempty when the `OptimizeHyperparameters` name-value pair is nonempty at creation. Value depends on the setting of the `HyperparameterOptimizationOptions` name-value pair at creation:

• `'bayesopt'` (default) — Object of class `BayesianOptimization`

• `'gridsearch'` or `'randomsearch'` — Table of hyperparameters used, observed objective function values (cross-validation loss), and rank of observations from lowest (best) to highest (worst)

`IsSupportVector`

n-by-1 logical vector indicating whether a corresponding row in the predictor data matrix is a support vector. n is the number of observations in the training data (see `NumObservations`).

If you specify removing duplicates using `RemoveDuplicates`, then for a given set of duplicate observations that are support vectors, `IsSupportVector` flags only one as a support vector.

`KernelParameters`

Structure array containing the kernel name and parameter values.

To display the values of `KernelParameters`, use dot notation. For example, `Mdl.KernelParameters.Scale` displays the scale parameter value.

The software accepts `KernelParameters` as inputs, and does not modify them. Alter `KernelParameters` by setting the appropriate name-value pair arguments when you train the SVM classifier using `fitcsvm`.

`ModelParameters`

Object containing parameter values, e.g., the name-value pair argument values, used to train the SVM classifier. `ModelParameters` does not contain estimated parameters.

Access fields of `ModelParameters` using dot notation. For example, access the initial values for estimating `Alpha` using `Mdl.ModelParameters.Alpha`.

`Mu`

Numeric vector of predictor means.

If you specify `'Standardize',1` or `'Standardize',true` when you train an SVM classifier using `fitcsvm`, then `Mu` has length equal to the number of predictors.

If your predictor data contains categorical variables, then the software uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable. `Mu` stores one value for each predictor variable, including the dummy variables. However, the software does not standardize the columns that contain categorical variables.

If `'Standardize'` is `false` or `0`, then `Mu` is an empty vector (`[]`).

`NumIterations`

Positive integer indicating the number of iterations required by the optimization routine to attain convergence.

To set a limit on the number of iterations to, e.g., `k`, specify the name-value pair argument `'IterationLimit',k` of `fitcsvm`.

`Nu`

Positive scalar representing the ν parameter for one-class learning.

`NumObservations`

Numeric scalar representing the number of observations in the training data. If the input arguments `X` or `Y` contain missing values, then `NumObservations` is less than the length of `Y`.

`OutlierFraction`

Scalar indicating the expected proportion of outliers in the training data.

`PredictorNames`

Cell array of character vectors containing the predictor names, in the order that they appear in the training data.

`Prior`

Numeric vector of prior probabilities for each class. The order of the elements of `Prior` corresponds to the elements of `Mdl.ClassNames`.

For two-class learning, if you specify a cost matrix, then the software updates the prior probabilities by incorporating the penalties described in the cost matrix.

This property is read-only. For more details, see Algorithms.

`ResponseName`

Character vector describing the response variable `Y`.

`ScoreTransform`

Character vector representing a built-in transformation function, or a function handle for transforming predicted classification scores.

To change the score transformation function to, e.g., `function`, use dot notation.

• For a built-in function, enter a character vector.

`Mdl.ScoreTransform = 'function';`

This table contains the available, built-in functions.

| Value | Description |
| --- | --- |
| `'doublelogit'` | $1/(1 + e^{-2x})$ |
| `'invlogit'` | $\log(x/(1 - x))$ |
| `'ismax'` | Set the score for the class with the largest score to `1`, and set the scores for all other classes to `0`. |
| `'logit'` | $1/(1 + e^{-x})$ |
| `'none'` or `'identity'` | $x$ (no transformation) |
| `'sign'` | $-1$ for $x < 0$; $0$ for $x = 0$; $1$ for $x > 0$ |
| `'symmetric'` | $2x - 1$ |
| `'symmetricismax'` | Set the score for the class with the largest score to `1`, and set the scores for all other classes to `-1`. |
| `'symmetriclogit'` | $2/(1 + e^{-x}) - 1$ |

• For a MATLAB function, or a function that you define, enter its function handle.

`Mdl.ScoreTransform = @function;`

`function` should accept a matrix (the original scores) and return a matrix of the same size (the transformed scores).
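Both ways of setting the score transformation can be compared on the same model. This is a sketch, assuming the Fisher iris data and assuming that the built-in `'logit'` transformation computes $1/(1 + e^{-x})$; the anonymous function is written here only for illustration.

```matlab
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
y = species(inds);

Mdl = fitcsvm(X,y);

% Built-in transformation, specified as a character vector
MdlBuiltIn = Mdl;
MdlBuiltIn.ScoreTransform = 'logit';
[~,s1] = predict(MdlBuiltIn,X);

% User-defined transformation, specified as a function handle.
% It accepts the score matrix and returns a matrix of the same size.
MdlHandle = Mdl;
MdlHandle.ScoreTransform = @(x) 1./(1 + exp(-x));
[~,s2] = predict(MdlHandle,X);

maxAbsDiff = max(abs(s1(:) - s2(:)))
```

Because `ClassificationSVM` is a value class, assigning to `MdlBuiltIn` or `MdlHandle` leaves the original `Mdl` unchanged.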

`ShrinkagePeriod`

Nonnegative integer indicating the shrinkage period, i.e., number of iterations between reductions of the active set.

To set the shrinkage period to, e.g., `sp`, specify the name-value pair argument `'ShrinkagePeriod',sp` of `fitcsvm`.

`Sigma`

Numeric vector of predictor standard deviations.

If you specify `'Standardize',1` or `'Standardize',true` when you train the SVM classifier, then `Sigma` has length equal to the number of predictors.

If your predictor data contains categorical variables, then the software uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable. `Sigma` stores one value for each predictor variable, including the dummy variables. However, the software does not standardize the columns that contain categorical variables.

If `'Standardize'` is `false` or `0`, then `Sigma` is an empty vector (`[]`).

`Solver`

Character vector indicating the solving routine that the software used to train the SVM classifier.

To set the solver to, e.g., `solver`, specify the name-value pair argument `'Solver',solver` of `fitcsvm`.

`SupportVectors`

s-by-p numeric matrix containing rows of `X` that MATLAB considers the support vectors. s is the number of support vectors in the trained classifier, that is, `sum(Mdl.IsSupportVector)`, and p is the number of predictor variables in the predictor data.

If you specify `'Standardize',1` or `'Standardize',true`, then `SupportVectors` are the standardized rows of `X`.

If you specify removing duplicates using `RemoveDuplicates`, then for a given set of duplicate observations that are support vectors, `SupportVectors` contains one unique support vector.

`SupportVectorLabels`

s-by-1 numeric vector of support vector class labels. s is the number of support vectors in the trained classifier, that is, `sum(Mdl.IsSupportVector)`.

A value of `+1` indicates that the corresponding support vector is in the positive class (`Mdl.ClassNames{2}`). A value of `-1` indicates that the corresponding support vector is in the negative class (`Mdl.ClassNames{1}`).

If you specify removing duplicates using `RemoveDuplicates`, then for a given set of duplicate observations that are support vectors, `SupportVectorLabels` contains one unique support vector label.
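The relationships among `IsSupportVector`, `Alpha`, `SupportVectors`, and `SupportVectorLabels` can be verified on a trained model. The sketch below assumes the Fisher iris data and the default settings (no standardization, duplicates not removed), so that the support vectors are plain rows of `X`.

```matlab
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
y = species(inds);

Mdl = fitcsvm(X,y);                 % defaults: no standardization, duplicates kept
s = sum(Mdl.IsSupportVector);       % number of support vectors

% Alpha, SupportVectors, and SupportVectorLabels all have s rows
sizesAgree = isequal(size(Mdl.Alpha,1), size(Mdl.SupportVectors,1), ...
                     size(Mdl.SupportVectorLabels,1), s)

% Without standardization, the support vectors are the flagged rows of X
rowsAgree = isequal(Mdl.SupportVectors, Mdl.X(Mdl.IsSupportVector,:))

% +1 marks the positive class (second class name), -1 the negative class
isPos = strcmp(Mdl.Y(Mdl.IsSupportVector), Mdl.ClassNames{2});
labelsAgree = isequal(Mdl.SupportVectorLabels, 2*isPos - 1)
```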

`W`

Numeric vector of observation weights that the software used to train the SVM classifier.

The length of `W` is `Mdl.NumObservations`.

`fitcsvm` normalizes `Weights` so that the elements of `W` within a particular class sum up to the prior probability of that class.
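The normalization of `W` can be checked by summing the weights within each class and comparing the result with `Prior`. This is a sketch, assuming the Fisher iris data with default weights and priors.

```matlab
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
y = species(inds);

Mdl = fitcsvm(X,y);

K = numel(Mdl.ClassNames);
classSums = zeros(1,K);
for k = 1:K
    inClass = strcmp(Mdl.Y, Mdl.ClassNames{k});   % observations in class k
    classSums(k) = sum(Mdl.W(inClass));           % total weight of class k
end

[classSums; Mdl.Prior(:)']    % first row: weight sums, second row: priors
```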

`X`

Numeric matrix of unstandardized predictor values that the software used to train the SVM classifier.

Each row of `X` corresponds to one observation, and each column corresponds to one variable.

The software excludes predictor data rows removed due to `NaN`s from `X`.

`Y`

Categorical or character array, logical or numeric vector, or cell array of character vectors representing the observed class labels used to train the SVM classifier. `Y` is the same data type as the input argument `Y` of `fitcsvm`.

Each row of `Y` represents the observed classification of the corresponding row of `X`.

The software excludes elements removed due to `NaN`s from `Y`.

## Methods

| Method | Description |
| --- | --- |
| `compact` | Compact support vector machine classifier |
| `crossval` | Cross-validated support vector machine classifier |
| `fitPosterior` | Fit posterior probabilities |
| `resubEdge` | Classification edge for support vector machine classifiers by resubstitution |
| `resubLoss` | Classification loss for support vector machine classifiers by resubstitution |
| `resubMargin` | Classification margins for support vector machine classifiers by resubstitution |
| `resubPredict` | Predict support vector machine classifier resubstitution responses |
| `resume` | Resume training support vector machine classifier |

### Inherited Methods

| Method | Description |
| --- | --- |
| `compareHoldout` | Compare accuracies of two classification models using new data |
| `discardSupportVectors` | Discard support vectors for linear support vector machine models |
| `edge` | Classification edge for support vector machine classifiers |
| `fitPosterior` | Fit posterior probabilities |
| `loss` | Classification error for support vector machine classifiers |
| `margin` | Classification margins for support vector machine classifiers |
| `predict` | Predict labels using support vector machine classification model |

## Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects (MATLAB).

## Examples

Load Fisher's iris data set. Remove the sepal lengths and widths, and all observed setosa irises.

```
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
y = species(inds);
```

Train an SVM classifier using the processed data set.

```
SVMModel = fitcsvm(X,y)
```
```
SVMModel = 
  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'versicolor' 'virginica'}
           ScoreTransform: 'none'
          NumObservations: 100
                    Alpha: [24x1 double]
                     Bias: -14.4149
         KernelParameters: [1x1 struct]
           BoxConstraints: [100x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [100x1 logical]
                   Solver: 'SMO'
```

The Command Window shows that `SVMModel` is a trained `ClassificationSVM` classifier and displays a list of its properties. To display an individual property, for example, to determine the class order, use dot notation.

```
classOrder = SVMModel.ClassNames
```
```
classOrder = 2x1 cell array
    {'versicolor'}
    {'virginica' }
```

The first class (`'versicolor'`) is the negative class, and the second (`'virginica'`) is the positive class. You can change the class order during training by using the `'ClassNames'` name-value pair argument.

Plot a scatter diagram of the data and circle the support vectors.

```
sv = SVMModel.SupportVectors;
figure
gscatter(X(:,1),X(:,2),y)
hold on
plot(sv(:,1),sv(:,2),'ko','MarkerSize',10)
legend('versicolor','virginica','Support Vector')
hold off
```

The support vectors are observations that occur on or beyond their estimated class boundaries.

You can adjust the boundaries (and therefore the number of support vectors) by setting a box constraint during training using the `'BoxConstraint'` name-value pair argument.
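As a rough illustration of this dependence, the following sketch retrains the classifier for a few box constraint values and counts the support vectors each time. The three values in `boxes` are arbitrary choices for demonstration, not recommended settings.

```matlab
load fisheriris
inds = ~strcmp(species,'setosa');
X = meas(inds,3:4);
y = species(inds);

boxes = [0.1 1 10];                 % arbitrary box constraints to compare
numSV = zeros(size(boxes));
for i = 1:numel(boxes)
    Mdl = fitcsvm(X,y,'BoxConstraint',boxes(i));
    numSV(i) = sum(Mdl.IsSupportVector);   % support vectors for this setting
end

table(boxes(:),numSV(:),'VariableNames',{'BoxConstraint','NumSupportVectors'})
```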

Load the `ionosphere` data set.

```
load ionosphere
```

Train and cross-validate an SVM classifier. It is good practice to standardize the predictors and specify the order of the classes.

```
rng(1); % For reproducibility
CVSVMModel = fitcsvm(X,Y,'Standardize',true,...
    'ClassNames',{'b','g'},'CrossVal','on')
```
```
CVSVMModel = 
  classreg.learning.partition.ClassificationPartitionedModel
    CrossValidatedModel: 'SVM'
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 10
              Partition: [1x1 cvpartition]
             ClassNames: {'b' 'g'}
         ScoreTransform: 'none'
```

`CVSVMModel` is not a `ClassificationSVM` classifier, but a `ClassificationPartitionedModel` cross-validated, SVM classifier. By default, the software implements 10-fold cross-validation.

Alternatively, you can cross-validate a trained `ClassificationSVM` classifier by passing it to `crossval`.
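A minimal sketch of that alternative route follows, assuming the same `ionosphere` data; the names `Mdl` and `CVMdl` are chosen here so that they do not overwrite the variables used in the rest of this example.

```matlab
load ionosphere
rng(1); % For reproducibility

% Train a full ClassificationSVM classifier first ...
Mdl = fitcsvm(X,Y,'Standardize',true,'ClassNames',{'b','g'});

% ... then cross-validate it (10-fold by default)
CVMdl = crossval(Mdl);

class(CVMdl)
CVMdl.KFold
```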

Inspect one of the trained folds using dot notation.

```
CVSVMModel.Trained{1}
```
```
ans = 
  classreg.learning.classif.CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b' 'g'}
           ScoreTransform: 'none'
                    Alpha: [78x1 double]
                     Bias: -0.2209
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           SupportVectors: [78x34 double]
      SupportVectorLabels: [78x1 double]
```

Each fold is a `CompactClassificationSVM` classifier trained on 90% of the data.

Estimate the generalization error.

```
genError = kfoldLoss(CVSVMModel)
```
```
genError = 0.1168
```

On average, the generalization error is approximately 12%.

## Algorithms

• For the mathematical formulation of the SVM binary classification algorithm, see Support Vector Machines for Binary Classification and Understanding Support Vector Machines.

• `NaN`, `<undefined>`, and empty character vector (`''`) values indicate missing values. `fitcsvm` removes entire rows of data corresponding to a missing response. When computing total weights (see the next bullets), `fitcsvm` ignores any weight corresponding to an observation with at least one missing predictor. This action can lead to unbalanced prior probabilities in balanced-class problems. Consequently, observation box constraints might not equal `BoxConstraint`.

• `fitcsvm` removes observations that have zero weight or prior probability.

• For two-class learning, if you specify the cost matrix $\mathcal{C}$ (see `Cost`), then the software updates the class prior probabilities $p$ (see `Prior`) to $p_c$ by incorporating the penalties described in $\mathcal{C}$.

Specifically, `fitcsvm`:

1. Computes $p_c^{\ast} = p'\mathcal{C}.$

2. Normalizes $p_c^{\ast}$ so that the updated prior probabilities sum to 1:

$$p_c = \frac{1}{\sum_{j=1}^{K} p_{c,j}^{\ast}}\, p_c^{\ast}.$$

$K$ is the number of classes.

3. Resets the cost matrix to the default:

$$\mathcal{C} = \left[\begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array}\right].$$

4. Removes observations from the training data corresponding to classes with zero prior probability.

• For two-class learning, `fitcsvm` normalizes all observation weights (see `Weights`) to sum to 1. It then renormalizes the normalized weights so that they sum to the updated prior probability of the class to which the observation belongs. That is, the total weight for observation $j$ in class $k$ is

$$w_j^{\ast} = \frac{w_j}{\sum_{\forall j \in \text{Class } k} w_j}\, p_{c,k}.$$

$w_j$ is the normalized weight for observation $j$; $p_{c,k}$ is the updated prior probability of class $k$ (see previous bullet).

• For two-class learning, `fitcsvm` assigns a box constraint to each observation in the training data. The formula for the box constraint of observation j is

$$C_j = n C_0 w_j^{\ast}.$$

$n$ is the training sample size, $C_0$ is the initial box constraint (see `BoxConstraint`), and $w_j^{\ast}$ is the total weight of observation $j$ (see previous bullet).
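The chain from prior update to observation weights to box constraints can be traced numerically. The sketch below applies the formulas exactly as written in the preceding bullets to a toy problem; the class assignment, priors, cost matrix, and `C0` are all hypothetical values chosen to keep the arithmetic easy to follow, and the code is plain arithmetic rather than a call into `fitcsvm`.

```matlab
% Hypothetical toy problem: four observations, three in class 1, one in class 2
classIdx = [1 1 1 2]';     % class index of each observation
w  = ones(4,1)/4;          % observation weights, normalized to sum to 1
p  = [0.75; 0.25];         % class priors before the cost update
C  = [0 1; 3 0];           % cost matrix, rows = true class, columns = predicted
C0 = 1;                    % initial box constraint
n  = numel(w);             % training sample size

% Prior update: pcStar = p'*C, then normalize to sum to 1
pcStar = p'*C;             % [0.75 0.75]
pc = pcStar/sum(pcStar);   % [0.5 0.5]

% Total weights: within each class, rescale w to sum to the updated prior
wStar = zeros(n,1);
for k = 1:numel(pc)
    inClass = (classIdx == k);
    wStar(inClass) = w(inClass)/sum(w(inClass))*pc(k);
end

% Box constraint of each observation: Cj = n*C0*wStar
Cj = n*C0*wStar            % [2/3; 2/3; 2/3; 2]
```

In this toy case the single observation in class 2 receives a box constraint three times larger than each observation in class 1, which shows why the observation box constraints need not equal `BoxConstraint`.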

• If you set `'Standardize',true` and any of `'Cost'`, `'Prior'`, or `'Weights'`, then `fitcsvm` standardizes the predictors using their corresponding weighted means and weighted standard deviations. That is, `fitcsvm` standardizes predictor $j$ ($x_j$) using

$$x_j^{\ast} = \frac{x_j - \mu_j^{\ast}}{\sigma_j^{\ast}}.$$

• $\mu_j^{\ast} = \frac{1}{\sum_k w_k^{\ast}} \sum_k w_k^{\ast} x_{jk}.$

• $x_{jk}$ is observation $k$ (row) of predictor $j$ (column).

• $\left(\sigma_j^{\ast}\right)^2 = \frac{v_1}{v_1^2 - v_2} \sum_k w_k^{\ast} \left(x_{jk} - \mu_j^{\ast}\right)^2.$

• $v_1 = \sum_k w_k^{\ast}.$

• $v_2 = \sum_k \left(w_k^{\ast}\right)^2.$
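The weighted formulas can be sanity-checked in the equal-weight case, where they should reduce to the ordinary sample mean and variance. This is a sketch with synthetic data for a single predictor; it evaluates the formulas above directly and does not call `fitcsvm`.

```matlab
rng(0);                        % For reproducibility
x = randn(10,1);               % one predictor, 10 observations (toy data)
wStar = ones(10,1)/10;         % equal total weights summing to 1

v1 = sum(wStar);
v2 = sum(wStar.^2);

mu     = sum(wStar.*x)/v1;                         % weighted mean
sigma2 = v1/(v1^2 - v2)*sum(wStar.*(x - mu).^2);   % weighted variance
xStd   = (x - mu)/sqrt(sigma2);                    % standardized predictor

% With equal weights, both differences are zero up to rounding
[abs(mu - mean(x)), abs(sigma2 - var(x))]
```

With unequal weights, the same lines give the weighted mean and standard deviation described above.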

• Let `p` be the proportion of outliers that you expect in the training data. If you set `'OutlierFraction',p`, then:

• For one-class learning, the software trains the bias term such that 100`p`% of the observations in the training data have negative scores.

• The software implements robust learning for two-class learning. In other words, the software attempts to remove 100`p`% of the observations when the optimization algorithm converges. The removed observations correspond to gradients that are large in magnitude.

• If your predictor data contains categorical variables, then the software generally uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable.

• The `PredictorNames` property stores one element for each of the original predictor variable names. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then `PredictorNames` is a 1-by-3 cell array of character vectors containing the original names of the predictor variables.

• The `ExpandedPredictorNames` property stores one element for each of the predictor variables, including the dummy variables. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then `ExpandedPredictorNames` is a 1-by-5 cell array of character vectors containing the names of the predictor variables and the new dummy variables.

• Similarly, the `Beta` property stores one beta coefficient for each predictor, including the dummy variables.

• The `SupportVectors` property stores the predictor values for the support vectors, including the dummy variables. For example, assume that there are m support vectors and three predictors, one of which is a categorical variable with three levels. Then `SupportVectors` is an m-by-5 matrix.

• The `X` property stores the training data as originally input. It does not include the dummy variables. When the input is a table, `X` contains only the columns used as predictors.

• For predictors specified in a table, if any of the variables contain ordered (ordinal) categories, the software uses ordinal encoding for these variables.

• For a variable having k ordered levels, the software creates k – 1 dummy variables. The jth dummy variable is -1 for levels up to j, and +1 for levels j + 1 through k.

• The names of the dummy variables stored in the `ExpandedPredictorNames` property indicate the first level with the value +1. The software stores k – 1 additional predictor names for the dummy variables, including the names of levels 2, 3, ..., k.
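The ordinal encoding rule can be written out for a small case. The sketch below builds the k – 1 dummy variables for a hypothetical variable with `k = 4` ordered levels, following the rule stated above; it illustrates the coding scheme only and is not taken from the `fitcsvm` implementation.

```matlab
k = 4;                          % number of ordered levels (example value)
lvl = (1:k)';                   % one row per level, in order

% D(i,j) is -1 when level i is at most j, and +1 when level i exceeds j
D = 2*(lvl > (1:k-1)) - 1
% D =
%     -1    -1    -1
%      1    -1    -1
%      1     1    -1
%      1     1     1
```

Row `i` of `D` is the encoding of level `i`, and column `j` is the `j`th dummy variable, whose first +1 entry occurs at level `j + 1`.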

• All solvers implement L1 soft-margin minimization.

• `fitcsvm` and `svmtrain` use, among other algorithms, SMO for optimization. The software implements SMO differently between the two functions, but numerical studies show that there is sensible agreement in the results.

• For one-class learning, the software estimates the Lagrange multipliers, $\alpha_1,\ldots,\alpha_n$, such that

$$\sum_{j=1}^{n} \alpha_j = n\nu.$$
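The one-class constraint can be inspected on a trained model by comparing the sum of `Alpha` with `n*Nu`. This is a sketch, assuming the Fisher iris measurements treated as a single class; the choice of predictors and of `'KernelScale','auto'` and `'Standardize',true` is arbitrary.

```matlab
load fisheriris
X = meas(:,1:2);
n = size(X,1);
rng(1); % For reproducibility

% One-class learning: all observations share the same label
Mdl = fitcsvm(X,ones(n,1),'KernelScale','auto','Standardize',true);

% Compare the sum of the estimated multipliers with n*nu
[sum(Mdl.Alpha), n*Mdl.Nu]
```

`Alpha` stores only the coefficients of the support vectors; the remaining observations have zero multipliers, so they do not change the sum.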
