A support vector machine (SVM) is a supervised learning algorithm that can be used for binary classification or regression. Support vector machines are popular in applications such as natural language processing, speech and image recognition, and computer vision.

A support vector machine constructs an optimal hyperplane as a decision surface such that the margin of separation between the two classes in the data is maximized. Support vectors are a small subset of the training observations: the ones that determine the optimal location of the decision surface.
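As a minimal sketch of the geometry, assume a linear decision surface written in the usual form \(w^{\mathsf{T}}x + b = 0\), scaled so that the closest observations of each class satisfy \(w^{\mathsf{T}}x + b = \pm 1\). Under that assumption the margin width is \(2/\lVert w \rVert\). The weights, bias, and points below are hypothetical values chosen for illustration, not the output of a fitted model.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the margin boundaries w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical toy problem: one observation per class, at (1, 1) and (-1, -1).
# For these two points the maximum-margin hyperplane is w = (0.5, 0.5), b = 0.
w, b = (0.5, 0.5), 0.0

score_pos = decision_value(w, b, (1.0, 1.0))    # lies on the +1 margin boundary
score_neg = decision_value(w, b, (-1.0, -1.0))  # lies on the -1 margin boundary
width = margin_width(w)                         # equals the distance between the two points
```

Here both points sit exactly on a margin boundary, so both are support vectors, and the margin width equals the distance between them.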

Support vector machines fall under a class of machine learning algorithms called kernel methods and are also referred to as kernel machines.

Training for a support vector machine has two phases:

- Transform predictors (input data) to a high-dimensional feature space. Specifying the kernel is sufficient for this step; the data are never explicitly transformed to the feature space. This process is commonly known as the kernel trick.
- Solve a quadratic optimization problem to fit an optimal hyperplane to classify the transformed features into two classes. The number of transformed features is determined by the number of support vectors.
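The kernel trick in the first phase can be illustrated with a small sketch. It assumes two-dimensional inputs and a polynomial kernel of order 2, for which an explicit feature map is easy to write down. The function names and sample points are hypothetical. The kernel evaluated in the input space gives the same number as an inner product in the six-dimensional feature space, which is why the transformation never has to be carried out.

```python
import math

def poly2_kernel(x, z):
    """Polynomial kernel of order 2, (x.z + 1)^2, computed in the input space."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** 2

def poly2_feature_map(x):
    """Explicit feature map for 2-D inputs whose inner product equals poly2_kernel."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)

# Route 1: evaluate the kernel directly on the raw inputs.
via_kernel = poly2_kernel(x, z)

# Route 2: transform both inputs explicitly, then take the inner product.
phi_x, phi_z = poly2_feature_map(x), poly2_feature_map(z)
via_features = sum(a * b for a, b in zip(phi_x, phi_z))
```

Both routes agree up to floating-point rounding. For higher polynomial orders, or for the Gaussian kernel, the explicit feature space becomes very large or infinite-dimensional, while the cost of evaluating the kernel stays the same.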

Only the support vectors chosen from the training data are required to construct the decision surface. Once the model is trained, the rest of the training data are no longer needed.
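To make this concrete, the sketch below evaluates a trained classifier using the standard kernel expansion \(f(x) = \sum_i \alpha_i y_i K(s_i, x) + b\), where the sum runs over the support vectors \(s_i\) only. The coefficients \(\alpha_i\), the bias \(b\), and the function names are assumptions introduced for illustration. The numbers describe a hypothetical two-point problem, not the output of any particular toolbox.

```python
def linear_kernel(x, z):
    """Linear kernel: plain inner product of the two inputs."""
    return sum(a * b for a, b in zip(x, z))

def svm_decision(x, support_vectors, labels, alphas, bias, kernel):
    """Score f(x) = sum_i alpha_i * y_i * K(s_i, x) + bias, using support vectors only."""
    total = sum(a * y * kernel(s, x)
                for s, y, a in zip(support_vectors, labels, alphas))
    return total + bias

def predict(x, support_vectors, labels, alphas, bias, kernel):
    """Class label (+1 or -1) given by the sign of the decision score."""
    score = svm_decision(x, support_vectors, labels, alphas, bias, kernel)
    return 1 if score >= 0 else -1

# Hypothetical trained model: two support vectors, one per class.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]   # assumed dual coefficients for this toy problem
bias = 0.0
model = (support_vectors, labels, alphas, bias, linear_kernel)

label_a = predict((2.0, 0.5), *model)    # score is 1.25, so class +1
label_b = predict((-3.0, 1.0), *model)   # score is -1.0, so class -1
```

No other training observations appear anywhere in the prediction step, so the stored model is only as large as the set of support vectors.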

Popular kernels used with SVMs include:

Type of SVM | Mercer Kernel | Description |
---|---|---|
Gaussian or Radial Basis Function (RBF) | \(K(x_1,x_2) = \exp\left(-\frac{\lVert x_1 - x_2\rVert^2}{2\sigma^2}\right)\) | One-class learning. \(\sigma\) is the width of the kernel. |
Linear | \(K(x_1,x_2) = x_1^{\mathsf{T}}x_2\) | Two-class learning. |
Polynomial | \(K(x_1,x_2) = \left( x_1^{\mathsf{T}}x_2 + 1 \right)^{\rho}\) | \(\rho\) is the order of the polynomial. |
Sigmoid | \(K(x_1,x_2) = \tanh\left( \beta_{0}x_1^{\mathsf{T}}x_2 + \beta_{1} \right)\) | It is a Mercer kernel for certain \(\beta_{0}\) and \(\beta_{1}\) values only. |
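The four kernels listed above can be written directly from their formulas. The following is a plain Python sketch, not the toolbox implementation. The function names and the default values for `sigma`, `rho`, `beta0`, and `beta1` are arbitrary assumptions for illustration. In particular, the sigmoid defaults are not claimed to satisfy Mercer's condition.

```python
import math

def dot(x, z):
    """Inner product x^T z of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel; sigma is the kernel width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def linear_kernel(x, z):
    """Linear kernel: the inner product itself."""
    return dot(x, z)

def polynomial_kernel(x, z, rho=3):
    """Polynomial kernel of order rho."""
    return (dot(x, z) + 1.0) ** rho

def sigmoid_kernel(x, z, beta0=1.0, beta1=-1.0):
    """Sigmoid kernel; a Mercer kernel only for certain beta0, beta1."""
    return math.tanh(beta0 * dot(x, z) + beta1)

# Two orthogonal unit vectors: inner product 0, squared distance 2.
x, z = (1.0, 0.0), (0.0, 1.0)
k_gauss = gaussian_kernel(x, z)     # exp(-2 / 2) = exp(-1)
k_lin = linear_kernel(x, z)         # 0
k_poly = polynomial_kernel(x, z)    # (0 + 1)^3 = 1
k_sig = sigmoid_kernel(x, z)        # tanh(-1)
```

Each function takes two observations and returns a single similarity value, which is all the optimization phase needs from the feature space.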

For more on how to fit support vector machine classifiers, see Statistics and Machine Learning Toolbox™ for use with MATLAB®.