Linear model for binary classification of high-dimensional data

`ClassificationLinear` is a trained linear model object for binary classification; the linear model is a support vector machine (SVM) or logistic regression model. `fitclinear` fits a `ClassificationLinear` model by minimizing the objective function using techniques that reduce computation time for high-dimensional data sets (e.g., stochastic gradient descent). The objective function is the sum of the classification loss and the regularization term.

Unlike other classification models, and for economical memory usage, `ClassificationLinear` model objects do not store the training data. However, they do store, for example, the estimated linear model coefficients, prior-class probabilities, and the regularization strength.

You can use trained `ClassificationLinear` models to predict labels or classification scores for new data. For details, see `predict`.

Create a `ClassificationLinear` object by using `fitclinear`.

`Lambda` — Regularization term strength

nonnegative scalar | vector of nonnegative values

Regularization term strength, specified as a nonnegative scalar or vector of nonnegative values.

**Data Types:** `double` | `single`

`Learner` — Linear classification model type

`'logistic'` | `'svm'`

Linear classification model type, specified as `'logistic'` or `'svm'`.

In this table, $$f\left(x\right)=x\beta +b.$$

*β* is a vector of *p* coefficients. *x* is an observation from *p* predictor variables. *b* is the scalar bias.

Value | Algorithm | Loss Function | `FittedLoss` Value |
---|---|---|---|
`'logistic'` | Logistic regression | Deviance (logistic): $$\ell \left[y,f\left(x\right)\right]=\mathrm{log}\left\{1+\mathrm{exp}\left[-yf\left(x\right)\right]\right\}$$ | `'logit'` |
`'svm'` | Support vector machine | Hinge: $$\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,1-yf\left(x\right)\right]$$ | `'hinge'` |
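The two loss functions in the table can be evaluated by hand on a few observations. The following is a minimal sketch in base MATLAB, not the toolbox implementation; the data, coefficients, and bias are made-up values, and the labels are assumed to be coded as –1 and +1 so that the product *y f*(*x*) in the formulas is meaningful.

```matlab
% Toy data (hypothetical values): 3 observations, p = 2 predictors.
X    = [1 2; -1 0.5; 0 -3];   % each row is one observation x
beta = [0.5; -1];             % p-by-1 coefficient vector (assumed)
b    = 0.25;                  % scalar bias (assumed)
y    = [-1; 1; 1];            % labels coded as -1 / +1 (assumption)

% Raw scores f(x) = x*beta + b, one per observation.
f = X*beta + b;

% Per-observation losses from the table.
hingeLoss = max(0, 1 - y.*f);        % Learner 'svm'      -> FittedLoss 'hinge'
logitLoss = log(1 + exp(-y.*f));     % Learner 'logistic' -> FittedLoss 'logit'
```

The hinge loss is exactly zero for observations with *y f*(*x*) of at least 1, whereas the logistic loss is always strictly positive.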

`Beta` — Linear coefficient estimates

numeric vector

Linear coefficient estimates, specified as a numeric vector with length equal to the number of predictors.

**Data Types:** `double`

`Bias` — Estimated bias term

numeric scalar

Estimated bias term or model intercept, specified as a numeric scalar.

**Data Types:** `double`

`FittedLoss` — Loss function used to fit linear model

`'hinge'` | `'logit'`

Loss function used to fit the linear model, specified as `'hinge'` or `'logit'`.

Value | Algorithm | Loss Function | `Learner` Value |
---|---|---|---|
`'hinge'` | Support vector machine | Hinge: $$\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,1-yf\left(x\right)\right]$$ | `'svm'` |
`'logit'` | Logistic regression | Deviance (logistic): $$\ell \left[y,f\left(x\right)\right]=\mathrm{log}\left\{1+\mathrm{exp}\left[-yf\left(x\right)\right]\right\}$$ | `'logistic'` |

`Regularization` — Complexity penalty type

`'lasso (L1)'` | `'ridge (L2)'`

Complexity penalty type, specified as `'lasso (L1)'` or `'ridge (L2)'`.

The software composes the objective function for minimization from the sum of the average loss function (see `FittedLoss`) and a regularization value from this table.

Value | Description |
---|---|
`'lasso (L1)'` | Lasso ($L_1$) penalty: $$\lambda {\displaystyle \sum _{j=1}^{p}\left|{\beta}_{j}\right|}$$ |
`'ridge (L2)'` | Ridge ($L_2$) penalty: $$\frac{\lambda}{2}{\displaystyle \sum _{j=1}^{p}{\beta}_{j}^{2}}$$ |

*λ* specifies the regularization term strength (see `Lambda`).

The software excludes the bias term (*β*_{0}) from the regularization penalty.
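To make the composition of the objective function concrete, the sketch below adds each penalty to an average hinge loss on toy data. It is an illustration in base MATLAB under stated assumptions, not the solver used by `fitclinear`: all numeric values are hypothetical, labels are coded as –1 and +1, and the average loss is an unweighted mean (that is, equal observation weights are assumed).

```matlab
% Toy data (hypothetical values).
X      = [1 2; -1 0.5; 0 -3];   % rows are observations
y      = [-1; 1; 1];            % labels coded as -1 / +1 (assumption)
beta   = [0.5; -1];             % coefficients; the bias is kept separate
b      = 0.25;                  % bias term, excluded from the penalty
lambda = 0.1;                   % regularization strength (assumed)

f = X*beta + b;                          % raw scores
avgHinge = mean(max(0, 1 - y.*f));       % average loss, equal weights assumed

lassoPenalty = lambda * sum(abs(beta));      % 'lasso (L1)'
ridgePenalty = (lambda/2) * sum(beta.^2);    % 'ridge (L2)'

objLasso = avgHinge + lassoPenalty;
objRidge = avgHinge + ridgePenalty;
```

Because `b` never enters `lassoPenalty` or `ridgePenalty`, changing the bias alters only the loss term.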

`CategoricalPredictors` — Indices of categorical predictors

`[]`

Indices of categorical predictors, whose value is always empty (`[]`) because a `ClassificationLinear` model does not support categorical predictors.

`ClassNames` — Unique class labels

categorical array | character array | logical vector | numeric vector | cell array of character vectors

Unique class labels used in training, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. `ClassNames` has the same data type as the class labels `Y`. (The software treats string arrays as cell arrays of character vectors.) `ClassNames` also determines the class order.

**Data Types:** `categorical` | `char` | `logical` | `single` | `double` | `cell`

`Cost` — Misclassification costs

square numeric matrix

This property is read-only.

Misclassification costs, specified as a square numeric matrix. `Cost` has *K* rows and columns, where *K* is the number of classes.

`Cost(i,j)` is the cost of classifying a point into class `j` if its true class is `i`. The order of the rows and columns of `Cost` corresponds to the order of the classes in `ClassNames`.

**Data Types:** `double`
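As an illustration of how a cost matrix is indexed, the sketch below assumes the convention that rows correspond to the true class and columns to the predicted class, both ordered as in `ClassNames`. The cost values are hypothetical and are not taken from any trained model.

```matlab
ClassNames = [0 1];     % class order (matches the ClassNames property)
Cost = [0 1; 5 0];      % hypothetical costs: rows = true class, columns = predicted class

trueClass      = 1;     % the observation actually belongs to class 1
predictedClass = 0;     % but is classified into class 0

i = find(ClassNames == trueClass);        % row index of the true class
j = find(ClassNames == predictedClass);   % column index of the predicted class
c = Cost(i,j);                            % cost of this misclassification
```

Here `c` is `Cost(2,1)`, so in this made-up matrix missing a class-1 observation is five times as costly as the opposite error.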

`ModelParameters` — Parameters used for training model

structure

Parameters used for training the `ClassificationLinear` model, specified as a structure.

Access fields of `ModelParameters` using dot notation. For example, access the relative tolerance on the linear coefficients and the bias term by using `Mdl.ModelParameters.BetaTolerance`.

**Data Types:** `struct`

`PredictorNames` — Predictor names

cell array of character vectors

Predictor names in order of their appearance in the predictor data `X`, specified as a cell array of character vectors. The length of `PredictorNames` is equal to the number of columns in `X`.

**Data Types:** `cell`

`ExpandedPredictorNames` — Expanded predictor names

cell array of character vectors

Expanded predictor names, specified as a cell array of character vectors.

Because a `ClassificationLinear` model does not support categorical predictors, `ExpandedPredictorNames` and `PredictorNames` are equal.

**Data Types:** `cell`

`Prior` — Prior class probabilities

numeric vector

This property is read-only.

Prior class probabilities, specified as a numeric vector. `Prior` has as many elements as classes in `ClassNames`, and the order of the elements corresponds to the elements of `ClassNames`.

**Data Types:** `double`

`ScoreTransform` — Score transformation function

`'doublelogit'` | `'invlogit'` | `'ismax'` | `'logit'` | `'none'` | function handle | ...

Score transformation function to apply to predicted scores, specified as a function name or function handle.

For linear classification models and before transformation, the predicted classification score for the observation *x* (row vector) is *f*(*x*) = *xβ* + *b*, where *β* and *b* correspond to `Mdl.Beta` and `Mdl.Bias`, respectively.

To change the score transformation function to, for example, *function*, use dot notation.

For a built-in function, enter this code and replace *function* with a value in the table.

`Mdl.ScoreTransform = 'function';`

Value | Description |
---|---|
`'doublelogit'` | 1/(1 + *e*^{–2*x*}) |
`'invlogit'` | log(*x* / (1 – *x*)) |
`'ismax'` | Sets the score for the class with the largest score to `1`, and sets the scores for all other classes to `0` |
`'logit'` | 1/(1 + *e*^{–*x*}) |
`'none'` or `'identity'` | *x* (no transformation) |
`'sign'` | –1 for *x* < 0; 0 for *x* = 0; 1 for *x* > 0 |
`'symmetric'` | 2*x* – 1 |
`'symmetricismax'` | Sets the score for the class with the largest score to `1`, and sets the scores for all other classes to `–1` |
`'symmetriclogit'` | 2/(1 + *e*^{–*x*}) – 1 |

For a MATLAB® function, or a function that you define, enter its function handle.

`Mdl.ScoreTransform = @function;`

*function* must accept a matrix of the original scores for each class, and then return a matrix of the same size representing the transformed scores for each class.

**Data Types:** `char` | `function_handle`
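Several of the built-in transformations are simple elementwise formulas, so they can be written as function handles and checked on a few raw scores. This is a sketch in base MATLAB based on the formulas listed for this property; the handle names (`logitT`, `doubleLogitT`, and so on) and the score values are hypothetical, and the code does not call the toolbox.

```matlab
% Hypothetical raw scores: rows are observations, columns are classes.
S = [-2 2; 0 0; 0.5 -0.5];

% Elementwise transformations, following the listed formulas.
logitT       = @(x) 1 ./ (1 + exp(-x));        % 'logit'
doubleLogitT = @(x) 1 ./ (1 + exp(-2*x));      % 'doublelogit'
symLogitT    = @(x) 2 ./ (1 + exp(-x)) - 1;    % 'symmetriclogit'
symmetricT   = @(x) 2*x - 1;                   % 'symmetric'

% Each handle accepts a matrix of scores and returns a matrix of the same size.
P = logitT(S);
```

Handles written this way satisfy the stated requirement for a custom transformation (matrix in, same-size matrix out), so with a trained model one of them could presumably be assigned using dot notation, for example `Mdl.ScoreTransform = logitT;`.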

`ResponseName` — Response variable name

character vector

Response variable name, specified as a character vector.

**Data Types:** `char`

Function | Description |
---|---|
`edge` | Classification edge for linear classification models |
`loss` | Classification loss for linear classification models |
`margin` | Classification margins for linear classification models |
`predict` | Predict labels for linear classification models |
`selectModels` | Choose subset of regularized, binary linear classification models |

Value. To learn how value classes affect copy operations, see Copying Objects (MATLAB).

Train a binary, linear classification model using support vector machines, dual SGD, and ridge regularization.

Load the NLP data set.

`load nlpdata`

`X` is a sparse matrix of predictor data, and `Y` is a categorical vector of class labels. There are more than two classes in the data.

Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.

`Ystats = Y == 'stats';`

Train a binary, linear classification model that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. Train the model using the entire data set. Determine how well the optimization algorithm fit the model to the data by extracting a fit summary.

```
rng(1); % For reproducibility
[Mdl,FitInfo] = fitclinear(X,Ystats)
```

Mdl =
ClassificationLinear
ResponseName: 'Y'
ClassNames: [0 1]
ScoreTransform: 'none'
Beta: [34023x1 double]
Bias: -1.0059
Lambda: 3.1674e-05
Learner: 'svm'

`FitInfo = `*struct with fields:*
Lambda: 3.1674e-05
Objective: 5.3783e-04
PassLimit: 10
NumPasses: 10
BatchLimit: []
NumIterations: 238561
GradientNorm: NaN
GradientTolerance: 0
RelativeChangeInBeta: 0.0562
BetaTolerance: 1.0000e-04
DeltaGradient: 1.4582
DeltaGradientTolerance: 1
TerminationCode: 0
TerminationStatus: {'Iteration limit exceeded.'}
Alpha: [31572x1 double]
History: []
FitTime: 0.1484
Solver: {'dual'}

`Mdl` is a `ClassificationLinear` model. You can pass `Mdl` and the training data to `loss` to inspect the in-sample classification error, or pass `Mdl` and new data to `loss` to estimate the out-of-sample error. You can also pass `Mdl` and new predictor data to `predict` to predict class labels for new observations.

`FitInfo` is a structure array containing, among other things, the termination status (`TerminationStatus`) and how long the solver took to fit the model to the data (`FitTime`). It is good practice to use `FitInfo` to determine whether optimization-termination measurements are satisfactory. Here, the solver stopped because it reached the iteration limit. Because training time is short, you can retrain the model with more passes through the data, which can improve measures like `DeltaGradient`.

Load the NLP data set.

`load nlpdata`

`n = size(X,1); % Number of observations`

Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.

`Ystats = Y == 'stats';`

Hold out 5% of the data.

`rng(1); % For reproducibility`

`cvp = cvpartition(n,'Holdout',0.05)`

cvp =
Hold-out cross validation partition
NumObservations: 31572
NumTestSets: 1
TrainSize: 29994
TestSize: 1578

`cvp` is a `CVPartition` object that defines the random partition of the *n* observations into training and test sets.

Train a binary, linear classification model using the training set that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. For faster training time, orient the predictor data matrix so that the observations are in columns.

`idxTrain = training(cvp); % Extract training set indices`

`X = X';`

`Mdl = fitclinear(X(:,idxTrain),Ystats(idxTrain),'ObservationsIn','columns');`

Predict labels and estimate the classification error for the holdout sample.

`idxTest = test(cvp); % Extract test set indices`

`labels = predict(Mdl,X(:,idxTest),'ObservationsIn','columns');`

`L = loss(Mdl,X(:,idxTest),Ystats(idxTest),'ObservationsIn','columns')`

L = 7.1753e-04

`Mdl` misclassifies fewer than 1% of the out-of-sample observations.
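A misclassification rate of this kind can also be computed directly from predicted and true labels. The sketch below uses short made-up logical vectors rather than the NLP data, and it assumes equal observation weights and that the reported loss is the plain classification error; under those assumptions the value is the fraction of mismatched labels.

```matlab
% Hypothetical true and predicted labels for 8 held-out observations.
trueLabels      = logical([1 0 0 1 0 1 0 0]');
predictedLabels = logical([1 0 0 1 0 0 0 0]');

% Fraction of observations whose predicted label differs from the truth.
errRate = mean(predictedLabels ~= trueLabels);
```

In this toy case one of the eight labels differs, so `errRate` is 0.125. Applying the same comparison to `labels` and `Ystats(idxTest)` from the example is one way to cross-check the value that `loss` returns.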

Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

When you train a linear classification model by using `fitclinear`, the following restrictions apply.

- The predictor data input argument value (`X`) must be a full, numeric matrix.
- The class labels input argument value (`Y`) cannot be a categorical array.
- The value of the `'ClassNames'` name-value pair argument or property cannot be a categorical array.
- You can specify only one regularization strength, either `'auto'` or a nonnegative scalar for the `'Lambda'` name-value pair argument.
- The value of the `'ScoreTransform'` name-value pair argument cannot be an anonymous function.

For more information, see Introduction to Code Generation.

`ClassificationECOC` | `ClassificationKernel` | `ClassificationPartitionedLinear` | `ClassificationPartitionedLinearECOC` | `fitclinear` | `predict`
