Regression loss for Gaussian kernel regression model

`L = loss(Mdl,Tbl,ResponseVarName)` returns the mean squared error (MSE) for the model `Mdl` using the predictor data in `Tbl` and the true responses in `Tbl.ResponseVarName`.

`L = loss(___,Name,Value)` specifies options using one or more name-value pair arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify a regression loss function and observation weights. Then, `loss` returns the weighted regression loss using the specified loss function.

Train a Gaussian kernel regression model for a tall array, then calculate the resubstitution mean squared error and epsilon-insensitive error.

When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the `mapreducer` function.

`mapreducer(0)`

Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat `'NA'` values as missing data so that `datastore` replaces them with `NaN` values. Select a subset of the variables to use. Create a tall table on top of the datastore.

```
varnames = {'ArrTime','DepTime','ActualElapsedTime'};
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
    'SelectedVariableNames',varnames);
t = tall(ds);
```

Specify `DepTime` and `ArrTime` as the predictor variables (`X`) and `ActualElapsedTime` as the response variable (`Y`). Select the observations for which `ArrTime` is later than `DepTime`.

```
daytime = t.ArrTime>t.DepTime;
Y = t.ActualElapsedTime(daytime);     % Response data
X = t{daytime,{'DepTime' 'ArrTime'}}; % Predictor data
```

Standardize the predictor variables.

`Z = zscore(X); % Standardize the data`

Train a default Gaussian kernel regression model with the standardized predictors. Set `'Verbose',0` to suppress diagnostic messages.

`[Mdl,FitInfo] = fitrkernel(Z,Y,'Verbose',0)`

```
Mdl = 
  RegressionKernel
            PredictorNames: {'x1'  'x2'}
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 64
               KernelScale: 1
                    Lambda: 8.5385e-06
             BoxConstraint: 1
                   Epsilon: 5.9303

  Properties, Methods
```

`FitInfo = `*struct with fields:*
Solver: 'LBFGS-tall'
LossFunction: 'epsiloninsensitive'
Lambda: 8.5385e-06
BetaTolerance: 1.0000e-03
GradientTolerance: 1.0000e-05
ObjectiveValue: 30.7814
GradientMagnitude: 0.0191
RelativeChangeInBeta: 0.0228
FitTime: 62.8279
History: []

`Mdl` is a trained `RegressionKernel` model, and the structure array `FitInfo` contains optimization details.

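Determine how well the trained model fits the training data by estimating the resubstitution mean squared error and epsilon-insensitive error. Because both are computed on the data used for training, they tend to underestimate the error on new predictor values.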

`lossMSE = loss(Mdl,Z,Y) % Resubstitution mean squared error`

```
lossMSE =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
```

`lossEI = loss(Mdl,Z,Y,'LossFun','epsiloninsensitive') % Resubstitution epsilon-insensitive error`

```
lossEI =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
```

Evaluate the tall arrays and bring the results into memory by using `gather`.

`[lossMSE,lossEI] = gather(lossMSE,lossEI)`

```
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.6 sec
Evaluation completed in 1.9 sec
```

lossMSE = 2.8851e+03

lossEI = 28.0050

Specify a custom regression loss (Huber loss) for a Gaussian kernel regression model.

Load the `carbig` data set.

`load carbig`

Specify the predictor variables (`X`) and the response variable (`Y`).

```
X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;
```

Delete rows of `X` and `Y` where either array has `NaN` values. Removing rows with `NaN` values before passing data to `fitrkernel` can speed up training and reduce memory usage.

```
R = rmmissing([X Y]);
X = R(:,1:4);
Y = R(:,end);
```

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

```
rng(10) % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices
```

Standardize the training data and train the regression kernel model.

```
Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
[Ztrain,tr_mu,tr_sigma] = zscore(Xtrain); % Standardize the training data
tr_sigma(tr_sigma==0) = 1;
Mdl = fitrkernel(Ztrain,Ytrain)
```

```
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617

  Properties, Methods
```

`Mdl` is a `RegressionKernel` model.

Create an anonymous function that measures Huber loss $(\delta =1)$, that is,

$$L=\frac{1}{\sum {w}_{j}}\sum _{j=1}^{n}{w}_{j}{\ell}_{j},$$

where

$$\ell_{j}=\begin{cases}0.5\,\hat{e}_{j}^{2}, & \left|\hat{e}_{j}\right|\le 1\\ \left|\hat{e}_{j}\right|-0.5, & \left|\hat{e}_{j}\right|>1.\end{cases}$$

$\hat{e}_{j}$ is the residual for observation *j*. Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the `'LossFun'` name-value pair argument.

```
huberloss = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
((abs(Y-Yhat)>1).*abs(Y-Yhat)-0.5)))/sum(W);
```
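In the expression above, the `-0.5` term sits outside the `abs(Y-Yhat)>1` mask, so MATLAB subtracts 0.5 from every observation rather than only from those in the linear branch. The sketch below is not part of the original example. It applies the offset only where $|\hat{e}_{j}|>1$, so it follows the piecewise definition term by term. The names `huberStrict` and `huberAsWritten` and the small test vectors are assumptions chosen for illustration.

```matlab
% Illustrative sketch only: huberStrict, huberAsWritten, and the sample
% vectors are hypothetical and do not appear in the original example.
huberStrict = @(Y,Yhat,W) sum(W.*( ...
    (abs(Y-Yhat)<=1).*(0.5*(Y-Yhat).^2) + ... % quadratic branch, |e| <= 1
    (abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5)))/sum(W); % linear branch, |e| > 1

% Copy of the expression as written in the example, for comparison
huberAsWritten = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
    ((abs(Y-Yhat)>1).*abs(Y-Yhat)-0.5)))/sum(W);

Y    = [1; 2; 3; 4];
Yhat = [1.5; 2; 6; 4.2];
W    = ones(4,1);

% Residuals are -0.5, 0, -3, -0.2, so the per-observation Huber losses are
% 0.125, 0, 2.5, 0.02 and their weighted mean is 2.645/4 = 0.66125.
lossStrict    = huberStrict(Y,Yhat,W);
lossAsWritten = huberAsWritten(Y,Yhat,W);

% The gap is 0.5 times the weighted fraction of observations with |e| <= 1
% (here 3 of 4 observations, so 0.375).
gap = lossStrict - lossAsWritten;
```

The loss values reported in this example presumably come from the expression as written, so substituting `huberStrict` would change them.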

Estimate the training set regression loss using the Huber loss function.

`eTrain = loss(Mdl,Ztrain,Ytrain,'LossFun',huberloss)`

eTrain = 1.7210

Standardize the test data using the same mean and standard deviation of the training data columns. Estimate the test set regression loss using the Huber loss function.

```
Xtest = X(idxTest,:);
Ztest = (Xtest-tr_mu)./tr_sigma; % Standardize the test data
Ytest = Y(idxTest);
eTest = loss(Mdl,Ztest,Ytest,'LossFun',huberloss)
```

eTest = 1.3062
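With the default `'mse'` loss and uniform weights, the value that `loss` returns should agree, up to floating-point rounding, with the mean of the squared residuals computed from `predict`. The following sketch checks that relationship on the `carbig` data. The variable names ending in `Chk` are hypothetical, and the snippet assumes Statistics and Machine Learning Toolbox™ is available.

```matlab
% Sketch under stated assumptions; the *Chk names are illustrative only.
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
ZChk = zscore(R(:,1:4)); % Standardized predictors
YChk = R(:,end);         % Response

rng(10) % For reproducibility of the random feature expansion
MdlChk = fitrkernel(ZChk,YChk);

mseBuiltin = loss(MdlChk,ZChk,YChk);                % Default 'mse' loss
mseManual  = mean((YChk - predict(MdlChk,ZChk)).^2); % Same quantity by hand
relDiff    = abs(mseBuiltin - mseManual)/max(1,mseManual);
```

If `relDiff` is not close to zero, check that the same standardized predictors are passed to both `loss` and `predict`.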

`Mdl` — Kernel regression model

`RegressionKernel` model object

Kernel regression model, specified as a `RegressionKernel` model object. You can create a `RegressionKernel` model object using `fitrkernel`.

`X` — Predictor data

Predictor data, specified as an *n*-by-*p* numeric matrix, where *n* is the number of observations and *p* is the number of predictors. *p* must be equal to the number of predictors used to train `Mdl`.

**Data Types:** `single` | `double`

`Tbl` — Sample data

table

Sample data used to train the model, specified as a table. Each row of `Tbl` corresponds to one observation, and each column corresponds to one predictor variable. Optionally, `Tbl` can contain additional columns for the response variable and observation weights. `Tbl` must contain all the predictors used to train `Mdl`. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If `Tbl` contains the response variable used to train `Mdl`, then you do not need to specify `ResponseVarName` or `Y`.

If you train `Mdl` using sample data contained in a table, then the input data for `loss` must also be in a table.

`ResponseVarName` — Response variable name

name of variable in `Tbl`

Response variable name, specified as the name of a variable in `Tbl`. The response variable must be a numeric vector. If `Tbl` contains the response variable used to train `Mdl`, then you do not need to specify `ResponseVarName`.

If you specify `ResponseVarName`, then you must specify it as a character vector or string scalar. For example, if the response variable is stored as `Tbl.Y`, then specify `ResponseVarName` as `'Y'`. Otherwise, the software treats all columns of `Tbl`, including `Tbl.Y`, as predictors.

**Data Types:** `char` | `string`
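As a minimal sketch of the table-based syntax described above, the following snippet trains a model from a table and then passes the same table and the response variable name to `loss`. It assumes that your release of `fitrkernel` accepts a table together with a response variable name. The names `TblDemo`, `MdlTbl`, and `LTbl` are hypothetical.

```matlab
% Sketch only: assumes fitrkernel(Tbl,ResponseVarName) is supported in
% your release; TblDemo, MdlTbl, and LTbl are illustrative names.
load carbig
TblDemo = rmmissing(table(Weight,Cylinders,Horsepower,Model_Year,MPG));

rng(10) % For reproducibility
MdlTbl = fitrkernel(TblDemo,'MPG'); % 'MPG' is the response variable

% Because the model was trained on a table, the loss input is a table too.
LTbl = loss(MdlTbl,TblDemo,'MPG');  % Resubstitution MSE
```

The predictors are left unstandardized here to keep the sketch short, so the loss value is not comparable to the standardized examples on this page.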

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

**Example:** `L = loss(Mdl,X,Y,'LossFun','epsiloninsensitive','Weights',weights)` returns the weighted regression loss using the epsilon-insensitive loss function.

`'LossFun'` — Loss function

`'mse'` (default) | `'epsiloninsensitive'` | function handle

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss function name or a function handle.

The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. Also, in the table, $f\left(x\right)=T(x)\beta +b$.

- *x* is an observation (row vector) from *p* predictor variables.
- $T(\cdot)$ is a transformation of an observation (row vector) for feature expansion. *T*(*x*) maps *x* in ${\mathbb{R}}^{p}$ to a high-dimensional space (${\mathbb{R}}^{m}$).
- *β* is a vector of *m* coefficients.
- *b* is the scalar bias.

| Value | Description |
| --- | --- |
| `'epsiloninsensitive'` | Epsilon-insensitive loss: $\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,\left|y-f\left(x\right)\right|-\epsilon \right]$ |
| `'mse'` | MSE: $\ell \left[y,f\left(x\right)\right]={\left[y-f\left(x\right)\right]}^{2}$ |

`'epsiloninsensitive'` is appropriate for SVM learners only.

Specify your own function by using function handle notation.

Let `n` be the number of observations in `X`. Your function must have this signature:

`lossvalue = lossfun(Y,Yhat,W)`

- The output argument `lossvalue` is a scalar.
- You choose the function name (`lossfun`).
- `Y` is an *n*-dimensional vector of observed responses. `loss` passes the input argument `Y` in for `Y`.
- `Yhat` is an *n*-dimensional vector of predicted responses, which is similar to the output of `predict`.
- `W` is an `n`-by-1 numeric vector of observation weights.

Specify your function using `'LossFun',@lossfun`.

**Data Types:** `char` | `string` | `function_handle`
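The signature rules above can be illustrated with a weighted mean absolute error. This sketch is an assumption-laden example, not a loss function that the toolbox provides. The name `maeloss` and the sample vectors are hypothetical.

```matlab
% Hypothetical custom loss following the lossvalue = lossfun(Y,Yhat,W) form.
% maeloss is an illustrative name, not a built-in loss function.
maeloss = @(Y,Yhat,W) sum(W.*abs(Y-Yhat))/sum(W);

% Evaluate the handle directly on small made-up vectors.
Y    = [1; 2; 3];  % Observed responses
Yhat = [2; 2; 5];  % Predicted responses
W    = [1; 1; 2];  % Observation weights

% Weighted absolute errors are 1, 0, 4, so the result is 5/4 = 1.25.
lossvalue = maeloss(Y,Yhat,W);

% With a trained model Mdl and data X, Y, the handle would be passed as:
%   L = loss(Mdl,X,Y,'LossFun',maeloss);
```

Dividing by `sum(W)` inside the handle keeps the result unchanged if the weights are rescaled.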

`'Weights'` — Observation weights

`ones(size(X,1),1)` (default) | numeric vector | name of variable in `Tbl`

Observation weights, specified as the comma-separated pair consisting of `'Weights'` and a numeric vector or the name of a variable in `Tbl`.

- If `Weights` is a numeric vector, then the size of `Weights` must be equal to the number of rows in `X` or `Tbl`.
- If `Weights` is the name of a variable in `Tbl`, you must specify `Weights` as a character vector or string scalar. For example, if the weights are stored as `Tbl.W`, then specify `Weights` as `'W'`. Otherwise, the software treats all columns of `Tbl`, including `Tbl.W`, as predictors.

If you supply the observation weights, `loss` computes the weighted regression loss, that is, the Weighted Mean Squared Error or Epsilon-Insensitive Loss Function.

`loss` normalizes `Weights` to sum to 1.

**Data Types:** `double` | `single` | `char` | `string`

The weighted mean squared error is calculated as follows:

$$\text{mse}=\frac{{\displaystyle \sum _{j=1}^{n}{w}_{j}{\left(f\left({x}_{j}\right)-{y}_{j}\right)}^{2}}}{{\displaystyle \sum _{j=1}^{n}{w}_{j}}}\text{\hspace{0.17em}},$$

where:

- *n* is the number of observations.
- $x_{j}$ is the *j*th observation (row of predictor data).
- $y_{j}$ is the observed response to $x_{j}$.
- $f(x_{j})$ is the response prediction of the Gaussian kernel regression model `Mdl` to $x_{j}$.
- *w* is the vector of observation weights.

Each observation weight in *w* is equal to `ones(n,1)/n` by default. You can specify different values for the observation weights by using the `'Weights'` name-value pair argument. `loss` normalizes `Weights` to sum to 1.
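The weighted MSE formula can be worked through by hand on a few made-up numbers. In this sketch, the vectors `y`, `yhat`, and `w` are hypothetical, and `yhat` stands in for the model predictions $f(x_{j})$.

```matlab
% Hand computation of the weighted MSE on hypothetical data.
y    = [3; 5; 7; 9];    % Observed responses y_j
yhat = [2.5; 5; 8; 7];  % Stand-in for the predictions f(x_j)
w    = [1; 1; 2; 4];    % Observation weights w_j

% Squared errors are 0.25, 0, 1, 4; weighted sum is 18.25; sum(w) is 8.
wmse = sum(w.*(yhat - y).^2)/sum(w);          % 18.25/8 = 2.28125

% Rescaling the weights leaves the result unchanged, which is consistent
% with normalizing the weights to sum to 1.
wmseScaled = sum(10*w.*(yhat - y).^2)/sum(10*w);
```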

The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. The function is formally described as:

$$Loss_{\epsilon}=\begin{cases}0, & \text{if } \left|y-f\left(x\right)\right|\le \epsilon \\ \left|y-f\left(x\right)\right|-\epsilon, & \text{otherwise.}\end{cases}$$

The mean epsilon-insensitive loss is calculated as follows:

$$Loss=\frac{{\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{max}\left(0,\left|{y}_{j}-f\left({x}_{j}\right)\right|-\epsilon \right)}}{{\displaystyle \sum _{j=1}^{n}{w}_{j}}}\text{\hspace{0.17em}},$$

where:

- *n* is the number of observations.
- $x_{j}$ is the *j*th observation (row of predictor data).
- $y_{j}$ is the observed response to $x_{j}$.
- $f(x_{j})$ is the response prediction of the Gaussian kernel regression model `Mdl` to $x_{j}$.
- *w* is the vector of observation weights.

Each observation weight in *w* is equal to `ones(n,1)/n` by default. You can specify different values for the observation weights by using the `'Weights'` name-value pair argument. `loss` normalizes `Weights` to sum to 1.
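The mean epsilon-insensitive loss can be checked the same way on a few made-up numbers. In this sketch, `y`, `yhat`, `w`, and the value of `epsilon` are hypothetical. For a trained model, the half-width of the insensitive band is reported in the `Epsilon` value shown in the model display.

```matlab
% Hand computation of the mean epsilon-insensitive loss on hypothetical data.
y       = [3; 5; 7; 9];    % Observed responses y_j
yhat    = [2.5; 5; 8; 7];  % Stand-in for the predictions f(x_j)
w       = [1; 1; 2; 4];    % Observation weights w_j
epsilon = 0.5;             % Assumed band half-width for this illustration

% Absolute errors are 0.5, 0, 1, 2. Errors within epsilon contribute zero,
% so the per-observation losses are 0, 0, 0.5, 1.5.
perObs = max(0, abs(y - yhat) - epsilon);

% Weighted sum is 0 + 0 + 1 + 6 = 7 and sum(w) is 8.
lossEps = sum(w.*perObs)/sum(w);  % 7/8 = 0.875
```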

Calculate with arrays that have more rows than fit in memory.

Usage notes and limitations:

`loss` does not support tall `table` data.

For more information, see Tall Arrays.

