Resubstitution loss for support vector machine regression model
L = resubLoss(mdl)
L = resubLoss(mdl,Name,Value)
L = resubLoss(mdl) returns the resubstitution loss for the support vector machine (SVM) regression
model mdl, using the training data stored in mdl.X and
corresponding response values stored in mdl.Y.
L = resubLoss(mdl,Name,Value) returns the resubstitution loss with additional
options specified by one or more Name,Value pair arguments. For example,
you can specify the loss function or observation weights.
Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument name and
Value is the corresponding value.
Name must appear inside quotes. You can specify several name and value
pair arguments in any order as Name1,Value1,...,NameN,ValueN.
'LossFun'— Loss function
'mse' (default) | 'epsiloninsensitive' | function handle
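Loss function, specified as the comma-separated pair consisting of
'LossFun' and 'mse', 'epsiloninsensitive', or a function handle.
The following table lists the available loss functions. Specify one using its corresponding value.
Value | Loss Function
'mse' | Weighted mean squared error
'epsiloninsensitive' | Epsilon-insensitive loss function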
Specify your own function using function handle notation.
Suppose that n = size(X,1) is the sample
size. Your function must have the signature
lossvalue = lossfun(Y,Yfit,W), where:
The output argument lossvalue is a numeric value.
You choose the function name (lossfun).
Y is an n-by-1
numeric vector of observed response values.
Yfit is an n-by-1
numeric vector of predicted response values, calculated using the
corresponding predictor values in X (similar
to the output of predict).
W is an n-by-1
numeric vector of observation weights.
Specify your function using 'LossFun',@lossfun.
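As an illustration of the signature above, the following minimal sketch defines a custom loss as an anonymous function. The name maeLoss and the choice of a weighted mean absolute error are assumptions made for this example, not part of the function's documented options. The small vectors stand in for the Y, Yfit, and W values that the software would pass to the function.

```matlab
% Hypothetical custom loss: weighted mean absolute error.
% The inputs follow the required signature lossvalue = lossfun(Y,Yfit,W).
maeLoss = @(Y,Yfit,W) sum(W .* abs(Y - Yfit)) / sum(W);

% Small stand-in vectors (illustrative values only).
Y    = [1; 2; 3];      % observed responses
Yfit = [1.5; 2; 2];    % predicted responses
W    = [1; 1; 2];      % observation weights

lossvalue = maeLoss(Y,Yfit,W)   % (0.5 + 0 + 2)/4 = 0.6250

% With a trained model mdl, the handle would be passed as:
% L = resubLoss(mdl,'LossFun',maeLoss);
```

Because the handle receives the weights explicitly, the custom function decides how to normalize them, if at all.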
'Weights'— Observation weights
ones(size(X,1),1) (default) | numeric vector
Observation weights, specified as the comma-separated pair consisting
of 'Weights' and a numeric vector. Weights must
be the same length as the number of rows in X.
The software weighs the observations in each row of X using
the corresponding weight value in Weights.
L— Resubstitution loss
Resubstitution loss, returned as a scalar value.
The resubstitution loss is the loss calculated between the response training data and the model’s predicted response values based on the input training data.
Resubstitution loss can be an overly optimistic estimate of the predictive error on new data. If the resubstitution loss is high, the model’s predictions are not likely to be very good. However, having a low resubstitution loss does not guarantee good predictions for new data.
To better assess the predictive accuracy of your model, cross
validate the model using crossval.
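The comparison between resubstitution loss and cross-validated loss might be sketched as follows. The synthetic data and variable names are assumptions chosen for illustration, and the code requires Statistics and Machine Learning Toolbox. No output is shown because the values depend on the data and the random partition.

```matlab
rng default  % for reproducibility

% Synthetic data (assumed for illustration): a noisy linear response.
X = randn(100,2);
Y = X(:,1) + 0.1*randn(100,1);

% Train an SVM regression model on the full data set.
mdl = fitrsvm(X,Y,'Standardize',true);

resub_loss = resubLoss(mdl);   % MSE on the training data
cvmdl      = crossval(mdl);    % 10-fold cross-validation by default
cv_loss    = kfoldLoss(cvmdl); % MSE on the held-out folds
```

The cross-validated loss is usually the larger of the two. A wide gap between them suggests that the resubstitution loss is too optimistic.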
This example shows how to train an SVM regression model, then calculate the resubstitution loss using mean square error (MSE) and epsilon-insensitive loss.
This example uses the abalone data from the UCI Machine
Learning Repository. Download the data and save it in your current
directory with the name 'abalone.data'.
Read the data into a table.
tbl = readtable('abalone.data','Filetype','text','ReadVariableNames',false);
rng default  % for reproducibility
The sample data contains 4177 observations. All of the predictor
variables are continuous except for sex, which
is a categorical variable with possible values 'M' (for males),
'F' (for females), and 'I' (for
infants). The goal is to predict the number of rings on the abalone,
and thereby determine its age, using physical measurements.
Train an SVM regression model to the data, using a Gaussian kernel function with an automatic kernel scale. Standardize the data.
mdl = fitrsvm(tbl,'Var9','KernelFunction','gaussian','KernelScale','auto','Standardize',true);
Calculate the resubstitution loss using mean square error (MSE).
mse_loss = resubLoss(mdl)
mse_loss = 4.0603
Calculate the epsilon-insensitive loss.
eps_loss = resubLoss(mdl,'LossFun','epsiloninsensitive')
eps_loss = 1.1027
The weighted mean squared error is calculated as follows:
$$\text{mse} = \frac{\sum_{j=1}^{n} w_j \left(f(x_j) - y_j\right)^2}{\sum_{j=1}^{n} w_j}$$
where:
n is the number of rows of data
xj is the jth row of data
yj is the true response to xj
f(xj) is the response prediction of the SVM regression model
mdl to xj
w is the vector of weights.
The weights in w are all equal to one by
default. You can specify different values for weights using the
'Weights' name-value
pair argument. If you specify weights, each value is divided by the
sum of all weights, such that the normalized weights add to one.
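The weight normalization and the weighted mean squared error described above can be checked by hand on a few values. This is a minimal sketch under assumed data: the vectors y, yfit, and w are illustrative stand-ins, not output from a trained model.

```matlab
% Illustrative values only.
y    = [1; 2; 3];      % true responses y_j
yfit = [1.5; 2; 2];    % stand-in predictions f(x_j)
w    = [1; 1; 2];      % unnormalized observation weights

% Divide each weight by the sum of all weights so that they add to one.
wn = w / sum(w);       % [0.25; 0.25; 0.5]

% Weighted mean squared error with normalized weights.
mse = sum(wn .* (yfit - y).^2)   % 0.25*0.25 + 0 + 0.5*1 = 0.5625
```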
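The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. It is formally described as:
$$\text{Loss}_{\varepsilon} = \begin{cases} 0, & \text{if } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise} \end{cases}$$
The mean epsilon-insensitive loss is calculated as follows:
$$\text{Loss} = \frac{\sum_{j=1}^{n} w_j \max\left(0, |y_j - f(x_j)| - \varepsilon\right)}{\sum_{j=1}^{n} w_j}$$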
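The behavior of the epsilon-insensitive loss can be sketched on a few values: errors no larger than ε contribute zero, and larger errors contribute only the amount by which they exceed ε. The data and the value of epsilon below are assumptions for illustration. For a trained model, the half-width would come from the model itself, for example its Epsilon property. The sketch shows the formula, not the toolbox implementation.

```matlab
% Illustrative values only.
epsilon = 0.25;        % assumed half-width of the insensitive band
y    = [1; 2; 3];      % true responses y_j
yfit = [1.5; 2; 2];    % stand-in predictions f(x_j)
w    = [1; 1; 2];      % unnormalized observation weights

wn = w / sum(w);       % normalized weights, summing to one

% Per-observation loss: zero inside the band, linear outside it.
pointLoss = max(0, abs(y - yfit) - epsilon);   % [0.25; 0; 0.75]

% Weighted mean epsilon-insensitive loss.
epsLoss = sum(wn .* pointLoss)   % 0.25*0.25 + 0 + 0.5*0.75 = 0.4375
```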
 Nash, W.J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait, Sea Fisheries Division, Technical Report No. 48, 1994.
 Waugh, S. Extending and benchmarking Cascade-Correlation, Ph.D. thesis, Computer Science Department, University of Tasmania, 1995.
 Clark, D., Z. Schreter, A. Adams. A Quantitative Comparison of Dystal and Backpropagation, submitted to the Australian Conference on Neural Networks, 1996.
 Lichman, M. UCI Machine Learning Repository, [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.