Class: CompactClassificationEnsemble

Classification error


L = loss(ens,tbl,ResponseVarName)
L = loss(ens,tbl,Y)
L = loss(ens,X,Y)
L = loss(___,Name,Value)


L = loss(ens,tbl,ResponseVarName) returns the classification error for ensemble ens computed using the table of predictors tbl and the true class labels tbl.ResponseVarName.

L = loss(ens,tbl,Y) returns the classification error for ensemble ens computed using the table of predictors tbl and the true class labels Y.

L = loss(ens,X,Y) returns the classification error for ensemble ens computed using the matrix of predictors X and the true class labels Y.

L = loss(___,Name,Value) computes the classification error with additional options specified by one or more Name,Value pair arguments, using any of the previous syntaxes.

When computing the loss, loss normalizes the class probabilities in ResponseVarName or Y to the class probabilities used for training, stored in the Prior property of ens.
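As an illustration of this normalization, the following sketch is plain Python rather than MATLAB, and normalize_to_prior together with its example values is hypothetical, not a MathWorks function; it only shows each class's observation weights being rescaled to sum to that class's prior:

```python
# Hypothetical stand-in (not MathWorks code) for the renormalization
# that loss performs: within each class, observation weights are scaled
# so that they sum to that class's training prior (the Prior property).

def normalize_to_prior(weights, labels, priors):
    """weights: nonnegative numbers; labels: class label per observation;
    priors: dict mapping class -> training prior (priors sum to 1)."""
    w = list(weights)
    for cls, prior in priors.items():
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        total = sum(w[i] for i in idx)
        if total > 0:
            for i in idx:
                w[i] *= prior / total
    return w

# Two classes 'b' and 'g' with illustrative priors 0.7 and 0.3:
w = normalize_to_prior([1, 1, 1, 1], ['b', 'b', 'g', 'g'],
                       {'b': 0.7, 'g': 0.3})
# w == [0.35, 0.35, 0.15, 0.15]; each class now sums to its prior
```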

Input Arguments


ens

Classification ensemble created with fitensemble, or a compact classification ensemble created with compact.


tbl

Sample data, specified as a table. Each row of tbl corresponds to one observation, and each column corresponds to one predictor variable. tbl must contain all of the predictors used to train the model. Multi-column variables and cell arrays other than cell arrays of strings are not allowed.

If you trained ens using sample data contained in a table, then the input data for this method must also be in a table.


ResponseVarName

Response variable name, specified as the name of a variable in tbl. The response variable must be a categorical or character array, logical or numeric vector, or cell array of strings.

You must specify ResponseVarName as a string. For example, if the response variable Y is stored as tbl.Y, then specify it as 'Y'. Otherwise, the software treats all columns of tbl, including Y, as predictors when training the model.


X

Matrix of data to classify. Each row of X represents one observation, and each column represents one predictor. X must have the same number of columns as the data used to train ens, and the same number of rows as the number of elements in Y.

If you trained ens using sample data contained in a matrix, then the input data for this method must also be in a matrix.


Y

Class labels of the observations in tbl or X. Y must be of the same type as the class labels used to train ens, and its number of elements must equal the number of rows of tbl or X.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.


'learners'

Indices of weak learners in the ensemble ranging from 1 to ens.NumTrained. loss uses only these learners for calculating loss.

Default: 1:NumTrained

'lossfun'

Function handle or string representing a loss function. The built-in loss functions are 'binodeviance', 'classiferror', 'exponential', 'hinge', and 'mincost', described in Loss Functions.

You can write your own loss function in the syntax described in Loss Functions.

Default: 'classiferror'


'mode'

String representing the meaning of the output L:

  • 'ensemble' — L is a scalar value, the loss for the entire ensemble.

  • 'individual' — L is a vector with one element per trained learner.

  • 'cumulative' — L is a vector in which element J is obtained by using learners 1:J from the input list of learners.

Default: 'ensemble'
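To make the three shapes of L concrete, the following is a rough plain-Python stand-in, not the MATLAB implementation: the real loss aggregates weighted learner scores, while this sketch substitutes a simple majority vote over 0/1 error indicators purely to illustrate what 'ensemble', 'individual', and 'cumulative' report.

```python
# err[j][i] is 1 if hypothetical learner j misclassifies observation i.
def mode_losses(err):
    T = len(err)                 # number of weak learners
    N = len(err[0])              # number of observations
    # 'individual': one loss per trained learner
    individual = [sum(row) / N for row in err]
    # 'cumulative': element J uses learners 1:J (majority-vote stand-in)
    cumulative = []
    for J in range(1, T + 1):
        wrong = [sum(err[j][i] for j in range(J)) for i in range(N)]
        cumulative.append(sum(v > J / 2 for v in wrong) / N)
    # 'ensemble': a single scalar for the whole ensemble
    ensemble = cumulative[-1]
    return ensemble, individual, cumulative

e, ind, cum = mode_losses([[1, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0]])
# e == 0.25; ind == [0.25, 0.0, 0.5]; cum == [0.25, 0.0, 0.25]
```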


'UseObsForLearner'

A logical matrix of size N-by-T, where:

  • N is the number of rows of X.

  • T is the number of weak learners in ens.

When UseObsForLearner(i,j) is true, learner j is used in predicting the class of row i of X.

Default: true(N,T)
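The effect of such a mask can be sketched in plain Python; masked_votes is a hypothetical name and uses a simple plurality vote, not the ensemble's actual weighted aggregation:

```python
def masked_votes(preds, use):
    """preds[j][i]: class predicted by learner j for observation i.
    use[i][j]: True if learner j may vote for observation i."""
    N, T = len(use), len(use[0])
    out = []
    for i in range(N):
        # keep only the learners permitted for this observation
        votes = [preds[j][i] for j in range(T) if use[i][j]]
        # plurality vote among the permitted learners only
        out.append(max(set(votes), key=votes.count))
    return out

# Three learners, two observations; learner 3 is excluded for row 1:
labels = masked_votes([['a', 'b'], ['a', 'a'], ['b', 'b']],
                      [[True, True, False], [True, True, True]])
# labels == ['a', 'b']
```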


'weights'

Vector of observation weights, with nonnegative entries. The length of weights must equal the number of rows in X. When you specify weights, loss normalizes them so that the observation weights in each class sum to the prior probability of that class.

Default: ones(size(X,1),1)

Output Arguments


Loss, by default the fraction of misclassified data. L can be a vector, and its meaning depends on the name-value pair settings, in particular 'mode'.


Classification Error

The default classification error is the fraction of the data X that ens misclassifies, where Y are the true classifications.

Weighted classification error is the sum of weight i times the Boolean value that is 1 when ens misclassifies the ith row of X, divided by the sum of the weights.
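This definition translates directly into a few lines. The sketch below is plain Python rather than MATLAB, and weighted_classiferror is a hypothetical name, not part of the toolbox:

```python
def weighted_classiferror(y_true, y_pred, w):
    """Sum of w_i over misclassified rows, divided by the total weight."""
    bad = sum(wi for yt, yp, wi in zip(y_true, y_pred, w) if yt != yp)
    return bad / sum(w)

L = weighted_classiferror(['g', 'b', 'g'], ['g', 'g', 'g'],
                          [1.0, 2.0, 1.0])
# L == 2.0 / 4.0 == 0.5: only the weight-2 observation is misclassified
```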

Loss Functions

The built-in loss functions are:

  • 'binodeviance' — For binary classification, assume the classes yn are -1 and 1. With the weight vector w normalized to sum to 1, and f(Xn) the prediction for row n of the data X, the binomial deviance is

    $\sum_n w_n \log\left(1 + \exp\left[-2 y_n f(X_n)\right]\right)$

  • 'classiferror' — Fraction of misclassified data, weighted by w.

  • 'exponential' — With the same definitions as for 'binodeviance', the exponential loss is

    $\sum_n w_n \exp\left[-y_n f(X_n)\right]$

  • 'hinge' — Classification error measure that has the form

    $\frac{\sum_{j=1}^n w_j \max\{0,\, 1 - y_j' f(X_j)\}}{\sum_{j=1}^n w_j}$

    where:
    • wj is weight j.

    • For binary classification, yj = 1 for the positive class and -1 for the negative class. For problems where the number of classes K ≥ 3, yj is a vector of 0s, but with a 1 in the position corresponding to the true class, e.g., if the second observation is in the third class and K = 4, then y2 = [0 0 1 0]′.

    • f(Xj) is, for binary classification, the posterior probability or, for K ≥ 3, a vector of posterior probabilities for each class given observation j.

  • 'mincost' — Predict the label with the smallest expected misclassification cost, with expectation taken over the posterior probability, and cost as given by the Cost property of the classifier (a matrix). The loss is then the true misclassification cost averaged over the observations.
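The margin-based formulas above can be checked numerically. The following sketch is plain Python rather than MATLAB, assumes binary classification with yn in {-1, +1}, and uses hypothetical function names:

```python
import math

def binodeviance(y, f, w):
    # sum over n of w_n * log(1 + exp(-2 y_n f(X_n))), w summing to 1
    return sum(wn * math.log(1 + math.exp(-2 * yn * fn))
               for yn, fn, wn in zip(y, f, w))

def exponential_loss(y, f, w):
    # sum over n of w_n * exp(-y_n f(X_n))
    return sum(wn * math.exp(-yn * fn) for yn, fn, wn in zip(y, f, w))

def hinge(y, f, w):
    # sum over j of w_j * max(0, 1 - y_j f(X_j)), over the total weight
    return (sum(wn * max(0, 1 - yn * fn) for yn, fn, wn in zip(y, f, w))
            / sum(w))

# Confident, correct predictions drive the losses toward zero:
hinge([1, -1], [2.0, -2.0], [0.5, 0.5])   # == 0
```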

To write your own loss function, create a function file of the form

function loss = lossfun(C,S,W,COST)

where:

  • N is the number of observations (rows of X or tbl).

  • K is the number of classes in ens, represented in ens.ClassNames.

  • C is an N-by-K logical matrix, with one true per row for the true class. The index for each class is its position in ens.ClassNames.

  • S is an N-by-K numeric matrix. S is a matrix of posterior probabilities for classes with one row per observation, similar to the score output from predict.

  • W is a numeric vector with N elements, the observation weights.

  • COST is a K-by-K numeric matrix of misclassification costs. The default, used by 'classiferror', is a cost of 0 for correct classification and 1 for misclassification, that is, COST = ones(K) - eye(K).

  • The output loss should be a scalar.

Pass the function handle @lossfun as the value of the lossfun name-value pair.
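As an illustration of the contract above (C, S, W, COST in, a scalar out), here is a plain-Python transcription rather than MATLAB; this hypothetical lossfun reproduces the average misclassification cost of the max-score class:

```python
def lossfun(C, S, W, COST):
    """C: N-by-K one-hot truth; S: N-by-K scores; W: N weights;
    COST: K-by-K misclassification costs. Returns a scalar loss."""
    total = 0.0
    for Ci, Si, Wi in zip(C, S, W):
        true_k = Ci.index(True)         # one true per row
        pred_k = Si.index(max(Si))      # predict the max-score class
        total += Wi * COST[true_k][pred_k]
    return total / sum(W)

C = [[True, False], [False, True]]      # true classes: 1st, then 2nd
S = [[0.9, 0.1], [0.4, 0.6]]            # posterior-like scores
W = [1.0, 1.0]
COST = [[0, 1], [1, 0]]                 # 0-1 cost matrix
# lossfun(C, S, W, COST) == 0.0: both rows are predicted correctly
```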


Create a compact classification ensemble for the ionosphere data, and find the fraction of training data that the ensemble misclassifies:

load ionosphere
ada = fitensemble(X,Y,'AdaBoostM1',100,'tree');
adb = compact(ada);
L = loss(adb,X,Y)

L =

See Also

