Score transform for RUSBoost in fitcensemble

the cyclist on 12 Apr 2024 at 15:57
Edited: the cyclist on 25 Apr 2024 at 15:32
The documentation for the predict function of fitcensemble lists the score transforms (to convert scores to probabilities) for the following model methods:
  • Bag (none)
  • AdaBoostM1 (doublelogit)
  • GentleBoost (doublelogit)
  • LogitBoost (doublelogit)
But it does not list a score transform for several other possible fitcensemble methods. Are these documented somewhere else?
I'm currently most interested in RUSBoost, because that is the method I am using.
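For context, here is a minimal sketch of what I mean; it just inspects the fitted ensemble's ScoreTransform property (using the ionosphere example data as a stand-in for my own):
load ionosphere                                  % example data: X (predictors), Y (labels)
mdl = fitcensemble(X, Y, 'Method', 'RUSBoost');
mdl.ScoreTransform                               % 'none' by default, so raw scores are returned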

Answers (1)

Sarthak on 24 Apr 2024 at 10:36
Hey,
RUSBoost is a boosting algorithm that uses random undersampling, which is effective when the classes are imbalanced. The actual boosting mechanism used under the hood is still the multiclass AdaBoost mechanism, as mentioned in the MathWorks documentation.
If we don't have a binary classification problem, the logit and doublelogit score transforms do not convert the ensemble scores into valid probabilities: each score is mapped into the 0-1 range independently, so the transformed scores for the K classes need not sum to 1. The operations for these score transforms are given as
out = 1./(1+exp(-in));    % logit
out = 1./(1+exp(-2*in));  % doublelogit
That is why using doublelogit as the ScoreTransform is suggested only for binary classification models.
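As a quick illustration of that point (with made-up scores for a hypothetical 3-class problem), applying doublelogit elementwise does not produce rows that sum to 1:
s = [2.0 -1.0 -1.0; 0.5 0.5 -1.0];  % hypothetical scores, one row per observation
p = 1./(1+exp(-2*s));               % doublelogit applied elementwise
sum(p, 2)                           % rows do not sum to 1, so these are not class probabilities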
Now, if you look at the multiclass AdaBoost paper (https://hastie.su.domains/Papers/samme.pdf), in particular Eq. 4 and the equation right below it, you can see how the scores can be converted to probabilities: we can use the softmax function. The way it works is that we pass s_i/(K-1) into the softmax function, where K is the number of classes and s_i is the score corresponding to the ith class, with i = 1,...,K. For your convenience, I've written this function:
function out = mysoftmax(in)
% Convert multiclass ensemble scores to probabilities via softmax,
% following Eq. 4 of the SAMME paper. The number of columns of in
% is the number of classes.
K = size(in,2);
in = (1/(K-1)).*in;                  % scale the scores by 1/(K-1)
inmax = max(in,[],2);                % per-row maximum, for numerical stability
in = in-inmax;
in = bsxfun(@minus,in,inmax);
numerator = exp(in);
denominator = sum(numerator,2);
denominator(denominator == 0) = 1;   % guard against division by zero
out = numerator./denominator;
end
After training the ensemble, you can save this function in a MATLAB file (mysoftmax.m) on the path and then set the model's ScoreTransform to it (note that ScoreTransform can be set to a function handle):
mdl.ScoreTransform = @mysoftmax;
Once you do that and call predict, the scores will be normalized to lie between 0 and 1, using the conditional probability formulation provided in the aforementioned paper.
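For example, here is a minimal end-to-end sketch using the fisheriris example data (3 classes; your own data and training options will of course differ):
load fisheriris                                   % example data: meas (predictors), species (labels)
mdl = fitcensemble(meas, species, 'Method', 'RUSBoost');
mdl.ScoreTransform = @mysoftmax;                  % mysoftmax.m must be on the path
[labels, posterior] = predict(mdl, meas);         % rows of posterior now sum to 1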
I hope this helps!
  1 Comment
the cyclist on 25 Apr 2024 at 12:57
Edited: the cyclist on 25 Apr 2024 at 15:32
Thanks for the thoughtful answer. I'll need to spend some time to really get into the details, but I have a couple of questions that you might be able to answer more quickly!
I realize that the softmax function you wrote is more general, for multiclass classification, but does it collapse to either logit or double logit, for binary classification? I would guess double logit, since that is what AdaBoostM1 uses, but a quick glance at your function doesn't suggest to me that that is the case. (I'm doing binary classification, and RUSBoost happens to be chosen as the best model sometimes, which is why I need to know the score transform for it specifically.)
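To make that first question concrete, here is the kind of check I have in mind (assuming binary scores come out as s and -s per observation, which I haven't verified):
s = linspace(-3, 3, 7)';     % a few hypothetical class-1 scores
p1 = mysoftmax([s, -s]);     % column 1 is the class-1 value
p2 = 1./(1+exp(-2*s));       % doublelogit of the class-1 score
max(abs(p1(:,1) - p2))       % ~0 only if the two transforms agree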
Finally, I'm a bit confused by these two lines of your code:
in = in-inmax;
in = bsxfun(@minus,in,inmax);
Don't those lines do the same thing (in versions of MATLAB with implicit expansion)? Should there be only one of them?
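That is, I would expect something like the following to return true in R2016b and later:
A = magic(3);                          % any matrix
m = max(A, [], 2);                     % per-row maximum
isequal(A - m, bsxfun(@minus, A, m))   % true: implicit expansion matches bsxfun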
Thanks!


Release: R2024a
