File Exchange

version (5.37 MB) by Barnan Das
Implementation of SMOTEBoost algorithm used to handle class imbalance problem in data.


Updated 26 Jun 2012

This code implements SMOTEBoost, an algorithm for handling class imbalance in data with discrete class labels. It combines SMOTE with the standard boosting procedure AdaBoost to model the minority class better: the learner receives not only the minority class examples that were misclassified in the previous boosting iteration but also a broader representation of those instances, generated by SMOTE.

Because boosting algorithms give equal weight to all misclassified examples and sample from a pool of data that consists predominantly of the majority class, subsequent sampling of the training set remains skewed towards the majority class. To reduce the bias that class imbalance introduces into the learning procedure and to increase the sampling weights of the minority class, SMOTE is applied at each round of boosting. This increases the number of minority class samples available to the learner and focuses the distribution on these cases at each boosting round. In addition to maximizing the margin for the skewed class dataset, this procedure also increases the diversity among the classifiers in the ensemble, because a different set of synthetic samples is generated at each iteration.
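The idea of injecting fresh synthetic minority samples at every boosting round can be illustrated with a minimal Python sketch. It is not the code shipped in this submission: the function names `smote` and `training_sets_per_round`, the tuple-based data layout, and the default `k=5` are assumptions made for illustration, and the weak learner and the AdaBoost weight update are deliberately left out.

```python
import math
import random


def smote(minority, n_synthetic, k=5, rng=None):
    """Create n_synthetic points by interpolating between minority neighbours."""
    rng = rng or random.Random()
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Random position on the segment from base to neighbour
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def training_sets_per_round(majority, minority, rounds, n_synthetic, seed=0):
    """Yield one augmented training set per boosting round.

    Each round draws a new batch of synthetic minority samples, so every
    weak learner would see a slightly different training set.
    """
    rng = random.Random(seed)
    for _ in range(rounds):
        yield majority + minority + smote(minority, n_synthetic, rng=rng)
```

Because each synthetic point lies on a line segment between two real minority samples, the augmented set stays inside the region the minority class already occupies, while the per-round randomness supplies the ensemble diversity described above.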

For a more detailed theoretical description of the algorithm, please refer to the following paper:
N.V. Chawla, A. Lazarevic, L.O. Hall, K. Bowyer, "SMOTEBoost: Improving Prediction of the Minority Class in Boosting," in Knowledge Discovery in Databases: PKDD 2003.

The current implementation of SMOTEBoost was written independently by the author
for research purposes. To let users choose among several different weak learners
for boosting, it includes an interface to the Weka API. Currently, four Weka
algorithms can be used as the weak learner: J48, SMO, IBk, and Logistic.

Cite As

Barnan Das (2021). SMOTEBoost, MATLAB Central File Exchange.

Comments and Ratings (8)

li sheng

I get the same error: there is no class named SMOTE. How can I solve this problem?

Ahmad Obeid

Thank you for the upload.
The function doesn't seem to find a class named SMOTE.class and raises the following error:

No class weka.filters.supervised.instance.SMOTE can be located on the Java class path.

when calling the following line in SMOTEBoost.m

smote = javaObject('weka.filters.supervised.instance.SMOTE');

I opened weka.jar and followed the path from weka to filters to supervised and so on, and indeed there was no file named SMOTE.class.

Any ideas how this can be resolved?

bejoy abraham

Han Yan

albert wang

From this website, you can find new code for converting .mat, .txt and .csv files to the ARFF format.

If you get the above Java error, you need to reinstall the Weka machine learning toolbox:

Once it is installed, replace the weka.jar in the SMOTE directory with the one from the new installation.
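
Before swapping jars, it can help to confirm that the replacement weka.jar actually contains the SMOTE filter class, since a jar file is an ordinary zip archive. The following is a minimal Python sketch and not part of the SMOTEBoost package; the helper name `jar_has_class` is hypothetical, and the jar path in the usage note is a placeholder.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar holds the compiled class for class_name."""
    # A fully qualified class name maps to a path inside the archive
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, `jar_has_class("weka.jar", "weka.filters.supervised.instance.SMOTE")` returns False for a jar that would trigger the "No class ... can be located on the Java class path" error.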

ben Glampson


I'm really interested in using SMOTEboost for some research I'm working on. However, after running the file test.m I get the following error:

Error using SMOTEBoost (line 122)
Java exception occurred:
java.lang.IllegalArgumentException: Comparison method violates its general contract!

    at java.util.TimSort.mergeLo(Unknown Source)
    at java.util.TimSort.mergeAt(Unknown Source)
    at java.util.TimSort.mergeCollapse(Unknown Source)
    at java.util.TimSort.sort(Unknown Source)
    at java.util.TimSort.sort(Unknown Source)
    at java.util.Arrays.sort(Unknown Source)
    at java.util.Collections.sort(Unknown Source)
    at weka.filters.supervised.instance.SMOTE.doSMOTE(
    at weka.filters.supervised.instance.SMOTE.batchFinished(
    at weka.filters.Filter.useFilter(

Error in Test (line 34)
prediction = SMOTEBoost(train_data,test_data,'tree',false);

Any help would be most welcome.



Alex Kong

I have a question about this implementation: in every iteration, you don't generate new synthetic samples using SMOTE; you just re-assign the weights by averaging over the positive instance set and the negative instance set. Could you please explain why?
In the SMOTEBoost paper, the authors didn't say how to assign weights to the newly generated samples. Do you have any idea about this?

MATLAB Release Compatibility
Created with R2011a
Compatible with any release
Platform Compatibility
Windows macOS Linux
