Hi!
I have imbalanced data to classify, so the mse performance function is not suitable, and neither are mae, sae, or sse.
So I would like to create a new performance function based on sensitivity and specificity, but I have not found any way to write one.
The only thing I found is "template_performance", but it is obsolete in MATLAB 2012 and, in any case, I don't understand how to work with it.
So, please, could you provide me with an example or a tutorial?
Thanks in advance.
I have never had reliable results with an MLP when the training priors differed by more than a factor of 2.
If you cannot oversample the underrepresented class, then undersample the overrepresented class (with each subsample no larger than twice the size of the smaller class).
A good way to subsample the larger class is to cluster it into multiple localized subsets that are subsequently randomly sampled.
Combine the results of multiple independently trained nets in either an ensemble (combine probability estimates) or a committee (combine classification votes).
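The undersampling and combination steps above can be sketched as follows. This is only an illustration in plain Python, not MATLAB toolbox code: the function names are hypothetical, the larger class is partitioned purely at random (no clustering step), and the per-net probability estimates are assumed to come from nets trained elsewhere.

```python
import random

def balanced_subsets(majority, minority, ratio=2, seed=0):
    """Partition the larger class at random into disjoint subsets, each at most
    ratio * len(minority) samples, and pair each with the full smaller class.
    (Assumption: plain random partition; a cluster-based split is not shown.)"""
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    chunk = ratio * len(minority)
    return [(shuffled[i:i + chunk], minority)
            for i in range(0, len(shuffled), chunk)]

def ensemble_predict(probs_per_net, threshold=0.5):
    """Ensemble: average the nets' probability estimates, then threshold once."""
    mean = sum(probs_per_net) / len(probs_per_net)
    return int(mean >= threshold)

def committee_predict(probs_per_net, threshold=0.5):
    """Committee: threshold each net separately, then take a majority vote."""
    votes = sum(p >= threshold for p in probs_per_net)
    return int(2 * votes > len(probs_per_net))
```

The two combination rules can disagree: for estimates 0.9, 0.4 and 0.4 the average is above 0.5, so the ensemble says class 1, while only one of three nets votes for class 1, so the committee says class 0.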
I have never had reliable results with an MLP using the noncontinuous misclassification error as a direct minimization goal.
Minimize MSE or weighted MSE for 0 or 1 targets.
Vary the MSE error weights until you can get approximately equal MSEs for both classes.
Use a holdout validation set and a varying threshold from 0 to 1 in order to get your operating curve.
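A rough sketch of the quantities involved, in plain Python rather than MATLAB: a class-weighted MSE for 0/1 targets, the per-class MSEs you would compare while tuning the weights, and a threshold sweep that returns sensitivity and specificity at each operating point. All names and the example data are hypothetical, and the outputs are assumed to be the net's responses on the holdout validation set.

```python
def weighted_mse(targets, outputs, w_pos, w_neg):
    """MSE with one error weight per class, for 0/1 targets."""
    total = 0.0
    for t, y in zip(targets, outputs):
        w = w_pos if t == 1 else w_neg
        total += w * (t - y) ** 2
    return total / len(targets)

def per_class_mse(targets, outputs):
    """Return (MSE over class-1 samples, MSE over class-0 samples)."""
    pos = [(t - y) ** 2 for t, y in zip(targets, outputs) if t == 1]
    neg = [(t - y) ** 2 for t, y in zip(targets, outputs) if t == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)

def operating_curve(targets, outputs, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold."""
    n_pos = sum(targets)
    n_neg = len(targets) - n_pos
    curve = []
    for th in thresholds:
        tp = sum(1 for t, y in zip(targets, outputs) if t == 1 and y >= th)
        tn = sum(1 for t, y in zip(targets, outputs) if t == 0 and y < th)
        curve.append((th, tp / n_pos, tn / n_neg))
    return curve

# Hypothetical validation targets and net outputs.
targets = [1, 1, 0, 0, 0, 0]
outputs = [0.8, 0.4, 0.6, 0.3, 0.2, 0.1]
curve = operating_curve(targets, outputs, [i / 10 for i in range(11)])
```

Sensitivity is the fraction of class-1 samples scored at or above the threshold, and specificity is the fraction of class-0 samples scored below it, so the curve gives the trade-off between the two without needing a custom training objective.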
Hope this helps.
Thank you for formally accepting my answer.
Greg
I have written about the unbalanced classification problem many times.
Try searching comp.ai.neural-nets and the CSSM newsgroup for
heath unbalanced
Stop laughing.
The quickest solution is to duplicate vectors in the smaller classes so that all classes have equal sizes.
Then, for c classes, use the columns of the c-dimensional unit matrix as targets.
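A minimal sketch of those two steps, written in plain Python with hypothetical function names (this is not toolbox code): the smaller classes are repeated until every class matches the largest one, and each class label is mapped to a column of the unit matrix.

```python
def oversample_by_duplication(samples_by_class):
    """Repeat the samples of smaller classes until all classes have equal size."""
    target_size = max(len(s) for s in samples_by_class)
    balanced = []
    for samples in samples_by_class:
        reps, extra = divmod(target_size, len(samples))
        # Whole copies first, then a partial copy to reach the target size.
        balanced.append(samples * reps + samples[:extra])
    return balanced

def one_hot_targets(labels, n_classes):
    """Map each class index to the matching column of the unit matrix."""
    return [[1 if k == label else 0 for k in range(n_classes)]
            for label in labels]

# Hypothetical two-class data: 5 samples in one class, 2 in the other.
balanced = oversample_by_duplication([[1, 2, 3, 4, 5], [10, 20]])
# balanced == [[1, 2, 3, 4, 5], [10, 20, 10, 20, 10]]
targets = one_hot_targets([0, 2], 3)
# targets == [[1, 0, 0], [0, 0, 1]]
```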
Hope this helps.
Thank you for formally accepting my answer.
Greg
Hi Greg,
First, sorry for not answering earlier; I had issues with my internet connection.
Thanks for your answer, but I really want to change my performance function because over-sampling is not the most appropriate approach for my work.
I must use a performance function based on sensitivity and specificity.
Thanks again,
CHC