Thanks a lot for sharing the code. May I ask a simple question?
As the ICML paper points out, in a PR curve precision does not necessarily change linearly with recall. Hence, the ROC curve is constructed first, and the PR curve is then inferred from the ROC curve.
Could you please confirm whether the provided code does something similar?
Thanks a lot
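For readers following this question: once the class counts are known, each ROC point maps to exactly one PR point, which is what makes it possible to derive a PR curve from a ROC curve. Below is a minimal Python sketch of that mapping. It is not the toolbox's actual code, and the function name `roc_to_pr` and the counts `n_pos` and `n_neg` are assumptions made for illustration.

```python
def roc_to_pr(roc_points, n_pos, n_neg):
    """Convert ROC points (fpr, tpr) into PR points (recall, precision).

    n_pos and n_neg are the numbers of positive and negative examples
    (hypothetical parameter names, chosen for this sketch).
    """
    pr_points = []
    for fpr, tpr in roc_points:
        tp = tpr * n_pos  # true positives at this operating point
        fp = fpr * n_neg  # false positives at this operating point
        if tp + fp == 0:
            # Nothing is predicted positive, so precision is undefined; skip.
            continue
        # Recall equals the true positive rate.
        pr_points.append((tpr, tp / (tp + fp)))
    return pr_points


# Example with made-up counts: 10 positives and 90 negatives.
points = roc_to_pr([(0.0, 0.0), (0.1, 0.5), (1.0, 1.0)], n_pos=10, n_neg=90)
```

Because precision depends on the ratio of positives to negatives, the same ROC curve yields different PR curves for different class skews.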
I used this toolbox successfully with MATLAB R2013a. It is very helpful to have the book referenced in the description above. My application was classifying sounds into one of several categories with a trained neural network. Here are a few notes from that application:
Activation Functions Investigated
Linear – simplest, gives good results
Softmax – best general purpose for 1 of N classification
Logistic – good for binary classifications
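For reference, the three output activations listed above can be sketched in plain Python as follows. This is an illustrative sketch using the standard definitions, not the toolbox's implementation; the function names and the sample activations are assumptions.

```python
import math


def linear(a):
    """Identity output: returns the activations unchanged."""
    return list(a)


def logistic(a):
    """Element-wise logistic sigmoid, mapping each activation into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in a]


def softmax(a):
    """Softmax over the output units; results are positive and sum to 1."""
    m = max(a)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]


# Four hypothetical output activations, one per class.
probs = softmax([2.0, 1.0, 0.1, -1.0])
```

Softmax outputs can be read as class probabilities for 1-of-N problems, while the logistic function treats each output independently, which is why it suits binary targets.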
Training Algorithms Investigated
Conjugate gradient descent – worst performing method
Scaled conjugate gradient descent (SCG) – sometimes superior
Quasi-Newton – gives most consistent results for current data set
Search for best number of hidden units
A smaller number runs faster and gives a simpler model
A larger number may give more accurate results, at the risk of over-fitting the available data
The current data set, with 4 possible sound classifications, gave the best result with about 15 hidden units
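One way to automate the search described above is to score each candidate size on held-out data and keep the smallest size that comes close to the best score. The Python sketch below shows that idea only; it is not how the results above were obtained. The helper `pick_hidden_units`, the `tolerance` parameter, and the placeholder scorer with its made-up numbers are all assumptions.

```python
def pick_hidden_units(candidates, validation_score, tolerance=0.01):
    """Return the smallest candidate whose validation score is within
    `tolerance` of the best score found.

    validation_score(n) is supplied by the caller: it should train a network
    with n hidden units and return its accuracy on held-out data.
    """
    scores = {n: validation_score(n) for n in candidates}
    best = max(scores.values())
    return min(n for n, s in scores.items() if s >= best - tolerance)


# Placeholder scorer with made-up numbers, only to show the call.
# Replace it with real training and validation on your own data.
def toy_score(n):
    return {2: 0.70, 4: 0.82, 8: 0.90, 16: 0.905, 32: 0.89}[n]


chosen = pick_hidden_units([2, 4, 8, 16, 32], toy_score)
```

Preferring the smallest size within a tolerance reflects the trade-off noted above: fewer hidden units train faster and are less prone to over-fitting.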
I also tried using a support vector machine for the same application and it performed slightly better.