Crossvalidation: anonymous function handle with toolbox classifiers
Hi everyone,
I'd like to use the MATLAB cross-validation function (crossval) with a random forest classification toolbox (specifically http://code.google.com/p/randomforest-matlab/). According to the documentation (http://www.mathworks.com/help/toolbox/stats/crossval.html), predfun must be a function that returns the predictions for a set of test data XTEST. So, following that syntax, I should pass a function like this:
classf= @(XTRAIN,ytrain,XTEST) classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000));
This function trains the model on XTRAIN and ytrain and then predicts labels for XTEST. The problem comes when I try to run the cross-validation: I get the following error message.
cvMCR = crossval('mcr',X,y,'predfun',classf)
Error using crossval>evalFun (line 465)
The function
'@(XTRAIN,ytrain,XTEST)classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000))'
generated the following error:
Cannot concatenate a double array and a nominal array.
Error in crossval>getLossVal (line 502)
funResult = evalFun(funorStr,arg(1:end-1));
Error in crossval (line 401)
[funResult,outarg] = getLossVal(i, nData, cvp, data, predfun);
I'd really appreciate any help.
Regards!
Answers (4)
Ilya
on 26 Apr 2012
I think you've hit a bug in the crossval function. My guess is that classRF_predict returns numeric labels, and crossval does not process them correctly for the 'mcr' criterion. The workaround is to convert class labels returned by classRF_predict to the nominal type:
classf= @(XTRAIN,ytrain,XTEST) nominal(classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000)));
and execute the call to crossval in the same way as before
cvMCR = crossval('mcr',X,y,'predfun',classf)
Alternatively, you could use the other signature for crossval
vals = crossval(fun,X,y)
and define
fun = @(Xtrain,Ytrain,Xtest,Ytest) mean(Ytest ~= classRF_predict(Xtest,classRF_train(Xtrain,Ytrain,1000)));
In this case, since you are comparing the true and predicted labels yourself, you can keep them numeric.
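Here is a minimal sketch of that second signature. Since the randomforest-matlab package may not be on your path, it uses classify from Statistics Toolbox and the fisheriris sample data as stand-ins (both are my assumptions for illustration, not part of the original question); you would substitute the classRF_predict/classRF_train call shown above. The labels stay numeric throughout.

```matlab
% Sketch: crossval with a user-defined loss, keeping labels numeric.
% classify and fisheriris are stand-ins for the random forest code and data.
load fisheriris
X = meas;
y = grp2idx(species);            % numeric class labels 1..3

% fun receives the training subsets first, then the test subsets,
% and returns the misclassification rate on the test fold.
fun = @(Xtrain, Ytrain, Xtest, Ytest) ...
    mean(Ytest ~= classify(Xtest, Xtrain, Ytrain));

vals = crossval(fun, X, y);      % one row per fold (10 folds by default)
cvErr = mean(vals);              % average error over the folds
fprintf('Estimated misclassification rate: %.3f\n', cvErr);
```

Note that vals holds one value per fold, so you need to average it yourself to get a single number comparable to the 'mcr' output.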
Let me know if either solution works for you.
Ilya
on 26 Apr 2012
I am not an expert on the randomforest-matlab package, so my advice could be off. I find two things in your post worth investigating:
- It is strange that you use Xtest as the 1st input to classRF_predict(XTEST,classRF_train(XTRAIN,ytrain,1000)). Usually it is the trained object that is the 1st argument.
- Make sure that the array of class labels, y, you pass to crossval has the same type as labels returned by classRF_predict.
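One way to check the second point is to run your prediction function by hand on a single fold and compare the label types. The sketch below again uses classify and fisheriris as stand-ins (assumptions on my part); replace predfun with your own handle.

```matlab
% Sketch: compare the type of the true labels with the type of the
% predicted labels on one fold. classify is a stand-in classifier.
load fisheriris
X = meas;
y = grp2idx(species);                  % numeric labels

cp = cvpartition(y, 'kfold', 5);       % stratified 5-fold partition
trIdx = training(cp, 1);               % logical index of fold-1 training rows
teIdx = test(cp, 1);                   % logical index of fold-1 test rows

predfun = @(Xtr, ytr, Xte) classify(Xte, Xtr, ytr);
yhat = predfun(X(trIdx, :), y(trIdx), X(teIdx, :));

fprintf('class(y) = %s, class(yhat) = %s\n', class(y), class(yhat));
sameType = strcmp(class(y), class(yhat));
```

If sameType is false, convert one side so that both have the same type before passing the handle to crossval.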
Cristobal
on 26 Apr 2012
1 Comment
Ilya
on 27 Apr 2012
Did you see my answer above?
You can modify crossval if you'd like, but in that case do
temploss = sum(outarg ~= nominal(funResult));
That way you can continue using crossval with labels of all types. After what you did, you can only use crossval with handles that return labels of type double.
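To illustrate the comparison in isolation, here is a small sketch with made-up labels. The variable names ytrue and ypred are hypothetical; they stand in for outarg and funResult inside crossval, and I am assuming outarg is a nominal array there, as the error message suggests.

```matlab
% Sketch: count mismatches between nominal true labels and numeric
% predictions by converting the predictions to nominal first.
ytrue = nominal([1; 2; 2; 3]);   % stands in for outarg (nominal)
ypred = [1; 2; 3; 3];            % stands in for funResult (double)

% nominal(ypred) creates levels labeled '1', '2', '3', which match the
% level labels of ytrue, so the element-wise comparison is valid.
nWrong = sum(ytrue ~= nominal(ypred));
fprintf('Number of misclassified observations: %d\n', nWrong);
```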