"Sadik " <sadik.hava@gmail.com> wrote in message <hqvsqd$i5d$1@fred.mathworks.com>...
> Hi Mohammad,
>
> The following example from the documentation is very illustrative. I am going to explain it a bit for you to better understand it:
>
> 1. load fisheriris
> % MATLAB's own dataset. Basically, there are three species of iris flower: setosa, versicolor and virginica [these names are in the variable species], with 50 samples per species. The first fifty are setosa, the second fifty are versicolor and the third fifty are virginica.
> 2. x = meas(51:end,1:2);
> % If you load the data, you will see that meas is a 150x4 matrix: 150 samples with 4 features per sample. x = meas(51:end,1:2) selects the rows belonging to versicolor and virginica and keeps only the first 2 of the 4 features.
> 3. y = (1:100)'>50;
> % versicolor=0, virginica=1
> % 50 zeros and 50 ones. This means, versicolor will be represented by zeroes and virginica by ones in the glm.
> 4. b = glmfit(x,y,'binomial');
> % Obtain the model parameters.
> 5. p = glmval(b,x,'logit');
> % Using these parameters, compute the output of the classifier, i.e. the fitted probability of class 1 for each sample. This is what goes into "scores" in perfcurve.
> 6. [X,Y,T,AUC] = perfcurve(species(51:end,:),p,'virginica');
> % Now it is easy to see what "labels" is: it is simply the list of true labels in your data set. Since, after reduction, the dataset has 50 versicolors and 50 virginicas, "labels" is a cell array whose first 50 elements are equal to the string 'versicolor' and whose last 50 are equal to 'virginica'.
> % The last input to perfcurve, "posclass", is the label of the positive class. If you look at step 3 above, we are assigning 1 to the second fifty, which is virginica. Therefore, the label of the positive class is 'virginica'.
> plot(X,Y)
> xlabel('False positive rate'); ylabel('True positive rate')
> title('ROC for classification by logistic regression')
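>
> To make the role of "labels" and "scores" concrete, here is a minimal hand-computed sketch of a single ROC point. The six labels and scores and the threshold t are made-up illustration values, not taken from fisheriris, and the sketch assumes a sample is called positive when its score is at or above the threshold. perfcurve repeats this kind of count over many thresholds.
>
> ```matlab
> % Made-up data for illustration only: 1 = positive class, 0 = negative class.
> labels = [0 0 1 0 1 1]';
> scores = [0.1 0.4 0.35 0.8 0.7 0.9]';
> t = 0.5;                                        % one candidate threshold (assumed)
> pred = scores >= t;                             % predicted positive at this threshold
> tpr = sum(pred & labels==1) / sum(labels==1);   % true positive rate
> fpr = sum(pred & labels==0) / sum(labels==0);   % false positive rate
> ```
>
> Each threshold gives one (fpr, tpr) pair; plotting the pairs for all thresholds traces the ROC curve.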
>
> Best.
Thank you, Sadik, for your reply.
I think I am missing something here: you did not use any separate training and testing data. For example:
load ionodata % ionosphere dataset has A for data and groups for their labels
indices = crossvalind('Kfold',groups,3);
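i = 1; % fold to hold out for testing (in practice, loop over i = 1:3)
test = (indices == i); train = ~test;
svmStruct = svmtrain(A(train,:),groups(train)); % select rows (samples), keep all columns
classes = svmclassify(svmStruct,A(test,:));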
I am stuck here: what kind of data do I have to pass to the perfcurve function?
pos = 0; % label of the positive class
[X,Y,T,AUC] = perfcurve(labels,scores,pos);
Are the labels and scores here the true labels and classifier confidences for the training data, for the test data, or for the whole dataset?
Thank you in advance.
