Understanding MATLAB's built-in SVM cross-validation on fitcsvm

Question

Carlos Mendoza on 30 Aug 2020

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/586469-understanding-matlab-s-built-in-svm-cross-validation-on-fitcsvm

Commented: Xingwang Yong on 3 Oct 2020

svm_crossval_data.mat

I have a dataset of 53 trials and I want to do leave-one-out cross-validation of a binary classifier. I tried to explicitly do the cross-validation of an SVM, with this code:

SVM_params = {'KernelFunction', 'linear', 'Standardize', true, ...
    'BoxConstraint', 0.046125, 'ClassNames', class_names};
n_trials = size(input_data, 1);  % 53 trials, one per row
SVMModel = cell(n_trials, 1);
estimated_labels = cell(size(true_labels));  % preallocate, same shape as true_labels
for i_trial = 1:n_trials
    % Train on every trial except i_trial
    train_set_indices = [1:i_trial-1 i_trial+1:n_trials];
    SVMModel{i_trial} = fitcsvm(input_data(train_set_indices, :), ...
        true_labels(train_set_indices), SVM_params{:});
    % Predict the left-out trial
    estimated_labels(i_trial) = predict(SVMModel{i_trial}, ...
        input_data(i_trial, :));
end
error_count = sum(~strcmp(true_labels, estimated_labels));
class_error = error_count / n_trials;

which gives class_error = 0.4151.

However, if I try MATLAB's built-in SVM cross-validation:

SVM_params = {'KernelFunction', 'linear', 'Standardize', true, ...
    'Leaveout', 'on', 'BoxConstraint',  0.046125, 'ClassNames', class_names};
CSVM = fitcsvm(input_data, true_labels, SVM_params{:});

CSVM.kfoldLoss is equal to 0.3208. Why the difference? What am I doing wrong in my explicit cross-validation?

I repeated the exercise with 'Standardize' turned off and 'KernelScale' set to 987.8107 (optimized hyperparameters), and the difference is even more dramatic: class_error = 0.4528, while CSVM.kfoldLoss = 0.

Finally, I would also like to know which training and validation sets were used for each of the trained models in CSVM.Trained. I would like to call predict on each trained model with its left-out sample (trial) and compare the result with CSVM.kfoldPredict.

Update 1: I found that c.training and c.test return the indices of the training and test sets. However, this code

SVM_params = {'KernelFunction', 'linear', 'Standardize', true, 'CVPartition', c,...
    'BoxConstraint', BoxConstraint, 'ClassNames', class_names};
estimated_labels = cell(1,53);
CSVM = fitcsvm(input_data, true_labels, SVM_params{:});
for ii=1:53
    estimated_labels(ii) = predict(CSVM.Trained{ii}, input_data(c.test(ii),:,1));
end
error_count = sum(~strcmp(true_labels, estimated_labels));
class_error = error_count / n_trials;

gives me class_error=0.5849, which is different to CSVM.kfoldLoss (0.3208). Why the difference? Is this the right way to double-check the cross-validation?
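One way to double-check the built-in cross-validation is sketched below. It is a minimal, self-contained example on synthetic data, not the attached dataset: the data, the seed, and the names manual_labels, builtin_labels, and manual_error are assumptions made for illustration. It relies on the Partition property of the cross-validated model and on the test method of cvpartition. The sketch does not assume that fold k holds out trial k, so it stores each prediction at the position of the observation that fold k actually held out rather than at position k.

```matlab
% Synthetic two-class data (hypothetical stand-in for input_data / true_labels).
rng(0);                                   % fixed seed so the run is repeatable
n_trials = 53;
input_data  = [randn(27, 3) + 0.5; randn(26, 3) - 0.5];
true_labels = [repmat({'A'}, 27, 1); repmat({'B'}, 26, 1)];
class_names = {'A', 'B'};

% Built-in leave-one-out cross-validation, same options as in the question.
CSVM = fitcsvm(input_data, true_labels, 'KernelFunction', 'linear', ...
    'Standardize', true, 'Leaveout', 'on', ...
    'BoxConstraint', 0.046125, 'ClassNames', class_names);

c = CSVM.Partition;                       % cvpartition stored with the model
manual_labels = cell(n_trials, 1);
for k = 1:c.NumTestSets
    test_idx = test(c, k);                % logical mask of the held-out trial
    % Store the prediction at the held-out trial's own position, not at k.
    manual_labels(test_idx) = predict(CSVM.Trained{k}, input_data(test_idx, :));
end

builtin_labels = kfoldPredict(CSVM);
manual_error   = mean(~strcmp(true_labels, manual_labels));
fprintf('labels agree: %d, manual error: %.4f, kfoldLoss: %.4f\n', ...
    isequal(manual_labels, builtin_labels), manual_error, kfoldLoss(CSVM));
```

If the manual labels and the kfoldPredict labels agree on a setup like this, a mismatch on the real data would point to how predictions are matched to trials rather than to the loss definition.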

Update 2: I attached the data.

Thanks!

2 Comments

Image Analyst on 31 Aug 2020

No answers probably because you forgot to attach your data.

Carlos Mendoza on 31 Aug 2020

I didn't forget. I thought that the code would be enough. Probably an error.

Answer 1

Xingwang Yong on 29 Sep 2020

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/586469-understanding-matlab-s-built-in-svm-cross-validation-on-fitcsvm#answer_502303

Maybe kfoldLoss uses a different definition of loss than yours. Your definition is 1-accuracy.

https://www.mathworks.com/help/stats/classreg.learning.partition.regressionpartitionedkernel.kfoldloss.html?s_tid=srchtitle

2 Comments

Carlos Mendoza on 1 Oct 2020

The default is 'classiferror', which is what I am using:

https://www.mathworks.com/help/stats/classificationpartitionedmodel.kfoldloss.html#bswic2v-2

What do you mean by "1-accuracy"?

Xingwang Yong on 3 Oct 2020

class_error = error_count / n_trials
            = (n_trials - correct_count) / n_trials
            = 1 - correct_count / n_trials
            = 1 - accuracy

That is your definition of loss.
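As a quick arithmetic check of this identity against the numbers quoted in the question, the sketch below converts each reported error rate back to a count of misclassified trials out of 53. The variable names are illustrative, and the reported values are rounded to four decimals, so the counts are recovered with round.

```matlab
n_trials = 53;
% Error rates quoted in the question (rounded to four decimals there).
reported_error = [0.4151 0.3208 0.5849 0.4528];
% error_count = class_error * n_trials
error_count = round(reported_error * n_trials);
% accuracy = 1 - class_error
accuracy = 1 - error_count / n_trials;
disp([error_count; n_trials - error_count]);  % rows: misclassified, correct
disp(accuracy);
```

The four rates correspond to 22, 17, 31, and 24 misclassified trials respectively, so each one is a plain misclassification fraction, that is, 1 - accuracy.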
