Why is the reported accuracy of the Classification learner app very low (51%), while in the scatter plots, no incorrect model predictions are reported?

Question

Martin Berndsen on 8 Feb 2019

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/443914-why-is-the-reported-accuracy-of-the-classification-learner-app-very-low-51-while-in-the-scatter

Commented: Martin Berndsen on 9 Feb 2019

I have loaded a training dataset into the Classification Learner app. The dataset contains 22 predictors, and I used the default 5-fold cross-validation (other fold counts make no difference). All classifiers score very low accuracies (51% or lower), apart from the logistic regression model, which scores 100%. Changing the selected features makes no real difference; the accuracy only gets lower.

When I check these results in the scatter plots, no incorrect model predictions are reported, as shown in the image below. You can see that the dataset contains 4433 NaNs, which are hidden. The confusion matrix shows the 4433 hidden observations as false positives, which is why the accuracy is so low. Apparently the training dataset contains 4433 observations that consist only of NaN values.

Since these observations were supposed to be hidden, I assumed they would not be used in the calculation of the accuracy, but apparently they are. Is that correct?

Answer 1

Stephan on 9 Feb 2019

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/443914-why-is-the-reported-accuracy-of-the-classification-learner-app-very-low-51-while-in-the-scatter#answer_360256

Edited: Stephan on 9 Feb 2019

Hi,

the red area in the confusion matrix tells us that there were 4433 cases in which the predicted class was 1 and the true class was 0. The green areas show correct predictions. Does that make sense?

Best regards

Stephan
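To make the link between the confusion matrix and the reported accuracy concrete, here is a minimal sketch of the arithmetic. It is written in Python purely as an illustration, it is not what the Classification Learner app runs internally, and the counts are hypothetical toy numbers rather than the values from this dataset. The function name `accuracy_from_confusion` is also made up for this example.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy: correct predictions (the diagonal) divided by all observations.

    `matrix` is a square list of lists with rows = true class, columns = predicted class.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no observations")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical counts (NOT taken from the question).
# 40 observations of true class 0 are predicted as class 1.
with_extra_rows = [[50, 40],
                   [0, 10]]
# The same model with those 40 observations excluded.
without_extra_rows = [[50, 0],
                      [0, 10]]

print(accuracy_from_confusion(with_extra_rows))     # 0.6
print(accuracy_from_confusion(without_extra_rows))  # 1.0
```

Every observation that lands in an off-diagonal cell enlarges the denominator without adding to the numerator, so a large block of misclassified rows can dominate the accuracy even when all other predictions are correct.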

Martin Berndsen on 9 Feb 2019

Stephan,

Thanks for your reply. I understand the principle of the confusion matrix, but I was not expecting the hidden entries from the training dataset to be included in the calculation of the accuracy. What is the point of hiding those entries then?

Anyway, I resolved this issue by first removing the entries that contained only NaN values from the training dataset. After that, I loaded the cleaned training dataset into the Classification Learner app.

Best regards,

Martin
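The cleanup step described in this comment can be sketched as follows. This is an illustrative Python version under stated assumptions, not the code used in the thread: the predictors are assumed to be stored as a list of rows of floats, and `is_missing` and `drop_all_nan_rows` are hypothetical helper names.

```python
import math


def is_missing(value):
    """Treat None and floating-point NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_all_nan_rows(rows):
    """Keep only rows that have at least one non-missing predictor value.

    Rows with some (but not all) values missing are kept unchanged.
    """
    return [row for row in rows if not all(is_missing(v) for v in row)]


# Hypothetical toy data: three observations with two predictors each.
nan = float("nan")
data = [
    [1.0, 2.0],   # complete row: kept
    [nan, nan],   # all predictors missing: dropped
    [nan, 3.0],   # partially missing: kept
]

cleaned = drop_all_nan_rows(data)
print(len(cleaned))  # 2
```

For a numeric predictor matrix `X` in MATLAB, the equivalent row mask would be `all(isnan(X), 2)`. Filtering before loading the data means the app never sees the all-NaN observations, so they cannot enter the accuracy calculation.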
