ROC Curve

What Is a ROC Curve?

ROC curves (receiver operating characteristic curves) are an important tool for evaluating the performance of a machine learning model. They are most commonly used for binary classification problems – those that have two distinct output classes. The ROC curve shows the relationship between the true positive rate (TPR) for the model and the false positive rate (FPR). The TPR is the rate at which the classifier predicts “positive” for observations that are “positive.” The FPR is the rate at which the classifier predicts “positive” for observations that are actually “negative.” A perfect classifier will have a TPR of 1 and an FPR of 0.

You can calculate ROC curves in MATLAB^® using the perfcurve function from Statistics and Machine Learning Toolbox™. Additionally, the Classification Learner app generates ROC curves to help you assess model performance. The app lets you specify different classes to plot, so you can view ROC curves for multiclass classification problems that have more than two distinct output classes.

How ROC Curves Work

Most machine learning models for binary classification do not output just 1 or 0 when they make a prediction. Instead, they output a continuous value somewhere in the range [0,1]. Values at or above a certain threshold (for example 0.5) are then classified as 1 and values below that threshold are classified as 0. The points on the ROC curve represent the FPR and TPR for different threshold values.

The selected threshold can be anywhere on the range [0,1], and the resulting classifications will change based on the value of this threshold. For example, if the threshold is set all the way to 0, the model will always predict 1 (anything at or above 0 is classified as 1) resulting in a TPR of 1 and an FPR of 1. At the other end of the ROC curve, if the threshold is set to 1, the model will always predict 0 (anything below 1 is classified as 0) resulting in a TPR of 0 and an FPR of 0.

When evaluating the performance of a classification model, you are most interested in what happens in between these extreme cases. In general, the more “up and to the left” the ROC curve is, the better the classifier.

ROC curves are typically used with cross-validation to assess the performance of the model on validation or test data .

ROC curves calculated with the perfcurve function for (from left to right) a perfect classifier, a typical classifier, and a classifier that does no better than a random guess. — ROC curves calculated with the `perfcurve` function for (from left to right) a perfect classifier, a typical classifier, and a classifier that does no better than a random guess.

Examples and How To

Plot ROC Curve for Logistic Regression - Example
Detector Performance Analysis Using ROC Curves - Example

Software Reference

Assess Classifier Performance in Classification Learner - Documentation
Performance Curves - Documentation
perfcurve - Function
Model Building and Assessment - Documentation

ROC Curve FAQs

A ROC curve (receiver operating characteristic curve) is a tool for evaluating the performance of a machine learning model by showing the relationship between the true positive rate (TPR) and false positive rate (FPR) across different classification thresholds.

The TPR is the rate at which the classifier correctly predicts “positive” for observations that are actually “positive.”

The FPR is the rate at which the classifier incorrectly predicts “positive” for observations that are actually “negative.”

The more “up and to the left” the ROC curve is, the better the classifier performs. A perfect classifier will have a TPR of 1 and an FPR of 0, appearing as a curve through the upper left corner.

Each point on the ROC curve represents the FPR and TPR for different threshold values in the range [0,1].

Yes, the Classification Learner app in MATLAB lets you specify different classes to plot ROC curves for multiclass classification problems that have more than two distinct output classes.

You can calculate ROC curves in MATLAB using the perfcurve function from Statistics and Machine Learning Toolbox or generate them automatically using the Classification Learner app.

ROC curves are most commonly used for binary classification problems and are typically used with cross-validation to assess model performance on validation or test data.

See also: cross-validation

Machine Learning Q&A: All About Model Validation

Online Course

Machine Learning Onramp

Get started