Supervised Learning

What Is Supervised Learning?

Supervised learning is the most common type of machine learning. It trains an algorithm on a known dataset (called the training dataset), in which input data (called features) are paired with known responses, so that the algorithm can make predictions. From this labeled data, the supervised learning algorithm builds a model by discovering relationships between the features and the responses, and then uses that model to predict response values for new data.
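
As a minimal sketch of the train-then-predict flow described above, the following Python example uses a 1-nearest-neighbor rule, chosen here only because it is short. The function names and the tiny dataset are hypothetical and are not taken from any toolbox.

```python
def train_nearest_neighbor(features, responses):
    """'Training' for 1-nearest-neighbor just stores the labeled examples."""
    if len(features) != len(responses):
        raise ValueError("each feature vector needs exactly one response")
    return list(zip(features, responses))


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(model, new_features):
    """Return the response of the stored example closest to new_features."""
    closest = min(model, key=lambda pair: squared_distance(pair[0], new_features))
    return closest[1]


# Hypothetical training dataset: [height_cm, weight_kg] -> size label
training_features = [[150.0, 50.0], [160.0, 55.0], [180.0, 85.0], [190.0, 95.0]]
training_responses = ["small", "small", "large", "large"]

model = train_nearest_neighbor(training_features, training_responses)
prediction = predict(model, [182.0, 88.0])  # new, unlabeled observation
print(prediction)
```

Real models usually learn a compact set of parameters instead of storing every example, but the interface is the same: labeled features and responses go in, and predictions for new data come out.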

Prior to applying supervised learning, unsupervised learning is frequently used to discover patterns in the input data that suggest candidate features, and feature engineering transforms those features to make them more suitable for supervised learning. In addition to identifying features, the correct category or response must be identified for every observation in the training set, which is a very labor-intensive step. Semi-supervised learning lets you train models with very limited labeled data and thus reduces the labeling effort.
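
One common feature engineering transformation is rescaling each feature to zero mean and unit variance, so that features measured in very different units contribute comparably. The sketch below assumes that this particular transformation is wanted. The function name and the sample rows are hypothetical.

```python
import statistics


def standardize_columns(rows):
    """Rescale every column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        # A constant column has zero spread, so map it to 0.0 instead of dividing by zero.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


# Hypothetical raw features: [annual_income, debt_ratio]
raw_rows = [[1000.0, 0.1], [2000.0, 0.2], [3000.0, 0.3]]
scaled_rows = standardize_columns(raw_rows)
print(scaled_rows)
```

In practice the means and standard deviations would be computed on the training set only and then reused for any new data.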

Once the algorithm is trained, a test dataset that has not been used for training is typically used to estimate the performance of the algorithm and validate it. To obtain accurate performance results, it is critical that both the training and test sets are a good representation of “reality” (i.e., data from the production environment) and that the model is validated correctly.
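
The holdout idea above can be sketched as follows: shuffle the labeled data, set a fraction aside as the test set, fit on the remainder, and measure accuracy only on the held-out rows. The threshold model, the function names, and the synthetic data are assumptions made for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train_rows, test_rows)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train_rows):
    """Toy model: the midpoint between the mean feature value of each class."""
    class0 = [x for x, label in train_rows if label == 0]
    class1 = [x for x, label in train_rows if label == 1]
    if not class0 or not class1:
        raise ValueError("training set must contain both classes")
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2


def accuracy(predict_fn, test_rows):
    """Fraction of held-out rows whose predicted label matches the true label."""
    correct = sum(1 for x, label in test_rows if predict_fn(x) == label)
    return correct / len(test_rows)


# Synthetic labeled data: (feature, label), with label 1 when the feature is 10 or more
rows = [(x, 0 if x < 10 else 1) for x in range(20)]

train_rows, test_rows = train_test_split(rows)
threshold = fit_threshold(train_rows)  # uses the training rows only
test_accuracy = accuracy(lambda x: 1 if x > threshold else 0, test_rows)
print(f"threshold={threshold:.2f}, test accuracy={test_accuracy:.2f}")
```

Because the test rows never influence `threshold`, the reported accuracy is an estimate of how the model would behave on unseen data drawn from the same distribution.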

You can train, validate, and tune predictive supervised learning models in MATLAB® with Deep Learning Toolbox™ and Statistics and Machine Learning Toolbox™.

Supervised Learning Algorithm Categories

Classification: Used for categorical response values, where the data can be separated into specific classes. A binary classification model has two classes, and a multiclass classification model has more than two. You can train classification models with the Classification Learner app in MATLAB.
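
To make the idea of categorical responses concrete, here is a small multiclass sketch using a nearest-centroid rule: each class is summarized by the mean of its training features, and a new observation is assigned to the class with the closest mean. This is an illustrative stand-in written in Python and is not how the Classification Learner app works. The measurements and function names are hypothetical.

```python
def fit_centroids(features, labels):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for x, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(centroids, x):
    """Assign x to the class whose centroid is nearest in squared Euclidean distance."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(center, x))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical flower measurements: [petal_length_cm, petal_width_cm] -> species
features = [[1.4, 0.2], [1.5, 0.3], [4.5, 1.5], [4.1, 1.3], [6.0, 2.5], [5.8, 2.2]]
labels = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

centroids = fit_centroids(features, labels)
predicted_class = classify(centroids, [4.4, 1.4])
print(predicted_class)
```

With only two distinct labels in `labels`, the same code acts as a binary classifier. The three labels here make it a multiclass one.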

Common classification algorithms for this category include:

Regression: Used for continuous numerical response values. You can train regression models with the Regression Learner app.
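
For a continuous response, the simplest case is a straight-line fit by ordinary least squares. The following Python sketch uses the closed-form solution for one feature. The data values and function name are made up for illustration, and the example is not tied to the Regression Learner app.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: feature value -> continuous response
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)
forecast = slope * 6.0 + intercept  # predicted response for a new feature value
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

Unlike a classifier, the prediction here is a number on a continuous scale, so performance is typically judged with an error measure such as mean squared error instead of accuracy.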

Common regression algorithms include:

Supervised Learning Applications

Supervised learning is used in financial applications for credit scoring, algorithmic trading, and bond classification; in biological applications for tumor detection and drug discovery; in energy applications for price and load forecasting; in pattern recognition applications for speech and images; and in predictive maintenance for estimating equipment lifetime.

See also: Statistics and Machine Learning Toolbox, Deep Learning Toolbox, machine learning, unsupervised learning, AdaBoost, linear regression, nonlinear regression, data fitting, data analysis, mathematical modeling, predictive modeling, artificial intelligence, AutoML, regularization, biomedical signal processing