MATLAB Examples

This example uses a database of 1985 car imports with 205 observations, 25 predictors, and 1 response: the insurance risk rating, or "symboling." The first 15 variables are numeric.

Use Cook's Distance to determine the outliers in the data.
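One way to flag outliers with Cook's distance is sketched below, using the built-in carsmall data rather than the imports data; the three-times-the-mean cutoff is a common rule of thumb, not a fixed part of the example:

```matlab
% Fit a linear model of MPG on weight using the built-in carsmall data.
load carsmall
mdl = fitlm(Weight, MPG);

% Cook's distance for each observation is stored in the Diagnostics table.
cooksD = mdl.Diagnostics.CooksDistance;

% Rule of thumb: flag points exceeding three times the mean distance.
outliers = find(cooksD > 3*mean(cooksD, 'omitnan'));

plotDiagnostics(mdl, 'cookd')   % visualize Cook's distance per observation
```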

Test for the significance of the regression coefficients using the t-statistic.

Load the sample data.

Display R-squared (coefficient of determination) and adjusted R-squared. Load the sample data and define the response and independent variables.

Fit a linear regression model. A typical workflow involves the following: import data, fit a regression, test its quality, modify it to improve the quality, and share it.

Compute the covariance matrix and standard errors of the coefficients.

Use the CovRatio statistics to determine the influential points in data. Load the sample data and define the response and predictor variables.

This example uses a bagged ensemble so that it can use all three methods of evaluating ensemble quality.

Identify and remove redundant predictors from a generalized linear model.

View a classification or regression tree. There are two ways to view a tree: view(tree) returns a text description and view(tree,'mode','graph') returns a graphic description of the tree.
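A minimal sketch of both viewing modes, using a small tree grown on Fisher's iris data (the MaxNumSplits limit is only to keep the tree readable):

```matlab
% Grow a small classification tree on Fisher's iris data.
load fisheriris
tree = fitctree(meas, species, 'MaxNumSplits', 4);

view(tree)                  % text description printed to the Command Window
view(tree, 'mode', 'graph') % opens a figure window with the tree diagram
```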

This example uses data for predicting the insurance risk of a car based on its many attributes.

Determine the observations that are influential on the fitted response values using Dffits values. Load the sample data and define the response and independent variables.

Test for autocorrelation among the residuals of a linear regression model.

Fit a generalized linear model and analyze the results. A typical workflow involves the following: import data, fit a generalized linear model, test its quality, modify it to improve the quality, and share it.

Determine the observations that have large influence on coefficients using Dfbetas. Load the sample data and define the response and independent variables.

Compute coefficient confidence intervals.

Regularize binomial regression. The default (canonical) link function for binomial regression is the logistic function.

Compute Leverage values and assess high leverage observations. Load the sample data and define the response and independent variables.

Assess the model assumptions by examining the residuals of a fitted linear regression model.

Assess the fit of the model and the significance of the regression coefficients using the F-statistic.

There are diagnostic plots to help you examine the quality of a model. plotDiagnostics(mdl) gives a variety of plots, including leverage and Cook's distance plots. plotResiduals(mdl) plots the residuals, by default as a histogram.
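The diagnostic and residual plots can be sketched as follows, again on the built-in carsmall data:

```matlab
% Fit a model, then examine its diagnostic and residual plots.
load carsmall
mdl = fitlm([Weight, Horsepower], MPG);

plotDiagnostics(mdl)              % leverage plot (default)
plotDiagnostics(mdl, 'cookd')     % Cook's distance plot
plotResiduals(mdl)                % histogram of residuals (default)
plotResiduals(mdl, 'probability') % normal probability plot of residuals
```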

Use the methods predict, feval, and random to predict and simulate responses to new data.

Compute and plot S2_i values to examine the change in the mean squared error when an observation is removed from the data. Load the sample data and define the response and independent variables.

Predict out-of-sample responses of regression trees, and then plot the results.

Create a regression ensemble to predict mileage of cars based on their horsepower and weight, trained on the carsmall data.

Two ways of fitting a nonlinear logistic regression model. The first method uses maximum likelihood (ML) and the second method uses generalized least squares (GLS) via the function fitnlm.

Make Bayesian inferences for a logistic regression model using slicesample.

Predict class labels or responses using trained classification and regression trees.

Apply Partial Least Squares Regression (PLSR) and Principal Components Regression (PCR), and discuss the effectiveness of the two methods. PLSR and PCR are both methods to model a response variable when there are many, highly correlated predictor variables.

Train a regression tree.

Train a classification tree.

Analyze time series data using Statistics and Machine Learning Toolbox™ features.

Fit a nonlinear regression model for data with nonconstant error variance.

Pitfalls that can occur when fitting a nonlinear model by transforming to linearity. Imagine that we have collected measurements on two variables, x and y, and we want to model y as a function of x.

Fitting a Gaussian Process Regression (GPR) model to data with a large number of observations, using the Block Coordinate Descent (BCD) Approximation.

Fit and evaluate generalized linear models using glmfit and glmval. Ordinary linear regression can be used to fit a straight line, or any function that is linear in its parameters, to data
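A minimal glmfit/glmval sketch; the Poisson model and the simulated data below are illustrative assumptions, not taken from the original example:

```matlab
% Fit a Poisson regression with glmfit, then evaluate it with glmval.
x = (0:0.5:10)';
y = poissrnd(exp(1 + 0.2*x));    % simulated counts from a hypothetical model

b = glmfit(x, y, 'poisson');     % coefficients on the log-link scale
yhat = glmval(b, x, 'log');      % fitted means on the response scale
plot(x, y, 'o', x, yhat, '-')
```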

Choose the appropriate split predictor selection technique for your data set when growing a random forest of regression trees. This example also shows how to decide which predictors are most important.

Create a custom plot function for bayesopt. It further shows how to use information in the UserData property of a BayesianOptimization object.

Optimize hyperparameters of a boosted regression ensemble. The optimization minimizes the cross-validation loss of the model.

Use a custom output function with Bayesian optimization. The output function halts the optimization when the objective function, the cross-validation error rate, drops below a specified threshold.

Detect outliers using quantile random forest. Quantile random forest can detect outliers with respect to the conditional distribution of Y given X. However, this method cannot detect outliers in the predictor data.

Estimate conditional quantiles of a response given predictor data using quantile random forest, and by estimating the conditional distribution function of the response using kernel smoothing.

Implement Bayesian optimization to tune the hyperparameters of a random forest of regression trees using quantile error. Tuning a model using quantile error, rather than mean squared error, is appropriate when you plan to predict conditional quantiles rather than conditional means.

Examine the resubstitution and cross-validation accuracy of a regression tree for predicting mileage based on the carsmall data.

Compare models that stepwiselm returns starting from a constant model and starting from a full interaction model.

Use lasso along with cross-validation to identify important predictors.
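A sketch of cross-validated lasso on simulated data; the generated data set and the two "true" predictors are assumptions made for illustration:

```matlab
% Cross-validated lasso: important predictors keep nonzero coefficients.
rng default                       % for reproducibility
X = randn(100, 10);
y = X(:,2) + 2*X(:,5) + 0.1*randn(100,1);   % only predictors 2 and 5 matter

[B, FitInfo] = lasso(X, y, 'CV', 10);
idx = FitInfo.Index1SE;           % sparsest model within one SE of the minimum
important = find(B(:, idx) ~= 0)  % nonzero coefficients flag the predictors

lassoPlot(B, FitInfo, 'PlotType', 'CV')     % cross-validation error curve
```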

Display and interpret linear regression output statistics.

Construct and analyze a linear regression model with interaction effects and interpret the results.

Fit a mixed-effects linear spline model.

Create a classification tree for the ionosphere data, and prune it to a good level.

Use robust regression. It compares the results of a robust fit to a standard least-squares fit.

Examine the resubstitution error of a classification tree.

Do a typical nonlinear regression workflow: import data, fit a nonlinear regression, test its quality, modify it to improve the quality, and make predictions based on the model.

Perform ridge regression.
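A minimal ridge-regression sketch using the built-in hald data; the range of ridge parameters below is an arbitrary illustrative choice:

```matlab
% Ridge regression on the built-in hald data over a range of ridge parameters.
load hald                          % ingredients (predictors), heat (response)
k = 0:0.01:1;                      % ridge parameters to try
B = ridge(heat, ingredients, k);   % one column of coefficients per k

plot(k, B')                        % trace of coefficients as k grows
xlabel('Ridge parameter k'), ylabel('Standardized coefficients')
```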

Examine how lasso identifies and discards unnecessary predictors.

Consider a data set with 100 observations of 10 predictors. Generate the random data from a logistic model, with a binomial distribution of responses at each set of values for the predictors.

Regularize a model with many more predictors than observations. Wide data is data with more predictors than observations. Typically, with wide data you want to identify important predictors.

Fit and analyze a linear mixed-effects model (LME).

Control the depth of a decision tree, and choose an appropriate depth.

Perform linear and stepwise regression analyses using tables.

Predict the mileage (MPG) of a car based on its weight, displacement, horsepower, and acceleration, using the lasso and elastic net methods.

Set up a multivariate general linear model for estimation using mvregress .
