# NonLinearModel class

Nonlinear regression model class

## Description

An object comprising training data, model description, diagnostic information, and fitted coefficients for a nonlinear regression. Predict model responses with the predict or feval methods.

## Construction

nlm = fitnlm(tbl,modelfun,beta0) or nlm = fitnlm(X,y,modelfun,beta0) create a nonlinear model of a table or dataset array tbl, or of the responses y to a data matrix X. For details, see fitnlm.


### tbl — Input data

table | dataset array

Input data, specified as a table or dataset array. When you specify modelfun using a formula, the formula determines the variables to be used as the predictors and response. Otherwise, if you do not specify the predictor and response variables, the last variable is the response variable and the others are the predictor variables by default.

Predictor variables can be numeric, or any grouping variable type, such as logical or categorical (see Grouping Variables). The response must be numeric or logical.

To set a different column as the response variable, use the ResponseVar name-value pair argument. To use a subset of the columns as predictors, use the PredictorVars name-value pair argument.

Data Types: single | double | logical

### X — Predictor variables

matrix

Predictor variables, specified as an n-by-p matrix, where n is the number of observations and p is the number of predictor variables. Each column of X represents one variable, and each row represents one observation.

A nonlinear model contains a constant (intercept) term only if modelfun includes one, so do not include a column of 1s in X.

Data Types: single | double | logical

### y — Response variable

vector

Response variable, specified as an n-by-1 vector, where n is the number of observations. Each entry in y is the response for the corresponding row of X.

Data Types: single | double

### modelfun — Functional form of the model

function handle | string of the form 'y ~ f(b1,b2,...,bj,x1,x2,...,xk)'

Functional form of the model, specified as either of the following.

• Function handle @modelfun or @(b,x)modelfun, where

• b is a coefficient vector with the same number of elements as beta0.

• x is a matrix with the same number of columns as X or the number of predictor variable columns of tbl.

modelfun(b,x) returns a column vector that contains the same number of rows as x. Each row of the vector is the result of evaluating modelfun on the corresponding row of x. In other words, modelfun is a vectorized function, one that operates on all data rows and returns all evaluations in one function call. modelfun should return real numbers to obtain meaningful coefficients.

• String of the form 'y ~ f(b1,b2,...,bj,x1,x2,...,xk)', where f represents a scalar function of the scalar coefficient variables b1,...,bj and the scalar data variables x1,...,xk.
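As an illustration (not from the original page), the two specification styles above can describe the same model; the exponential form here is a hypothetical example:

```matlab
% Hypothetical two-coefficient exponential decay model, written both ways.

% Function-handle form: b is the coefficient vector, x the predictor matrix.
% Note the elementwise operators, so modelfun is vectorized over rows of x.
modelfun = @(b,x) b(1)*exp(-b(2)*x(:,1));

% Equivalent string form: scalar coefficients b1,b2 and data variable x1.
modelstr = 'y ~ b1*exp(-b2*x1)';

% Either form is passed to fitnlm along with starting values beta0:
%   mdl = fitnlm(X,y,modelfun,[1 0.5]);
%   mdl = fitnlm(tbl,modelstr,[1 0.5]);
```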

### beta0 — Coefficients

numeric vector

Coefficients for the nonlinear model, specified as a numeric vector. NonLinearModel starts its search for optimal coefficients from beta0.

Data Types: single | double

## Properties


### CoefficientCovariance — Covariance matrix of coefficient estimates

numeric matrix

Covariance matrix of coefficient estimates, stored as a p-by-p matrix of numeric values. p is the number of coefficients in the fitted model.

### CoefficientNames — Coefficient names

cell array of strings

Coefficient names, stored as a cell array of strings containing a label for each coefficient.

### Coefficients — Coefficient values

table

Coefficient values, stored as a table. Coefficients has one row for each coefficient and the following columns:

• Estimate — Estimated coefficient value

• SE — Standard error of the estimate

• tStat — t statistic for a test that the coefficient is zero

• pValue — p-value for the t statistic

To obtain any of these columns as a vector, index into the property using dot notation. For example, in mdl the estimated coefficient vector is

beta = mdl.Coefficients.Estimate

Use coefTest to perform other tests on the coefficients.

### Diagnostics — Diagnostic information

table

Diagnostic information for the model, stored as a table. Diagnostics can help identify outliers and influential observations. Diagnostics contains the following fields.

| Field | Meaning | Utility |
| --- | --- | --- |
| Leverage | Diagonal elements of HatMatrix | Leverage indicates to what extent the predicted value for an observation is determined by the observed value for that observation. A value close to 1 indicates that the prediction is largely determined by that observation, with little contribution from the other observations. A value close to 0 indicates the fit is largely determined by the other observations. For a model with P coefficients and N observations, the average value of Leverage is P/N. An observation with Leverage larger than 2*P/N can be regarded as having high leverage. |
| CooksDistance | Cook's measure of scaled change in fitted values | An observation with CooksDistance larger than three times the mean Cook's distance can be an outlier. |
| HatMatrix | Projection matrix to compute fitted from observed responses | HatMatrix is an N-by-N matrix such that Fitted = HatMatrix*Y, where Y is the response vector and Fitted is the vector of fitted response values. |
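For illustration, the Leverage rule of thumb above can be applied to a fitted model mdl like this (a sketch; mdl is assumed to come from an earlier call to fitnlm):

```matlab
% Flag observations whose leverage exceeds twice the average value P/N.
lev = mdl.Diagnostics.Leverage;     % diagonal elements of the hat matrix
P   = mdl.NumCoefficients;          % number of coefficients
N   = mdl.NumObservations;          % number of observations used in the fit
highLeverage = find(lev > 2*P/N);   % indices of high-leverage observations
```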

### DFE — Degrees of freedom for error

positive integer value

Degrees of freedom for error (residuals), equal to the number of observations minus the number of estimated coefficients, stored as a positive integer value.

### Fitted — Fitted response values based on input data

numeric vector

Fitted (predicted) values based on the input data, stored as a numeric vector. fitnlm attempts to make Fitted as close as possible to the response data.

### Formula — Model information

LinearFormula object | NonLinearFormula object

Model information, stored as a LinearFormula object or NonLinearFormula object. If you fit a linear or generalized linear regression model, then Formula is a LinearFormula object. If you fit a nonlinear regression model, then Formula is a NonLinearFormula object.

### Iterative — Information about fitting process

structure

Information about the fitting process, stored as a structure with the following fields:

• InitialCoefs — Initial coefficient values (the beta0 vector)

• IterOpts — Options included in the Options name-value pair argument for fitnlm.

### LogLikelihood — Log likelihood

numeric value

Log likelihood of the model distribution at the response values, stored as a numeric value. The mean is fitted from the model, and other parameters are estimated as part of the model fit.

### ModelCriterion — Criterion for model comparison

structure

Criterion for model comparison, stored as a structure with the following fields:

• AIC — Akaike information criterion

• AICc — Akaike information criterion corrected for sample size

• BIC — Bayesian information criterion

• CAIC — Consistent Akaike information criterion

To obtain any of these values as a scalar, index into the property using dot notation. For example, in a model mdl, the AIC value aic is:

aic = mdl.ModelCriterion.AIC

### MSE — Mean squared error

numeric value

Mean squared error, stored as a numeric value. The mean squared error is an estimate of the variance of the error term in the model.

### NumCoefficients — Number of model coefficients

positive integer

Number of coefficients in the fitted model, stored as a positive integer. NumCoefficients is the same as NumEstimatedCoefficients for NonLinearModel objects. NumEstimatedCoefficients is equal to the degrees of freedom for regression.

### NumEstimatedCoefficients — Number of estimated coefficients

positive integer

Number of estimated coefficients in the fitted model, stored as a positive integer. NumEstimatedCoefficients is the same as NumCoefficients for NonLinearModel objects. NumEstimatedCoefficients is equal to the degrees of freedom for regression.

### NumPredictors — Number of predictor variables

positive integer

Number of predictor variables used to fit the model, stored as a positive integer.

### NumVariables — Number of variables

positive integer

Number of variables in the input data, stored as a positive integer. NumVariables is the number of variables in the original table or dataset, or the total number of columns in the predictor matrix and response vector when the fit is based on those arrays. It includes variables, if any, that are not used as predictors or as the response.

### ObservationInfo — Observation information

table

Observation information, stored as an n-by-4 table, where n is the number of rows of input data. The four columns of ObservationInfo contain the following:

| Field | Description |
| --- | --- |
| Weights | Observation weights. The default is a vector of 1s. |
| Excluded | Logical value; 1 indicates an observation that you excluded from the fit with the Exclude name-value pair argument. |
| Missing | Logical value; 1 indicates a missing value in the input data. Missing values are not used in the fit. |
| Subset | Logical value; 1 indicates the observation is neither excluded nor missing, and so is used in the fit. |
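As a sketch, the logical columns of ObservationInfo can be used to see which rows actually entered the fit (mdl is assumed to be a fitted model):

```matlab
% Rows used in the fit: neither excluded nor missing.
used  = mdl.ObservationInfo.Subset;    % logical vector
nUsed = sum(used);                     % number of observations in the fit
% Rows skipped because of missing values:
skipped = find(mdl.ObservationInfo.Missing);
```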

### ObservationNames — Observation names

cell array

Observation names, stored as a cell array of strings containing the names of the observations used in the fit.

• If the fit is based on a table or dataset containing observation names, ObservationNames uses those names.

• Otherwise, ObservationNames is an empty cell array.

### PredictorNames — Names of predictors used to fit the model

cell array

Names of predictors used to fit the model, stored as a cell array of strings.

### Residuals — Residuals for fitted model

table

Residuals for fitted model, stored as a table that contains one row for each observation and the following columns.

| Field | Description |
| --- | --- |
| Raw | Observed minus fitted values. |
| Pearson | Raw residuals divided by RMSE. |
| Standardized | Raw residuals divided by their estimated standard deviation. |
| Studentized | Raw residual divided by an independent estimate of the residual standard deviation. The residual for observation i is divided by an estimate of the error standard deviation based on all observations except observation i. |

To obtain any of these columns as a vector, index into the property using dot notation. For example, in a model mdl, the ordinary raw residual vector r is:

r = mdl.Residuals.Raw

Rows not used in the fit because of missing values (in ObservationInfo.Missing) contain NaN values.

Rows not used in the fit because of excluded values (in ObservationInfo.Excluded) contain NaN values, with the following exceptions:

• Raw contains the difference between the observed and predicted values.

• Standardized is the residual, standardized in the usual way.

• Studentized matches the Standardized values because this residual is not used in the estimate of the residual standard deviation.
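As an informal sketch (the cutoff of 2 is a common convention, not part of the class definition), studentized residuals can be used to flag possible outliers in a fitted model mdl:

```matlab
% Observations whose studentized residual is large in magnitude.
rstud = mdl.Residuals.Studentized;
possibleOutliers = find(abs(rstud) > 2);   % informal cutoff
```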

### ResponseName — Response variable name

string

Response variable name, stored as a string.

### RMSE — Root mean squared error

numeric value

Root mean squared error, stored as a numeric value. The root mean squared error is an estimate of the standard deviation of the error term in the model.

### Robust — Robust fit information

structure

Robust fit information, stored as a structure with the following fields:

| Field | Description |
| --- | --- |
| WgtFun | Robust weighting function, such as 'bisquare' (see robustfit) |
| Tune | Value specified for the tuning parameter (can be []) |
| Weights | Vector of weights used in the final iteration of the robust fit |

This structure is empty unless fitnlm constructed the model using robust regression.

### Rsquared — R-squared value for the model

structure

R-squared value for the model, stored as a structure.

For a linear or nonlinear model, Rsquared is a structure with two fields:

• Ordinary — Ordinary (unadjusted) R-squared

• Adjusted — R-squared adjusted for the number of coefficients

For a generalized linear model, Rsquared is a structure with five fields:

• Ordinary — Ordinary (unadjusted) R-squared

• Adjusted — R-squared adjusted for the number of coefficients

• LLR — Log-likelihood ratio

• Deviance — Deviance

• AdjGeneralized — Adjusted generalized R-squared

The R-squared value is the proportion of total sum of squares explained by the model. The ordinary R-squared value relates to the SSR and SST properties:

Rsquared = SSR/SST = 1 - SSE/SST.

To obtain any of these values as a scalar, index into the property using dot notation. For example, the adjusted R-squared value in mdl is

r2 = mdl.Rsquared.Adjusted

### SSE — Sum of squared errors

numeric value

Sum of squared errors (residuals), stored as a numeric value.

The Pythagorean theorem implies

SST = SSE + SSR.

### SSR — Regression sum of squares

numeric value

Regression sum of squares, stored as a numeric value. The regression sum of squares is equal to the sum of squared deviations of the fitted values from their mean.

The Pythagorean theorem implies

SST = SSE + SSR.

### SST — Total sum of squares

numeric value

Total sum of squares, stored as a numeric value. The total sum of squares is equal to the sum of squared deviations of y from mean(y).

The Pythagorean theorem implies

SST = SSE + SSR.
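These identities can be checked numerically on a fitted model mdl (a sketch; for a nonlinear fit the decomposition holds only approximately):

```matlab
% Check the sum-of-squares decomposition and the R-squared relation.
gap = mdl.SST - (mdl.SSE + mdl.SSR);   % near zero when the decomposition holds
r2  = 1 - mdl.SSE/mdl.SST;             % compare with mdl.Rsquared.Ordinary
```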

### VariableInfo — Information about input variables

table

Information about the input variables contained in Variables, stored as a table with one row for each variable and the following columns.

| Field | Description |
| --- | --- |
| Class | String giving the variable class, such as 'double' |
| Range | Cell array giving the variable range. For a continuous variable, a two-element vector [min,max] containing the minimum and maximum values. For a categorical variable, a cell array of the distinct variable values. |
| InModel | Logical vector, where true indicates the variable is in the model |
| IsCategorical | Logical vector, where true indicates a categorical variable |

### VariableNames — Names of variables used in fit

cell array

Names of variables used in fit, stored as a cell array of strings.

• If the fit is based on a table or dataset, this property provides the names of the variables in that table or dataset.

• If the fit is based on a predictor matrix and response vector, VariableNames is the values in the VarNames name-value pair of the fitting method.

• Otherwise the variables have the default fitting names.

### Variables — Data used to fit the model

table

Data used to fit the model, stored as a table. Variables contains both the predictor and response values. If the fit is based on a table or dataset array, Variables contains all of the data from that table or dataset array. Otherwise, Variables is a table created from the input data matrix X and the response vector y.

## Methods

• coefCI — Confidence intervals of coefficient estimates of nonlinear regression model

• coefTest — Linear hypothesis test on nonlinear regression model coefficients

• disp — Display nonlinear regression model

• feval — Evaluate nonlinear regression model prediction

• fit — Fit nonlinear regression model

• plotDiagnostics — Plot diagnostics of nonlinear regression model

• plotResiduals — Plot residuals of nonlinear regression model

• plotSlice — Plot of slices through fitted nonlinear regression surface

• predict — Predict response of nonlinear regression model

• random — Simulate responses for nonlinear regression model

## Definitions

### Hat Matrix

The hat matrix H is defined in terms of the data matrix X and the Jacobian matrix J:

$J_{i,j} = \left. \dfrac{\partial f}{\partial \beta_j} \right|_{x_i,\beta}$

Here f is the nonlinear model function, and β is the vector of model coefficients.

The Hat Matrix H is

$H = J(J^{\mathsf{T}} J)^{-1} J^{\mathsf{T}}.$

The diagonal elements $h_{ii}$ satisfy

$0 \le h_{ii} \le 1, \qquad \sum_{i=1}^{n} h_{ii} = p,$

where n is the number of observations (rows of X), and p is the number of coefficients in the regression model.
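As a sketch, given a Jacobian J (for example, from a separate numerical differentiation of modelfun at the fitted coefficients), the hat matrix and leverages can be formed directly:

```matlab
% Hat matrix H = J*(J'*J)^(-1)*J', computed without an explicit inverse.
H   = J*((J'*J)\J');
lev = diag(H);        % leverage values; sum(lev) equals p up to rounding
```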

### Leverage

The leverage of observation i is the value of the ith diagonal term, hii, of the hat matrix H. Because the sum of the leverage values is p (the number of coefficients in the regression model), an observation i can be considered to be an outlier if its leverage substantially exceeds p/n, where n is the number of observations.

### Cook's Distance

The Cook's distance Di of observation i is

$D_i = \dfrac{\sum_{j=1}^{n} \left(\hat{y}_j - \hat{y}_{j(i)}\right)^2}{p\,\mathit{MSE}},$

where

• $\hat{y}_j$ is the jth fitted response value.

• $\hat{y}_{j(i)}$ is the jth fitted response value, where the fit does not include observation i.

• MSE is the mean squared error.

• p is the number of coefficients in the regression model.

Cook's distance is algebraically equivalent to the following expression:

$D_i = \dfrac{r_i^2}{p\,\mathit{MSE}} \left( \dfrac{h_{ii}}{\left(1 - h_{ii}\right)^2} \right),$

where $r_i$ is the ith residual and $h_{ii}$ is the ith leverage value.
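The algebraic form above can be evaluated from quantities stored in a fitted model mdl (a sketch, for comparison with the stored diagnostic):

```matlab
% Cook's distance from raw residuals and leverages.
r   = mdl.Residuals.Raw;
lev = mdl.Diagnostics.Leverage;
p   = mdl.NumCoefficients;
D   = (r.^2./(p*mdl.MSE)).*(lev./(1-lev).^2);
% Compare with mdl.Diagnostics.CooksDistance.
```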

## Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects in the MATLAB® documentation.

## Examples


### Fit a Nonlinear Regression Model

Fit a nonlinear regression model for auto mileage based on the carbig data. Predict the mileage of an average car.

Load the sample data. Create a matrix X containing the measurements for the horsepower (Horsepower) and weight (Weight) of each car. Create a vector y containing the response values in miles per gallon (MPG).

load carbig
X = [Horsepower,Weight];
y = MPG;

Fit a nonlinear regression model.

modelfun = @(b,x)b(1) + b(2)*x(:,1).^b(3) + ...
b(4)*x(:,2).^b(5);
beta0 = [-50 500 -1 500 -1];
mdl = fitnlm(X,y,modelfun,beta0)
mdl =

Nonlinear regression model:
y ~ b1 + b2*x1^b3 + b4*x2^b5

Estimated Coefficients:
Estimate      SE        tStat       pValue
________    _______    ________    ________

b1     -49.383     119.97    -0.41164     0.68083
b2      376.43     567.05     0.66384     0.50719
b3    -0.78193    0.47168     -1.6578    0.098177
b4      422.37     776.02     0.54428     0.58656
b5    -0.24127    0.48325    -0.49926     0.61788

Number of observations: 392, Error degrees of freedom: 387
Root Mean Squared Error: 3.96
F-statistic vs. constant model: 283, p-value = 1.79e-113

Find the predicted mileage of an average auto. Since the sample data contains some missing (NaN) observations, compute the mean using nanmean.

Xnew = nanmean(X)
MPGnew = predict(mdl,Xnew)
Xnew =

1.0e+03 *

0.1051    2.9794

MPGnew =

21.8073
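predict can also return confidence intervals for the predicted response; continuing the example (output values not shown):

```matlab
% 95% confidence intervals for the predicted response at Xnew.
[MPGnew,MPGci] = predict(mdl,Xnew);
```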