## Regression Using Dataset Arrays

This example shows how to perform linear and stepwise regression analyses using dataset arrays.

### Load sample data.

```
load imports-85
```

### Store predictor and response variables in dataset array.

```
ds = dataset(X(:,7),X(:,8),X(:,9),X(:,15),'Varnames',...
    {'curb_weight','engine_size','bore','price'});
```

### Fit linear regression model.

Fit a linear regression model that explains the price of a car in terms of its curb weight, engine size, and bore.

```
fitlm(ds,'price~curb_weight+engine_size+bore')
```

```
ans = 

Linear regression model:
    price ~ 1 + curb_weight + engine_size + bore

Estimated Coefficients:
                     Estimate           SE      tStat        pValue
                   __________    _________    _______    __________

    (Intercept)        64.095        3.703     17.309    2.0481e-41
    curb_weight    -0.0086681    0.0011025    -7.8623      2.42e-13
    engine_size     -0.015806     0.013255    -1.1925       0.23452
    bore              -2.6998       1.3489    -2.0015      0.046711

Number of observations: 201, Error degrees of freedom: 197
Root Mean Squared Error: 3.95
R-squared: 0.674,  Adjusted R-Squared 0.669
F-statistic vs. constant model: 136, p-value = 1.14e-47
```

The command `fitlm(ds)` returns the same result because `fitlm`, by default, assumes that the response variable is in the last column of the dataset array `ds` and that all other columns are predictors.
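Written out, the formula `price~curb_weight+engine_size+bore` specifies a first-order model with an intercept. The symbols $P$, $C$, $E$, and $B$ for price, curb weight, engine size, and bore are shorthand introduced here for illustration:

$$
P = \beta_0 + \beta_C C + \beta_E E + \beta_B B + \epsilon
$$

Substituting the estimates from the coefficient table in the output gives the fitted equation:

$$
\hat{P} = 64.095 - 0.0086681\,C - 0.015806\,E - 2.6998\,B
$$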

### Recreate dataset array and repeat analysis.

This time, put the response variable in the first column of the dataset array.

```
ds = dataset(X(:,15),X(:,7),X(:,8),X(:,9),'Varnames',...
    {'price','curb_weight','engine_size','bore'});
```

Because the response variable is now in the first column of `ds`, you must specify its location. Otherwise `fitlm`, by default, assumes that the variable in the last column, `bore`, is the response. You can specify the response variable by name or by logical index, using either:

```
fitlm(ds,'ResponseVar','price');
```

or

```
fitlm(ds,'ResponseVar',logical([1 0 0 0]));
```
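As a quick check that the two forms are interchangeable, the sketch below fits the model both ways and compares the results. The variable names `mdlByName` and `mdlByIndex` are illustrative only. The code assumes Statistics and Machine Learning Toolbox is installed and that `imports-85` loads the matrix `X` as in the steps above.

```matlab
% Rebuild the dataset array with the response in the first column.
load imports-85
ds = dataset(X(:,15),X(:,7),X(:,8),X(:,9),'Varnames',...
    {'price','curb_weight','engine_size','bore'});

% Specify the response by variable name and by logical index.
mdlByName  = fitlm(ds,'ResponseVar','price');
mdlByIndex = fitlm(ds,'ResponseVar',logical([1 0 0 0]));

% Show which variable each fit treated as the response.
disp(mdlByName.ResponseName)
disp(mdlByIndex.ResponseName)

% Largest absolute difference between the two sets of coefficient estimates.
disp(max(abs(mdlByName.Coefficients.Estimate - mdlByIndex.Coefficients.Estimate)))
```

Storing the fitted model in a variable, rather than letting it display as `ans`, also makes properties such as `Coefficients` and `ResponseName` available for later steps.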

### Perform stepwise regression.

```
stepwiselm(ds,'quadratic','lower','price~1',...
    'ResponseVar','price')
```

```
1. Removing bore^2, FStat = 0.01282, pValue = 0.90997
2. Removing engine_size^2, FStat = 0.078043, pValue = 0.78027
3. Removing curb_weight:bore, FStat = 0.70558, pValue = 0.40195

ans = 

Linear regression model:
    price ~ 1 + curb_weight*engine_size + engine_size*bore + curb_weight^2

Estimated Coefficients:
                                  Estimate            SE      tStat        pValue
                               ___________    __________    _______    __________

    (Intercept)                     131.13        14.273     9.1873    6.2319e-17
    curb_weight                  -0.043315     0.0085114    -5.0891    8.4682e-07
    engine_size                   -0.17102       0.13844    -1.2354       0.21819
    bore                           -12.244         4.999    -2.4493      0.015202
    curb_weight:engine_size    -6.3411e-05    2.6577e-05     -2.386      0.017996
    engine_size:bore              0.092554      0.037263     2.4838      0.013847
    curb_weight^2               8.0836e-06    1.9983e-06     4.0451    7.5432e-05

Number of observations: 201, Error degrees of freedom: 194
Root Mean Squared Error: 3.59
R-squared: 0.735,  Adjusted R-Squared 0.726
F-statistic vs. constant model: 89.5, p-value = 3.58e-53
```

The initial model is a quadratic model, and the smallest model considered is the constant model. Here, `stepwiselm` uses backward elimination to determine the terms in the model, removing `bore^2`, `engine_size^2`, and `curb_weight:bore` in turn. The final model is `price ~ 1 + curb_weight*engine_size + engine_size*bore + curb_weight^2`, which corresponds to

$$
P = \beta_0 + \beta_C C + \beta_E E + \beta_B B + \beta_{CE}\,C E + \beta_{EB}\,E B + \beta_{CC}\,C^2 + \epsilon,
$$

where $P$ is price, $C$ is curb weight, $E$ is engine size, $B$ is bore, $\beta_i$ is the coefficient for the corresponding term in the model, and $\epsilon$ is the error term. The final model includes all three main effects, the interaction between curb weight and engine size, the interaction between engine size and bore, and the second-order term for curb weight.
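To work with the selected model programmatically instead of reading the printed display, one option is to capture the output of `stepwiselm` in a variable and inspect its properties. The following is a minimal sketch under the same assumptions as the example above (Statistics and Machine Learning Toolbox installed, `imports-85` providing `X`). The name `mdl` is illustrative, and `'Verbose',0` is added here only to suppress the step-by-step messages.

```matlab
% Rebuild the dataset array with the response in the first column.
load imports-85
ds = dataset(X(:,15),X(:,7),X(:,8),X(:,9),'Varnames',...
    {'price','curb_weight','engine_size','bore'});

% Start from the quadratic model, allow removal down to the constant model,
% and keep the result in a variable. 'Verbose',0 silences the step messages.
mdl = stepwiselm(ds,'quadratic','Lower','price~1',...
    'ResponseVar','price','Verbose',0);

% Inspect the model that the stepwise procedure selected.
disp(mdl.Formula)            % final model formula
disp(mdl.CoefficientNames)   % names of the terms that were kept
disp(mdl.NumCoefficients)    % number of estimated coefficients
```

With the same data, the terms listed in `CoefficientNames` should match the rows of the coefficient table in the output above.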
