Regression is the process of fitting models to data. Linear regression assumes that the relationship between the dependent variable y_{i} and the independent variable x_{i} is linear: y_{i} = a + b x_{i}. Here a is the offset and b is the slope of the linear relationship.
For linear regression of a data sample with one independent variable, MuPAD^{®} provides the stats::linReg function. This function uses the least-squares approach for computing the linear regression. stats::linReg chooses the parameters a and b by minimizing the quadratic error over the n data points:
\chi^{2} = \sum_{i=1}^{n} (y_{i} - a - b x_{i})^{2}.
The function can also perform weighted least-squares linear regression that minimizes
\chi^{2} = \sum_{i=1}^{n} w_{i} (y_{i} - a - b x_{i})^{2}
with positive weights w_{i}. By default, the weights are equal to 1.
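The documentation does not say how stats::linReg solves this minimization internally. As a sketch of the standard textbook approach, setting the partial derivatives of the weighted quadratic error with respect to a and b to zero gives a closed-form solution. The weighted means \bar{x} and \bar{y} are shorthand introduced here for illustration:

```latex
\begin{aligned}
\frac{\partial \chi^{2}}{\partial a} &= -2 \sum_{i=1}^{n} w_{i} (y_{i} - a - b x_{i}) = 0, \\
\frac{\partial \chi^{2}}{\partial b} &= -2 \sum_{i=1}^{n} w_{i} x_{i} (y_{i} - a - b x_{i}) = 0, \\
\bar{x} &= \frac{\sum_{i} w_{i} x_{i}}{\sum_{i} w_{i}}, \qquad
\bar{y} = \frac{\sum_{i} w_{i} y_{i}}{\sum_{i} w_{i}}, \\
b &= \frac{\sum_{i} w_{i} (x_{i} - \bar{x})(y_{i} - \bar{y})}{\sum_{i} w_{i} (x_{i} - \bar{x})^{2}}, \qquad
a = \bar{y} - b \bar{x}.
\end{aligned}
```

With all weights equal to 1, these expressions reduce to the ordinary least-squares slope and offset.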
Besides the offset a and the slope b of a fitted linear model, stats::linReg also returns the value of the quadratic deviation χ^{2}. For example, fit the linear model to the following data:
x := [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:
y := [11, 13, 15, 17, 19, 21, 23, 25, 27, 29]:
stats::linReg(x, y)
The linear model y_{i} = 9 + 2 x_{i} fits this data perfectly. The quadratic error for this model is zero. To visualize the data and the resulting model, plot the data by using the plot::Scatterplot function. The plot shows the regression line y_{i} = 9 + 2 x_{i} computed by stats::linReg:
plot(plot::Scatterplot(x, y))
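To cross-check this result outside MuPAD, the following minimal Python sketch evaluates the standard closed-form least-squares solution on the same data. The function name lin_reg and its return layout are assumptions made for this illustration. The sketch is not the implementation of stats::linReg.

```python
def lin_reg(x, y, w=None):
    """Weighted least-squares fit of y = a + b*x.

    Illustrative helper (hypothetical name, not a MuPAD API).
    Returns ((a, b), chi2), where a is the offset, b is the slope,
    and chi2 is the weighted quadratic error of the fitted line.
    """
    if w is None:
        w = [1.0] * len(x)  # default: all weights equal to 1
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y, w must have equal length of at least 2")

    sw = sum(w)
    # Weighted means of x and y
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted sums of squares and cross-products about the means
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = sxy / sxx          # slope
    a = ybar - b * xbar    # offset
    chi2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return (a, b), chi2


x = list(range(1, 11))
y = [11, 13, 15, 17, 19, 21, 23, 25, 27, 29]
(a, b), chi2 = lin_reg(x, y)
print(a, b, chi2)
```

For these data the sketch reproduces the offset 9, the slope 2, and a quadratic error of zero, up to floating-point rounding.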
When you work with experimental data samples, the data almost never fit a linear model exactly. The value of the quadratic error indicates how far the actual data deviate from the fitted model. For example, modify the data from the previous example by adding random floating-point values between 0 and 10 to the entries of the list y, and store the result in the list y1. Then, perform linear regression for the entries of the lists x and y1 and plot the data:
y1 := y + [10*frandom() $ i = 1..10]:
stats::linReg(x, y1);
plot(plot::Scatterplot(x, y1))
The fact that stats::linReg finds a linear model to fit your data does not guarantee that the linear model is a good fit. For example, you can find a linear model to fit the following uniformly distributed random data points:
x := [frandom() $ i = 1..100]:
y := [frandom() $ i = 1..100]:
stats::linReg(x, y);
plot(plot::Scatterplot(x, y))
The large value of the quadratic error indicates that the linear model is a poor fit for these data.
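The quadratic error depends on the scale and the size of the sample, so it helps to have a reference value when you judge whether it is large. One common choice, sketched below in Python, is to divide it by the sum of squared deviations of y from its mean. A ratio near 0 means that the line accounts for almost all of the variation in y, and a ratio near 1 means that it accounts for almost none. The helper names fit_line and relative_error are hypothetical, and stats::linReg is not stated to return this ratio.

```python
import random


def fit_line(x, y):
    """Unweighted least-squares fit of y = a + b*x; returns (a, b, chi2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx          # slope
    a = ybar - b * xbar    # offset
    chi2 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, chi2


def relative_error(x, y):
    """Quadratic error of the fitted line divided by the total
    sum of squared deviations of y from its mean."""
    _, _, chi2 = fit_line(x, y)
    ybar = sum(y) / len(y)
    total = sum((yi - ybar) ** 2 for yi in y)
    return chi2 / total


# Exactly linear data: the ratio is 0 up to rounding.
x_lin = list(range(1, 11))
y_lin = [9 + 2 * xi for xi in x_lin]
ratio_linear = relative_error(x_lin, y_lin)

# Independent uniform random data (fixed seed for repeatability):
# the ratio stays close to 1.
rng = random.Random(0)
x_rand = [rng.random() for _ in range(100)]
y_rand = [rng.random() for _ in range(100)]
ratio_random = relative_error(x_rand, y_rand)

print(ratio_linear, ratio_random)
```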
delete x, y, y1