Statistics Toolbox: predicting new data with nonlinear regression

Hi!
After building a nonlinear model, the example in the MATLAB Statistics Toolbox documentation predicts the response for new data. I don't understand
'Xnew = [200,200,200;100,200,100;500,50,5];'
- we have 3 columns with 13 rows each, plus our output (the rate). Why does this row have 9 values? And what does ypred contain?
Please see the example below:
This is our data (X):
470 300 10
285 80 10
470 300 120
470 80 120
470 80 10
100 190 10
100 80 65
470 190 65
100 300 54
100 300 120
100 80 120
285 300 10
285 190 120
Output data (Y):
8.55000000000000
3.79000000000000
4.82000000000000
0.0200000000000000
2.75000000000000
14.3900000000000
2.54000000000000
4.35000000000000
13
8.50000000000000
0.0500000000000000
11.3200000000000
3.13000000000000
.... (Building model - Step 1-5)....
Step 6. Predict for new data
Create some new data and predict the response from both models.
Xnew = [200,200,200;100,200,100;500,50,5];
[ypred yci] = predict(mdl,Xnew)
ypred =
1.8762
6.2793
1.6718
yci =
1.6283 2.1242
5.9789 6.5797
1.5589 1.7846
[ypred1 yci1] = predict(mdl1,Xnew)
ypred1 =
1.8984
6.2555
1.6594
yci1 =
1.6260 2.1708
5.9323 6.5787
1.5345 1.7843
Even though the model coefficients are dissimilar, the predictions are nearly identical.
Thank you!

 Accepted Answer

They estimate a nonlinear model that has three independent variables (input variables) and five coefficients. Then they predict, which means they calculate the output for given values of the input variables. Because the model contains three independent variables, each output value requires three input values. They do this three times and so calculate three output values. That is why the matrix for the input variables is a 3 x 3 matrix: three sets of three input values, one set per row. So Xnew is not a row vector of 9 elements, as you write in your question, but a 3 x 3 matrix. The variable ypred contains the three values of the output variable that were calculated from the rows of Xnew.
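To make the row-by-row reading concrete, here is a minimal sketch. The function modelfun and the coefficient vector b are hypothetical stand-ins with three inputs and five coefficients; they are not the fitted model or the fitted values from the documentation example, so the numbers it produces will not match ypred above.

```matlab
% Each row of Xnew is one new observation; each column is one input variable.
Xnew = [200,200,200; 100,200,100; 500,50,5];
[nObs, nInputs] = size(Xnew);        % 3 observations, 3 input variables

% Hypothetical model with five coefficients (placeholder values, NOT fitted).
b = [1; 0.05; 0.02; 0.1; 1];
modelfun = @(b,x) (b(1)*x(:,2) - x(:,3)/b(5)) ./ ...
                  (1 + b(2)*x(:,1) + b(3)*x(:,2) + b(4)*x(:,3));

% Evaluating the whole matrix gives one output value per row ...
ypredAll = modelfun(b, Xnew);        % 3-by-1 column vector

% ... which is the same as evaluating the rows one at a time.
ypredRow = zeros(nObs, 1);
for i = 1:nObs
    ypredRow(i) = modelfun(b, Xnew(i,:));
end

disp([ypredAll, ypredRow])           % the two columns agree
```

predict(mdl,Xnew) follows the same convention: one row of Xnew in, one element of ypred out.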

6 Comments

Thank you so much, I wish the documentation gave explanations this detailed. And the yci:
yci =
1.6283 2.1242
5.9789 6.5797
1.5589 1.7846
tells me that the confidence interval for output 1 is between 1.6 and 2.1, for the second between 5.9 and 6.5, and for the third between 1.5 and 1.7, right?
One last question: they plot the residuals and then say that "The model seems adequate for the data". How do they come to this conclusion?
Thank you :)
Yes, you're right about the confidence interval.
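As a small sketch of how to read yci, using the numbers quoted in this thread: each row of yci belongs to the same row of ypred, with the lower bound in column 1 and the upper bound in column 2. As far as I know, predict uses a 95% confidence level unless you change the 'Alpha' option; treat that as an assumption and check the documentation for your release.

```matlab
% Values copied from the output quoted in the question.
ypred = [1.8762; 6.2793; 1.6718];
yci   = [1.6283 2.1242;
         5.9789 6.5797;
         1.5589 1.7846];

lower = yci(:,1);                    % lower bound for each prediction
upper = yci(:,2);                    % upper bound for each prediction

% Every prediction lies inside its own interval.
inside = ypred > lower & ypred < upper;

% Half-width of each interval, a rough measure of the uncertainty.
halfWidth = (upper - lower) / 2;

for i = 1:numel(ypred)
    fprintf('row %d: %.4f in [%.4f, %.4f], half-width %.4f\n', ...
            i, ypred(i), lower(i), upper(i), halfWidth(i));
end
```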
When you do a regression you want the residuals to be normally distributed. MATLAB plots the distribution of the residuals. Most are close to zero. More importantly, there are no outliers. The distribution looks normal. That is why they conclude that the model seems adequate. You could do more tests to check whether the model is adequate (for example, test whether the residuals are normally distributed, or whether there is correlation between the residuals and the input variables, ...).
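A rough sketch of those extra checks, using the X and Y values from the question. Two things here are my assumptions, because the model-building steps are not shown in this thread: that the model function is the toolbox's hougen function, and the start values beta0. With only 13 observations none of these checks is conclusive.

```matlab
% Data copied from the question (13 observations, 3 input variables).
X = [470 300 10; 285 80 10; 470 300 120; 470 80 120; 470 80 10; ...
     100 190 10; 100 80 65; 470 190 65; 100 300 54; 100 300 120; ...
     100 80 120; 285 300 10; 285 190 120];
y = [8.55; 3.79; 4.82; 0.02; 2.75; 14.39; 2.54; 4.35; 13; 8.50; ...
     0.05; 11.32; 3.13];

% Assumption: Hougen-Watson model and these start values (not from the thread).
beta0 = [1; 0.05; 0.02; 0.1; 2];
mdl = fitnlm(X, y, @hougen, beta0);

res = mdl.Residuals.Raw;             % raw residuals, one per observation

% 1) Are the residuals centred on zero, without extreme values?
fprintf('mean = %.3g, max |residual| = %.3g\n', mean(res), max(abs(res)));

% 2) Normality: Lilliefors test (h = 0 means no evidence against
%    normality at the 5% level) and a normal probability plot.
h = lillietest(res);
fprintf('lillietest h = %d\n', h);
plotResiduals(mdl, 'probability');

% 3) Correlation between the residuals and each input variable.
for j = 1:size(X, 2)
    c = corrcoef(X(:,j), res);
    fprintf('corr(x%d, residuals) = %.3f\n', j, c(1,2));
end
```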
Similar question for linear regression: How do they know that "There is one possible outlier, with a value greater than 12. This is probably not truly an outlier. For demonstration, here is how to find and remove it."? How can they see that?
Thanks :)
The histogram tells you that there is a value greater than 12, and that value could be an outlier.
Sorry, but I still don't get it. How do you see that? Plotting the residuals doesn't really help me; I don't understand how I can see that there is a value greater than 12. Thanks :)
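One way to make such a value visible, sketched with made-up residuals (they are not the residuals of the documentation example): a histogram groups the residuals into bins along the x-axis, so an isolated bar to the right of 12 means that one residual is larger than 12. Listing the bin counts and then searching the residual vector directly shows the same thing in numbers. With a fitted model, the residuals would come from mdl.Residuals.Raw.

```matlab
% Made-up residuals for illustration only; with a fitted model they
% would come from mdl.Residuals.Raw instead.
res = [-3.1; 0.4; 1.2; -0.8; 2.5; -1.9; 12.6; 0.3; -2.2; 1.0];

% Bin the residuals the way a histogram does and print the bin counts.
% (The last bin also includes its right edge.)
edges  = -4:2:14;
counts = histcounts(res, edges);
for k = 1:numel(counts)
    fprintf('%3d to %3d: %d\n', edges(k), edges(k+1), counts(k));
end

% The same bins drawn as bars: the single bar between 12 and 14 is the
% candidate outlier.
histogram(res, edges);
xlabel('Residual');
ylabel('Count');

% Find which observation it is and how large its residual is.
outlierRows   = find(res > 12);
outlierValues = res(outlierRows);
fprintf('row %d has residual %.1f\n', outlierRows, outlierValues);
```

Once the row number is known, that observation can be inspected or excluded before refitting.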

Asked: 17 Jun 2014

Commented: 29 Jun 2014