Help with Linear Regression

I'm trying to fit a simple linear regression with fitlm, but when I plot the original data together with the line from the equation that fitlm reports, the line is far off. I'm using only the first 20 data points from the list:
mdl = fitlm(time(1:20),price(1:20))
mdl =
Linear regression model:
y ~ 1 + x1
Estimated Coefficients:
                    Estimate        SE        tStat       pValue
                   __________    _______    _______    _________
    (Intercept)    2.4736e+05      42743     5.7871    1.748e-05
    x1                -5.7654    0.99663    -5.7849    1.756e-05
Number of observations: 20, Error degrees of freedom: 18
Root Mean Squared Error: 0.0178
R-squared: 0.65, Adjusted R-Squared 0.631
F-statistic vs. constant model: 33.5, p-value = 1.76e-05
Then I prepared the plot using:
x=time(1:20)
y1=2.4736e+05+(x*-5.7654)
plot(time(1:20),price(1:20),x,y1)
This gives the attached plot. I've also attached the data. Please help with a better fit, or explain why the line is so far off. I'm not sure what I'm doing incorrectly.

 Accepted Answer

This works:
d = load('liu James Linear .mat');
Price = d.Price;
time = d.time;
mdl = fitlm(time, Price, 'linear')
ypred = predict(mdl, [min(time) max(time)]');
figure(1)
plot(time, Price, '+')
hold on
plot([min(time) max(time)]', ypred, '-r')
hold off
grid

4 Comments

Hey Star, Thanks for helping. I thought that this code
plot(time(1:20),price(1:20),x,y1)
would do the same as
plot(time, Price, '+')
hold on
plot([min(time) max(time)]', ypred, '-r')
hold off
Can you explain why my code produces such a large offset, or why it doesn't do the same thing? Is the way I wrote the slope formula incorrect, or did only the graph scaling change? Sorry, I hope you don't mind explaining.
Hey Star, I looked at it some more. It seems like my slope formula is incorrect. I don't get why it's incorrect. Thanks.
By the way, I changed the code above to use the range 1:20:
mdl = fitlm(time(1:20), Price(1:20), 'linear')
ypred = predict(mdl, time(1:20));
figure(1)
plot(time(1:20), Price(1:20), '+')
hold on
plot(time(1:20), ypred, '-r')  % ypred has 20 values, so x must be the same 20 time points
hold off
grid
mdl =
Linear regression model:
y ~ 1 + x1
Estimated Coefficients:
                    Estimate        SE        tStat       pValue
                   __________    _______    _______    _________
    (Intercept)    2.4736e+05      42743     5.7871    1.748e-05
    x1                -5.7654    0.99663    -5.7849    1.756e-05
Number of observations: 20, Error degrees of freedom: 18
Root Mean Squared Error: 0.0178
R-squared: 0.65, Adjusted R-Squared 0.631
F-statistic vs. constant model: 33.5, p-value = 1.76e-05
The mdl output is the same as above.
I wrote the slope formula as
y1=2.4736e+05+(x*-5.7654)
Is this incorrect?
That appears to be correct for the first 20 values. (I get the same result, not surprisingly.)
I would use the predict function rather than writing your own function to calculate the fit (that you then use to plot the line). The predict function uses full internal precision of the slope and intercept, while your equation uses only the precision that fitlm reports in its results. That is likely the reason your results seem to be in error. For example, for ‘time(1)’ and ‘time(20)’, predict gives [94.5918, 94.5157], and your equation gives [97.0681, 96.9920]. The difference will be noticeable on the plot.
So your equation is mathematically correct, but computationally incorrect, in that it does not use the full precision that fitlm calculates.
You have to ask for the full precision parameter estimates with:
coefs = mdl.Coefficients.Estimate;
Your equation rewritten as:
y1 = coefs(1) + coefs(2)*x;
then gives results identical to those predict produces.
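A rough sketch of why five displayed digits are not enough here. The displayed intercept 2.4736e+05 can differ from the stored value by up to 5, and the displayed slope -5.7654 by up to 5e-5. The coefficients and predictions quoted above imply time values of roughly 4e4 (an inference from those numbers, not something read from the data file), so the slope rounding alone can shift the line by about 2. The example below uses synthetic data, not the poster's file, and the variable names are made up for illustration. It assumes the Statistics and Machine Learning Toolbox (for fitlm and predict) and R2014b or later (for the 'significant' option of round).

```matlab
% Synthetic data (an assumption for illustration; NOT the poster's data):
% large x values spanning a very narrow range, with a small negative trend.
rng(0);                                   % reproducible noise
x = 42888 + linspace(0, 0.0132, 20)';     % hypothetical time stamps
y = 94.6 - 5.77*(x - x(1)) + 0.02*randn(20, 1);

mdl = fitlm(x, y);                        % same kind of fit as in the thread

% Full-precision coefficients versus coefficients rounded to 5 significant digits
coefs    = mdl.Coefficients.Estimate;
coefsRnd = round(coefs, 5, 'significant');

yPredict = predict(mdl, x);               % reference values
yFull    = coefs(1)    + coefs(2)*x;      % hand-written line, full precision
yRounded = coefsRnd(1) + coefsRnd(2)*x;   % what typing in the displayed numbers amounts to

fprintf('max |full - predict|    = %g\n', max(abs(yFull - yPredict)));
fprintf('max |rounded - predict| = %g\n', max(abs(yRounded - yPredict)));

% Upper bound on the error caused purely by rounding the coefficients:
% |d0 + d1*x| <= |d0| + |d1|*max|x|
bound = abs(coefs(1) - coefsRnd(1)) + abs(coefs(2) - coefsRnd(2))*max(abs(x));
fprintf('rounding error bound    = %g\n', bound);
```

The first printed difference should be negligible, while the second can be large compared with the spread of y, which is the kind of offset seen in the original plot.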
