Least Squares in Matlab

5 views (last 30 days)
Megan
Megan on 7 Dec 2014
Edited: John D'Errico on 7 Dec 2014
I'm stuck on part (d) I'm not sure how to code it so that it approximates that function in matlab.
I'm also not sure if my (a) thru (c) are correct. But this is what I have so far.
This is my code..
%%Problem 1
t = [1990; 1910; 1920; 1930; 1940; 1950; 1960; 1970; 1980; 1990];
b = [75.995; 91.972; 105.711; 123.203; 131.669; 150.697; 179.323; 203.212; 226.505; 249.633];
t2 = t.^2;
x = (1900:0.01:1990);
O = ones(10,1);
A = [O, t, t2];
disp('The Regular Matrix A is: ')
disp(A)
disp('The vector b is: ')
disp(b)
K = A'*A;
Ab = A'*b;
v = K\Ab;
v = flipud(v);
disp('v is: ')
disp(v)
px = polyval(v,x);
xlabel('Population of the U.S. 1900-1990');
ylabel('Millions');
plot(t,b,'.',x,px)
%c
disp('The Rank of A is: ')
disp(rank(A))
disp('The Cholesky decomposition is: ')
R = chol(A'*A);
disp(R)
L = chol(A'*A,'lower')
disp(L)
z = L\v
w = R\z
%(d)
B = [O, t]
u = log(b)
K2 = B'*B
Bu = B'*u
r = K2\Bu;
r = flipud(r)
qx = polyval(r,x);
norm(v)
norm(r)
xlabel('Population of the U.S. 1900-1990');
ylabel('Millions');
plot(t,b,'.',x,qx)
It seems like my condition numbers are off and the second graph makes no sense to me. I can do this by hand and I know for (d) you need to take the log to linearize it. I'm just so horrible at coding that I most likely didn't do something correctly. Thank you
  9 Comments
John D'Errico
John D'Errico on 7 Dec 2014
I won't comment on the use of the normal equations here, since it was part of the assignment to use them. I would point out something that linear algebra gives us. For example, you compute a Cholesky factor of A'*A. Better is to avoid forming A'*A in the first place. I.e., if we compute
[Q,R] = qr(A);
so we know that A = Q*R, where Q is an orthogonal matrix, and A is upper triangular, then what is A'*A?
A'*A = R'*Q'*Q*R
but Q is orthogonal! So we have
A'*A = R'*R
Essentially, a Cholesky factor of A'*A is given by the QR factorization, but since we never need form the matrix A'*A, we do not induce the serious loss of accuracy caused by that operation. We can do even a bit better if we use the column pivoted QR form, which is more stable yet, but then you need to recognize what the column pivoting does to your Cholesky factor. It is not a serious problem, a simple permutation of your variables only.
John D'Errico
John D'Errico on 7 Dec 2014
Edited: John D'Errico on 7 Dec 2014
A final comment is that it is significantly better to shift your x variable. Subtracting 1900 is ok, but better is to subtract something like 1945. For example, try doing so, then compute the condition numbers of the associated matrix. For example...
t = (1900:10:1990);
cond(bsxfun(@power,t,0:3))
ans =
3.0846e+15
cond(bsxfun(@power,t-1900,0:3))
ans =
9.0051e+05
cond(bsxfun(@power,t-1945,0:3))
ans =
68994
The differences are dramatic. Of course, if you form A'*A, they are much more dramatic. This is why it is a good idea to "mean center" your data when you will build a polynomial model.
A = bsxfun(@power,t,0:3);
cond(A'*A)
ans =
4.4934e+25
A = bsxfun(@power,t-1900,0:3);
cond(A'*A)
ans =
8.1093e+11
A = bsxfun(@power,t-1945,0:3);
cond(A'*A)
ans =
4.7601e+09
As you can see, the problem is essentially unsolvable in double precision arithmetic if you are not careful, but it is no problem if you take the proper precautions.
Finally, see what happens when I both center AND scale the time variable...
cond(bsxfun(@power,(t-1945)/45,0:3))
ans =
7.0441
A trivial thing to do, but it makes the linear algebra so much more nicely behaved.

Sign in to comment.

Answers (0)

Categories

Find more on Manual Performance Optimization in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!