Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
confidence intervals regress

Subject: confidence intervals regress

From: leo nidas

Date: 22 Aug, 2008 05:51:02

Message: 1 of 6


Hi there,

Suppose I have the simple linear model y=b0+b1*x1+b2*x2+e.
I want to estimate b0,b1,b2 and calculate a confidence
interval for them. Then I want to see how many times the CI
contains the true parameter. If I was making everything
right, for 1000 repetitions, that should be approximately
950 times. But I am getting 1000 times as a result. What
wrong with my code?

And another question. The formula used for estimating 95%
CI in the regress function is b0_hat-Za*SE(b0_hat) where
Za=1.96 right? But nor this formula, nor the more precise
Za=1.95996398454002 seems to agree exactly (4 decimals)
with the result given by bint in the regress function.
Precision is not much of a problem, all I want to know is
the correct formula.

And what about confidence intervals when I have weighted
regression (lscov). The formula there uses a t distribution?

n=200;
b0=20;
b1=3;
b2=4;
lam=3;
for rep=1:1000
e=normrnd(0,10,1,n);
x1=exprnd(lam,1,n)';
x2=exprnd(5,1,n)';
X = [ones(size(x1)) x1 x2];
y=b0+b1.*x1+b2.*x2+e';
[b_ols bint]=regress(y,X);

cover1(rep)=(bint(1)<b0<bint(4));
cover2(rep)=(bint(2)<b1<bint(5));
cover3(rep)=(bint(3)<b2<bint(6));
end

cover1=sum(cover1)
cover2=sum(cover2)
cover3=sum(cover3)

Subject: confidence intervals regress

From: John D'Errico

Date: 22 Aug, 2008 11:04:02

Message: 2 of 6

"leo nidas" <bleonidas25@yahoo.gr> wrote in message
<g8lk46$5v1$1@fred.mathworks.com>...
>
> Hi there,
>
> Suppose I have the simple linear model y=b0+b1*x1+b2*x2+e.
> I want to estimate b0,b1,b2 and calculate a confidence
> interval for them. Then I want to see how many times the CI
> contains the true parameter. If I was making everything
> right, for 1000 repetitions, that should be approximately
> 950 times. But I am getting 1000 times as a result. What
> wrong with my code?
>
> And another question. The formula used for estimating 95%
> CI in the regress function is b0_hat-Za*SE(b0_hat) where
> Za=1.96 right? But nor this formula, nor the more precise
> Za=1.95996398454002 seems to agree exactly (4 decimals)
> with the result given by bint in the regress function.
> Precision is not much of a problem, all I want to know is
> the correct formula.
>
> And what about confidence intervals when I have weighted
> regression (lscov). The formula there uses a t distribution?
>
> n=200;
> b0=20;
> b1=3;
> b2=4;
> lam=3;
> for rep=1:1000
> e=normrnd(0,10,1,n);
> x1=exprnd(lam,1,n)';
> x2=exprnd(5,1,n)';
> X = [ones(size(x1)) x1 x2];
> y=b0+b1.*x1+b2.*x2+e';
> [b_ols bint]=regress(y,X);
>
> cover1(rep)=(bint(1)<b0<bint(4));
> cover2(rep)=(bint(2)<b1<bint(5));
> cover3(rep)=(bint(3)<b2<bint(6));

Did you know that this is not the correct way
to make that test?

(bint(1)<b0<bint(4))

is NOT equivalent (in MATLAB) to the test

(bint(1)<b0) && (b0<bint(4))

What does matlab do when you make this
comparison

(bint(1)<b0)

Now, what happens when it takes that result,
and compares it to bint(4)?

HTH,
John

Subject: confidence intervals regress

From: leo nidas

Date: 22 Aug, 2008 12:50:19

Message: 3 of 6


Thanks for the help John. Now it's working. Could you
confirm me that the formula MATLAB is using for the
confidence intervals is b0_hat-Za*SE(b0_hat) for b0 for
example? If I pass the formula myself then the limits of
the intervals may differ in the fourth decimal even if I
use Za=1.95996398454002. Is it just a round off error ?

For the weighted regression the formula is the same?

Subject: confidence intervals regress

From: Tom Lane

Date: 22 Aug, 2008 13:25:48

Message: 4 of 6

> Thanks for the help John. Now it's working. Could you
> confirm me that the formula MATLAB is using for the
> confidence intervals is b0_hat-Za*SE(b0_hat) for b0 for
> example? If I pass the formula myself then the limits of
> the intervals may differ in the fourth decimal even if I
> use Za=1.95996398454002. Is it just a round off error ?

Leo, you are describing a normal approximation to the confidence interval.
For finite samples it's more accurate to use a t distribution instead. So
Za in your example should be replaced by a quantile from the t distribution.

Type "edit regress" and search for "tinv" to see how it is done.

The t distribution has a "degrees of freedom" parameter equal to the number
of observations minus the number of estimated coefficients. This approaches
the normal value as the parameter gets large:

>> tinv(1-0.05/2, [1 2 5 10 50 100 Inf])
ans =
   12.7062 4.3027 2.5706 2.2281 2.0086 1.9840 1.9600

-- Tom

Subject: confidence intervals regress

From: leo nidas

Date: 24 Aug, 2008 08:14:02

Message: 5 of 6

>
> >> tinv(1-0.05/2, [1 2 5 10 50 100 Inf])
> ans =
> 12.7062 4.3027 2.5706 2.2281 2.0086
1.9840 1.9600
>
> -- Tom
>
>

Thanks for the help Tom! That worked!For the weighted least
squares in the documentation for the lscov it is said that

b=inv(X'*inv(V)*X)*X'*inv(V)*y, which is the formula I also
find in Wikipedia. But this doesn't work I think. This
formula works instead inv(X'*V*X)*X'*inv(V)*y
 (i.e. agrees with the result of lscov):


x = (1:10)';
y = 10 - 2*x + randn(10,1);
y(10) = 0;
X=[ones(1,length(x))' x];


w=[1 1 1 1 1 1 1 1 1 0.5]';

[bw,sew_b,msew] = lscov([ones(1,length(x))' x],y,w)
V=diag(w);
bw_2=inv(X'*V*X)*X'*inv(V)*y
bw_wikipedia=inv(X'*inv(V)*X)*X'*inv(V)*y


bw_2 seems to be working. Could you please explain me what
am I doing wrong because I am a little confused with the
formulas here... Thanks again!!

Subject: confidence intervals regress

From: Peter Perkins

Date: 25 Aug, 2008 13:19:43

Message: 6 of 6

leo nidas wrote:

> Thanks for the help Tom! That worked!For the weighted least
> squares in the documentation for the lscov it is said that
>
> b=inv(X'*inv(V)*X)*X'*inv(V)*y, which is the formula I also
> find in Wikipedia. But this doesn't work I think. This
> formula works instead inv(X'*V*X)*X'*inv(V)*y
> (i.e. agrees with the result of lscov):

The formula you cite is for generalized least squares, i.e.

>> help lscov
[snip]
    X = LSCOV(A,B,V), where V is an M-by-M symmetric (or hermitian) positive
    definite matrix, returns the generalized least squares solution to the
    linear system A*X = B with covariance matrix proportional to V, i.e., X
    minimizes (B - A*X)'*inv(V)*(B - A*X).

That's not treated the same as weighted least squares:

    X = LSCOV(A,B,W), where W is a vector length M of real positive weights,
    returns the weighted least squares solution to the linear system A*X =
    B, i.e., X minimizes (B - A*X)'*diag(W)*(B - A*X). W typically
    contains either counts or inverse variances.

Hope this helps.

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us