MATLAB Answers


How do I compute the confidence interval of this random variable manually?

Asked by promo
on 30 Jan 2013


My goal is to compute the 95% confidence interval of a random variable, say X, distributed according to the normal distribution with mean 2 and variance 9/100. The code I have so far is as follows


conventionalCI = zeros(2,1000);
for i=1:1000
%reset values for next round, i+1
rr = zeros(1,200);
%Here we are drawing directly from the distrution N(2,9/100), which is the
%distribution of the sameple means of a random variable from N(2,9). 
rr = 2 + (3/10).*normrnd(0,1,1,200);
%The quantiles for the data above is recorded
conventionalCI(1,i) = quantile(sort(rr),.975);
conventionalCI(2,i) = quantile(sort(rr),.025);
%Proportion (for conventional confidence intervals)
propCI = zeros(1,1000);
for i=1:1000
    if (2>=conventionalCI(2,i)) & (conventionalCI(1,i)>=2)
proportionCI = sum(propCI)/1000


In essence what I'm doing is drawing from the distribution N(2,9/100) with a sample of size 200, saving the 97.5 and 2.5 percentiles in the variable conventionalCI and calculating the proportion of times the true mean, 2, falls in the intervals in conventionalCI.

The problem is I keep getting the 1 for the proportion of times 2 falls in the interval which doesn't make sense given that I'm trying to construct a 95% confidence interval. Can anyone figure out why this is the case?

I appreciate your help



No products are associated with this question.

3 Answers

Answer by the cyclist
on 30 Jan 2013
Edited by the cyclist
on 2 Feb 2013

Suppose your friend gives you a sample (drawn from a parent distribution that you do not know). Maybe that sample is

>> x = [1 1.5 1.8 1.9 1.9 1.9 1.9 2 2 2 2 2 2.1 2.1 2.1 2.1 2.2 2.5 3];

It happens to have a sample mean of exactly 2. (If I were not lazy, I could have given you lots more data, and told you the confidence interval as well.) You do not know what the parent mean is. The parent distribution might have a mean of 2.04, or 1.93. The most likely parent mean is 2, and parent means further and further from 2 are statistically less likely. Under normal distribution assumptions, you can calculate how likely each value of the parent mean is.

Now -- and this is the point that is specifically important for your code -- you cannot test that CI by drawing over and over again from a parent distribution with mean 2. That would just be assuming the exact thing you are attempting to test! You will almost always ( way more than 95%) have a sample mean within your CI.

Instead, you would need to test by drawing from the entire range of possible parent distributions, in proportion to their probability of having generated the sample data that your friend gave you.

This is a very subtle but important distinction. I hope I've helped your understanding a little.


OK, so I finally found some time to sit down and write out some code that illustrates what you can do. I tried to comment it richly and use explicit variable names to help you understand what's going on. The core of what you tried to do is still there, which I hope will guide you.

function proportionCI = howOftenConfidenceIntervalContainsMean()
% Set the random number seed. [Older MATLAB version don't have the rng() function. Modify accordingly.]
% Create a "random" sample. For the rest of the code, you have to pretend that you never
% knew the true parameters of the parent distribution.
rr = 2 + 3*normrnd(0,1,1,SAMPLESIZE);
sampleMean = mean(rr);
sampleStandardDeviation = std(rr);
sampleStandardErrorOfMean = sampleStandardDeviation/sqrt(SAMPLESIZE);
confidenceIntervalOfMean = sampleMean + 1.96*sampleStandardErrorOfMean*[1; -1];
% Now, here's the slightly tricky part. You want the distribution of the possible parent means
% that might have resulted in your sample. But you don't know what those are. (You actually
% can derive it theoretically, but we are going to get it here by simulation.)
% The trick that enable us to do this by sampling is that we have an APPROXIMATION to the
% parent distribution ... our one sample! So, because we can't sample the parent, instead
% we SAMPLE THE SAMPLE! (This trick is known as resampling.) We will resample many times,
% and each one of those resamples gives us an approximation to the parent mean. We check those
% approximate parent means to see if they lie within the confidence interval as often as we
% expect.
% Do the resampling (lots of times), and see what proportion of the (approximate) parent means
% lie within the confidence intervals
      if mod(ni,1000)==0
          disp(['Trial ',num2str(ni),' of ',num2str(NUMBER_RESAMPLE_TRIALS)])
      % For each trial we generate a (re)sample.
      resampledSample = rr(randsample(SAMPLESIZE,SAMPLESIZE,true));
      resampledMean = mean(resampledSample); % Each trials gives one approximation of the parent mean
      % Test to see if the parent mean lies within the confidence interval
      if (resampledMean>=confidenceIntervalOfMean(2)) && (confidenceIntervalOfMean(1)>=resampledMean)
proportionCI = sum(propCI)/NUMBER_RESAMPLE_TRIALS


the cyclist
on 30 Jan 2013

No, it's not a trivial couple lines of code, I don't believe. Sorry. But if you understand the theory well, there is nothing too difficult to code.

Just as a bit more background, this presentation has the theory that I am talking about. About 3/4 of the way in, there is a slide with "Most likely value of mean is the sample mean" written in red font. (The slides aren't numbered, sorry.) At the top of that slide you see the formula for the relative probability that the sample comes from a parent with true mean mu.

I have edited my answer to include code that does what promo wanted. It's not "one or two instructions", but at least it's not just conceptual now.

hi, yes now it looks good how to manipulate the same sample instead of generating many .

thanks .

Answer by Youssef Khmou
on 30 Jan 2013


Try this modified version, :

%KHMOU Youssef, Statistics :exmpl1 30/01/2013 5:00 AM




conventionalCI = zeros(2,N);


%sample=sort(sample); %p=length(sample); %n=p/10;

for i=1:N

    if (2.00000>=b) && (a>=2.00000)


proportionCI = sum(propCI)/N



After each iteration i, the RANDOM sample increases in length, Does it give the result you want ? anyway try to change the number of samples per example :





As the the number of samples increases, th mean tends to 1.

i Hope that helps :

KHMOU Yussef.


Answer by Youssef Khmou
on 4 Feb 2013

HI promo,

The code provided by "the cyclist" does exactly the task ,

Just as an addition, tape in the command Window :


the GUI uses the "resampling" technique mentioned above .


Join the 15-year community celebration.

Play games and win prizes!

Learn more
Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

MATLAB Academy

New to MATLAB?

Learn MATLAB today!