# The Integration of Gaussian PDF to obtain the CDF why don’t I get the correct answer?

31 views (last 30 days)
Lino Gonzalez on 6 Nov 2013
Answered: Jonathan LeSage on 7 Nov 2013
I am trying to run a simulation, but before I do I wanted to write a simple program to ensure I could get a correct answer.
I start off by generating 10,000 long vector using a Normally distributed pseudorandom number generator (randn) with a mean =0 and sigma =1.
nsamples=100000; f=0 f=f+(1)*randn(nsamples,1);
x = f;
m = mean(f)
s = std(f)
I calculate the mean and std using the standard MATLAB functions (mean and std) verifying mean and sigma.
I then plot the data using the standard normal equation
f(x,m,s)=(e^-0.5*(x-m/s)^2)/s*sqrt(2pi)
where x is the RV, s=std deviation, and m is the mean.
I then integrate the PDF from -1 to 1 using the following commend
p1 = -.5 * ((x - m)/s) .^ 2;
p2 = (s * sqrt(2*pi));
G = exp(p1) ./ p2;
fun = @(x) G;
cdf=integral(fun,-1,1,'ArrayValued',true);
This generates another vector. So I calculate the mean of the CDF vector (cdfmean=mean(cdf)) and instead of getting something around 68% I get 56%
why?

Jonathan LeSage on 7 Nov 2013
After reviewing your code, I was able to figure out what was troubling you. The flow of your code and equations are all correct, however, you're making a small mistake when creating the normal distribution function (fun = @(x) G;).
Basically what is happening is that you have defined the variable x as a vector of random number already. These vector values are incorrectly used in p1, where x is supposed to be a function variable. As a result, G is just a constant vector. The output of a definite integral should be a scalar value in this case (around 68% as you mentioned) and not a vector.
Fortunately, the fix is quite straightforward, you just need to remove the definition of x as f and then fix your function. Here is how I would implement the integral:
% Generate normally distributed random numbers
nSamples = 100000;
f = 1*randn(nSamples,1);
% Compute normal distribution statistics
m = mean(f);
s = std(f);
% Define normal distribution and integrate over [-1,1] interval
normFun = @(x) 1/(sqrt(2*pi)*s)*exp(-(x-m).^2./(2*s^2));
cdf = integral(normFun,-1,1)
Hope this helps to clarify what was happening!