Why does my CDF not match cumsum(pdf)?

x is a 1-D array of 100 discrete values, for which I calculate the Rayleigh PDF and CDF as below:
pdfrayl = 2 * x .* exp(-x .^ 2);
cdfrayl = 1 - exp(-x .^ 2);
Now I plot the following two lines:
plot(x, cdfrayl)
hold on;
plot(x, cumsum(pdfrayl) / sum(pdfrayl))
I expected them to match exactly, but they don't. Can anybody please explain why they don't match?

 Accepted Answer

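It does not appear to me that you are approximating the integral of the PDF with a Riemann sum correctly here: the cumulative sum has to be multiplied by the grid spacing \Delta x rather than divided by sum(pdfrayl).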
x = 0:0.01:10;
y = x/4.*exp(-x.^2/8);
% \Delta x for the Riemann sum: interval length divided by the number of intervals
dx = (x(end) - x(1))/(length(x) - 1);
yc = cumsum(y).*dx;
yct = 1-exp(-x.^2/8);
plot(x,yc,'r-.','linewidth',2);
hold on;
plot(x,yct,'b');
legend('Approximation of CDF','True CDF', ...
'Location','SouthEast');
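
To see where the mismatch in the question comes from, here is a minimal sketch that compares the three ways of building the CDF numerically. The question does not say which 100 values x holds, so the grid linspace(0, 3, 100) is an assumption, and the names cdfNorm, cdfRiemann and cdfTrapz are made up for illustration.

```matlab
% Assumed grid: the question only says x has 100 values, so this range is a guess.
x = linspace(0, 3, 100);
dx = x(2) - x(1);                    % uniform spacing of the grid

pdfrayl = 2 * x .* exp(-x .^ 2);     % PDF from the question
cdfrayl = 1 - exp(-x .^ 2);          % closed-form CDF from the question

% Normalizing by sum(pdfrayl) forces the last value to exactly 1,
% even though the true CDF at x(end) is slightly below 1.
cdfNorm = cumsum(pdfrayl) / sum(pdfrayl);

% Right-endpoint Riemann sum: scale by dx instead of renormalizing.
cdfRiemann = cumsum(pdfrayl) * dx;

% Trapezoidal rule on the same grid.
cdfTrapz = cumtrapz(x, pdfrayl);

errNorm = max(abs(cdfNorm - cdfrayl));
errRiemann = max(abs(cdfRiemann - cdfrayl));
errTrapz = max(abs(cdfTrapz - cdfrayl));
fprintf('max error: normalized %.2e, Riemann %.2e, trapezoid %.2e\n', ...
    errNorm, errRiemann, errTrapz);
```

On a coarse grid both cumsum variants are off by roughly half a step times the PDF value, so neither can match the closed-form CDF exactly. The trapezoidal version should land noticeably closer on this grid.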

More Answers (1)

Russ Adheaux on 15 Sep 2012
Thanks. What if my x vector is a random array that is not equally spaced (no single dx)? I guess I will have to sort x and use the dx between consecutive x values in cumsum. Correct?

3 Comments

The dx comes from the length of the interval, max(x)-min(x), divided by the number of intervals (one less than the number of points). Yes, you would have to sort the x. Do you know the ecdf function?
For unevenly spaced, monotonically-increasing x-data, I suggest trapz or cumtrapz, specifying both x and y data as arguments. It then allows you to integrate with respect to any x-variable spacing, and is more accurate than cumsum.
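
As a rough sketch of both suggestions, the snippet below sorts an unevenly spaced x, then integrates the PDF once with cumsum and per-interval widths and once with cumtrapz. The data are hypothetical (a squared linspace, shuffled to mimic an unsorted array), and ycSum and ycTrapz are illustrative names. It also assumes the grid starts at x = 0, where the CDF is 0. Otherwise both results would be offset by the CDF value at the first point.

```matlab
% Hypothetical data: unevenly spaced points on [0, 10] in random order.
x = 10 * linspace(0, 1, 400) .^ 2;
x = x(randperm(numel(x)));

x = sort(x);                          % integration needs ascending x
y = x / 4 .* exp(-x .^ 2 / 8);        % PDF from the accepted answer
yct = 1 - exp(-x .^ 2 / 8);           % closed-form CDF

% cumsum with per-interval widths (right-endpoint Riemann sum).
dx = diff(x);
ycSum = [0, cumsum(y(2:end) .* dx)];

% cumtrapz takes the x-spacing directly from its first argument.
ycTrapz = cumtrapz(x, y);

errSum = max(abs(ycSum - yct));
errTrapz = max(abs(ycTrapz - yct));
fprintf('max error: cumsum %.2e, cumtrapz %.2e\n', errSum, errTrapz);
```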
Hmmm...just learned about ecdf. I was using the following code to get the CDF of my x data. Is there an advantage to using ecdf directly? My ultimate goal is to fit a user-defined distribution function to this data.
xbins = 0:dx:5;
[n, xout] = hist(x, xbins);
yout = n/sum(n)/dx;
cdfy = cumsum(yout*dx);
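
For comparison, an empirical CDF can also be built without any binning by sorting the samples, which avoids having to pick a bin width. The sketch below uses a hypothetical Rayleigh sample drawn by inverse transform. The names xs, F and maxDev are illustrative. As far as I know, ecdf from the Statistics Toolbox returns essentially this step function for uncensored data, but treat that as an assumption and check the documentation.

```matlab
% Hypothetical sample: Rayleigh draws via inverse transform of 1 - exp(-x^2),
% using u in place of 1 - u since both are uniform on (0, 1).
n = 5000;
x = sqrt(-log(rand(1, n)));

xs = sort(x);                 % sorted sample values
F = (1:n) / n;                % fraction of samples <= xs(k)

Ftrue = 1 - exp(-xs .^ 2);    % closed-form CDF at the same points
maxDev = max(abs(F - Ftrue));
fprintf('max deviation from the closed-form CDF: %.3f\n', maxDev);
```

Plotting with stairs(xs, F) shows the step function. Unlike the histogram version, its resolution is set by the data rather than by dx.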
