Thread Subject:
ecdf exists. why there is no epdf

Subject: ecdf exists. why there is no epdf

From: Alex

Date: 9 Jun, 2011 05:51:04

Message: 1 of 6

Dear all,
I would like to ask you something concerning ecdf.
I think I am missing something from the theory.
Why is there an ecdf but nothing like an epdf (an empirical pdf)?

Best Regards
Alex

From: TideMan

Date: 9 Jun, 2011 05:55:47

Message: 2 of 6

On Jun 9, 5:51 pm, "Alex " <ala...@yahoo.com> wrote:
> Dear all,
> I would like to ask you something concerning ecdf.
> I think I miss something from the theory.
> Why there is ecdf and there is no something like epdf. Empirical pdf.
>
> Best Regards
> Alex

And now you've started yet another thread on the same topic.

WARNING: this person has started 3 different threads on exactly the
same topic.
In response to the first thread, he was told all he needs to know, but
he has chosen to ignore this.

This guy's wasting our time.

From: Roger Stafford

Date: 9 Jun, 2011 16:40:19

Message: 3 of 6

TideMan <mulgor@gmail.com> wrote in message <a18b1a52-7bbe-44fb-a4ae-202af9a504d3@s16g2000prf.googlegroups.com>...
> On Jun 9, 5:51 pm, "Alex " <ala...@yahoo.com> wrote:
> > Dear all,
> > I would like to ask you something concerning ecdf.
> > I think I miss something from the theory.
> > Why there is ecdf and there is no something like epdf. Empirical pdf.
> > Best Regards
> > Alex
>
> And now you've started yet another thread on the same topic.
> WARNING: this person has started 3 different threads on exactly the
> same topic.
> In response to the first thread, he was told all he needs to know, but
> he has chosen to ignore this.
> This guy's wasting our time.
- - - - - - - - - - - -
  TideMan, I think Alex is asking a legitimate question and it deserves a straight answer.

  In answer to your question, Alex, with empirical data it is relatively easy to produce a cumulative distribution curve from it that looks reasonably close to the true underlying distribution. It amounts to little more than sorting the data.

  Producing a valid approximation to the density is more difficult. One would be trying to deduce the derivative, or slope, of this same curve. Unfortunately, unless an enormous amount of data is furnished, the cumulative curve has a markedly jagged appearance and all estimates of density are bound to be highly inaccurate. The reluctance of MathWorks to write routines that could give extremely inaccurate results is understandable, though with appropriate filtering and/or interpolation such a routine could give smooth-looking, though perhaps still inaccurate, results.
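
To make the contrast concrete, here is a minimal sketch in plain Python (chosen only because it needs no toolbox; the helper names `ecdf` and `naive_density` are made up for this illustration and are not MATLAB functions). It builds the empirical CDF by sorting, then takes the raw slope between neighbouring points. The sample is synthetic standard-normal data, so the true density never exceeds about 0.4.

```python
import random

def ecdf(sample):
    """Return the sorted values and the ECDF heights i/n for i = 1..n."""
    xs = sorted(sample)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]

def naive_density(xs, fs):
    """Raw slope of the ECDF between consecutive sorted points."""
    slopes = []
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]
        if dx > 0:  # tied values would give an infinite slope, so skip them
            slopes.append((fs[i] - fs[i - 1]) / dx)
    return slopes

random.seed(0)  # synthetic data, for illustration only
data = [random.gauss(0.0, 1.0) for _ in range(200)]
xs, fs = ecdf(data)
slopes = naive_density(xs, fs)
print("smallest slope:", min(slopes))
print("largest slope: ", max(slopes))
```

The ECDF itself is well behaved (it climbs from 1/n to 1), but the slopes swing widely because each one is 1/n divided by a single random gap between neighbouring samples.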

  As you may be aware, using purely empirical results to demonstrate reliably that a given process has a certain probability distribution is an undertaking that requires an enormous amount of data even for a single variable. For more than one variable it becomes very much more difficult to achieve empirically and often has to be supported by other types of evidence or arguments.

Roger Stafford

From: TideMan

Date: 9 Jun, 2011 20:02:23

Message: 4 of 6

On Jun 10, 4:40 am, "Roger Stafford"
<ellieandrogerxy...@mindspring.com.invalid> wrote:
> TideMan <mul...@gmail.com> wrote in message <a18b1a52-7bbe-44fb-a4ae-202af9a50...@s16g2000prf.googlegroups.com>...
> > On Jun 9, 5:51 pm, "Alex " <ala...@yahoo.com> wrote:
> > > Dear all,
> > > I would like to ask you something concerning ecdf.
> > > I think I miss something from the theory.
> > > Why there is ecdf and there is no something like epdf. Empirical pdf.
> > > Best Regards
> > > Alex
>
> > And now you've started yet another thread on the same topic.
> > WARNING: this person has started 3 different threads on exactly the
> > same topic.
> > In response to the first thread, he was told all he needs to know, but
> > he has chosen to ignore this.
> > This guy's wasting our time.
>
> - - - - - - - - - - - -
>   Tideman, I think Alex is asking a legitimate question and it deserves a straight answer.
>
>   In answer to your question, Alex, with empirical data it is relatively easy to produce a cumulative distribution curve from it that looks reasonably close to the true underlying distribution.  It amounts to little more than sorting the data.
>
>   Exhibiting a valid density distribution approximation is more difficult.  One would be trying to deduce the derivative or slope of this same curve.  Unfortunately, unless there is an enormous amount of data furnished, the cumulative curve has a markedly jagged appearance and all estimates of density are bound to be highly inaccurate.  The reluctance of Mathworks to write routines that could give extremely inaccurate results is understandable, though if the appropriate filtering and/or interpolation were done it is something that could give smooth-looking, though perhaps still inaccurate, results.
>
>   As you may be aware, using purely empirical results to demonstrate reliably that a given process has a certain probability distribution is an undertaking that requires an enormous amount of data even for a single variable.  For more than one variable it becomes very much more difficult to achieve empirically and often has to be supported by other types of evidence or arguments.
>
> Roger Stafford

Well, for a mere engineer who deals with real physical data, the
empirical PDF is simply the histogram, as I pointed out to OP in his
original thread.
Integrating the PDF using cumsum produces the empirical CDF.
You have lots of freedom in choosing the bins for the histogram and
this governs how smooth the CDF is.
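
A rough sketch of that recipe, written in plain Python rather than MATLAB so that it runs without any toolbox. The function name `histogram_pdf` and the choice of 20 equal-width bins are assumptions made for this illustration. `itertools.accumulate` plays the role of cumsum here.

```python
import random
from itertools import accumulate

def histogram_pdf(sample, lo, hi, nbins):
    """Equal-width histogram scaled so that sum(density * width) == 1.

    Assumes every value in sample lies in [lo, hi] and that hi > lo.
    """
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in sample:
        k = min(int((v - lo) / width), nbins - 1)  # v == hi goes in the last bin
        counts[k] += 1
    n = len(sample)
    density = [c / (n * width) for c in counts]
    edges = [lo + k * width for k in range(nbins + 1)]
    return edges, density, width

random.seed(1)  # synthetic data, for illustration only
data = [random.gauss(0.0, 1.0) for _ in range(500)]
edges, density, width = histogram_pdf(data, min(data), max(data), 20)

# Running sum of bin areas: the CDF evaluated at each bin's right edge
cdf_at_right_edges = list(accumulate(d * width for d in density))
print(cdf_at_right_edges[-1])
```

The cumulative values describe the CDF only at the bin edges, so fewer, wider bins give a smoother-looking density at the cost of a coarser CDF.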

From: Alex

Date: 10 Jun, 2011 05:33:05

Message: 5 of 6

"Roger Stafford" wrote in message <isqt1j$1qd$1@newscl01ah.mathworks.com>...
> TideMan <mulgor@gmail.com> wrote in message <a18b1a52-7bbe-44fb-a4ae-202af9a504d3@s16g2000prf.googlegroups.com>...
> > On Jun 9, 5:51 pm, "Alex " <ala...@yahoo.com> wrote:
> > > Dear all,
> > > I would like to ask you something concerning ecdf.
> > > I think I miss something from the theory.
> > > Why there is ecdf and there is no something like epdf. Empirical pdf.
> > > Best Regards
> > > Alex
> >
> > And now you've started yet another thread on the same topic.
> > WARNING: this person has started 3 different threads on exactly the
> > same topic.
> > In response to the first thread, he was told all he needs to know, but
> > he has chosen to ignore this.
> > This guy's wasting our time.
> - - - - - - - - - - - -
> Tideman, I think Alex is asking a legitimate question and it deserves a straight answer.
>
> In answer to your question, Alex, with empirical data it is relatively easy to produce a cumulative distribution curve from it that looks reasonably close to the true underlying distribution. It amounts to little more than sorting the data.
>
> Exhibiting a valid density distribution approximation is more difficult. One would be trying to deduce the derivative or slope of this same curve. Unfortunately, unless there is an enormous amount of data furnished, the cumulative curve has a markedly jagged appearance and all estimates of density are bound to be highly inaccurate. The reluctance of Mathworks to write routines that could give extremely inaccurate results is understandable, though if the appropriate filtering and/or interpolation were done it is something that could give smooth-looking, though perhaps still inaccurate, results.
>
> As you may be aware, using purely empirical results to demonstrate reliably that a given process has a certain probability distribution is an undertaking that requires an enormous amount of data even for a single variable. For more than one variable it becomes very much more difficult to achieve empirically and often has to be supported by other types of evidence or arguments.
>
> Roger Stafford

Could you please try to quantify how much data one needs to demonstrate reliably that a process has a certain probability distribution? Are we talking about thousands, tens of thousands, or even more?

Regards
Alex

From: Jeff

Date: 28 Jul, 2011 14:47:30

Message: 6 of 6

TideMan <mulgor@gmail.com> wrote in message <a18b1a52-7bbe-44fb-a4ae-202af9a504d3@s16g2000prf.googlegroups.com>...
> On Jun 9, 5:51 pm, "Alex " <ala...@yahoo.com> wrote:
> > Dear all,
> > I would like to ask you something concerning ecdf.
> > I think I miss something from the theory.
> > Why there is ecdf and there is no something like epdf. Empirical pdf.
> >
> > Best Regards
> > Alex
>

As Roger has pointed out, this problem is more difficult than it would first seem.

You seem to blur the distinction between two problems. The first question is: what distribution do the data fit? The second is: how do I construct a smooth-looking graph of a PDF?

For the first question, you really should just buy the Statistics Toolbox, which has the interactive dfittool that lets you visually compare your data (plotted in a variety of scales including PDF and CDF) with a distribution fitted to those data (and you have a large choice of distributions). I generally make the choice by eye, depending on what I know about the data (e.g., constrained to be non-negative, bound to some range like 0-1, etc).

To batch process a bunch of data with multiple distributions, the Statistics Toolbox has the fitdist function. Finally, to test how well data fit a putative distribution, you can use the Kolmogorov-Smirnov test: kstest compares one sample against a hypothesized distribution, and kstest2 compares two samples against each other.
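
For readers who want to see what the test measures, here is a minimal sketch of the one-sample Kolmogorov-Smirnov distance in plain Python. It is not the toolbox's kstest; `ks_statistic` is a name invented for this illustration, the data are synthetic, and no p-value is computed.

```python
import random
from statistics import NormalDist, mean, stdev

def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov distance: sup over x of |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        # The ECDF jumps from i/n to (i+1)/n at x, so check both sides of the step
        d = max(d, (i + 1) / n - fx, fx - i / n)
    return d

random.seed(2)  # synthetic data, for illustration only
data = [random.gauss(10.0, 2.0) for _ in range(300)]

fitted = NormalDist(mean(data), stdev(data))  # normal fitted to the sample
d_fitted = ks_statistic(data, fitted.cdf)
d_wrong = ks_statistic(data, NormalDist(0.0, 1.0).cdf)  # deliberately poor candidate
print("distance to fitted normal:  ", d_fitted)
print("distance to standard normal:", d_wrong)
```

One caveat: when the candidate distribution's parameters are estimated from the same data, as with `fitted` above, the standard Kolmogorov-Smirnov critical values no longer apply exactly, so the distance is best read as a comparison between candidates rather than a formal test.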

The alternative question is: how do I construct a smooth curve, either from a histogram or from the output of ecdf? Here the problem is how to smooth the curve while constraining the integral of the new PDF to agree with the bins in the histogram (i.e., we know how many observations are in a bin, and we want the integral of the PDF across that bin's boundaries to be consistent with the number of observations). The cool word for this is "pycnophylactic" interpolation, about which Tobler has written some great papers. I generally do this by getting [f,x] from ecdf, smoothing the resulting staircase, and then differentiating to calculate the PDF.
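
The last sentence can be sketched roughly as follows, in plain Python. This is only one possible reading of "smooth, then differentiate": it uses a simple moving average, which is an assumption of this sketch, and it is not pycnophylactic (it does not preserve per-bin counts). The helper names, grid step, and window size are all invented for the illustration.

```python
import random
from bisect import bisect_right

def ecdf_on_grid(sample, grid):
    """Evaluate the ECDF staircase at each grid point."""
    xs = sorted(sample)
    n = len(xs)
    return [bisect_right(xs, g) / n for g in grid]

def moving_average(values, half_window):
    """Centered moving average; the window shrinks near the two ends."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def differentiate(values, step):
    """Central differences inside the grid, one-sided at the two ends."""
    n = len(values)
    d = [0.0] * n
    for i in range(1, n - 1):
        d[i] = (values[i + 1] - values[i - 1]) / (2 * step)
    d[0] = (values[1] - values[0]) / step
    d[-1] = (values[-1] - values[-2]) / step
    return d

random.seed(3)  # synthetic data, for illustration only
data = [random.gauss(0.0, 1.0) for _ in range(400)]

step = 0.05
lo = min(data) - 1.0  # pad the grid so the CDF is flat at both ends
npts = int((max(data) - min(data) + 2.0) / step) + 1
grid = [lo + k * step for k in range(npts)]

smooth_cdf = moving_average(ecdf_on_grid(data, grid), half_window=5)
pdf = differentiate(smooth_cdf, step)
print("area under the estimated pdf:", sum(pdf) * step)
```

Averaging a non-decreasing staircase keeps it non-decreasing, so the estimated density stays non-negative; a wider window gives a smoother but more blurred curve.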

Jeff
