MATLAB Answers


How do I determine the probability distribution of data?

Asked by Doug
on 29 Mar 2012

Hello, I have a data set and I am trying to determine its probability distribution. It is empirical data, and I have no idea what distribution family it belongs to, let alone what its parameters would be. Is there a MATLAB function that can do that?


2 Answers

Answer by Richard Willey
on 29 Mar 2012

Sorry if this sounds like a silly question:

Is there an absolute requirement that you describe your data using a parametric distribution? If so why?

As an alternative, would something like the following suffice?

%% Generate some data
X1 = 10 + 5 * randn(200, 1);
X2 = 20 + 8 * randn(250, 1);
X = [X1; X2];
%% Fit a distribution using a kernel smoother
myFit = fitdist(X, 'kernel')
%% Visualize the resulting fit
index = linspace(min(X), max(X), 1000);
plot(index, pdf(myFit, index))
%% Generate a set of 500 random numbers drawn from the distribution
numbers = random(myFit, 500, 1);
%% Inspect the complete set of methods for myFit
methods(myFit)


Hi Bobby

The probability distribution object provides methods for calculating the PDF, the CDF, and the like. If you look closely at the example code, you'll see that I am calculating the PDF of the kernel smoother to generate my plot.

on 13 Apr 2012

Hello Richard,

I am looking at your code. This is probably a silly question but what is it actually doing? Does it help you find the probability distribution of data?

I've commented in the reply below, but I don't know if you would get notified of it?

Thank you


Tom Lane
on 13 Apr 2012

His example produces a nonparametric density estimate that should be flexible enough to adapt to your data. It doesn't produce a named parametric distribution (normal, Weibull, etc.).
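If a named parametric distribution is what you need, one possible approach is to fit a few candidate families with fitdist and compare how well each one fits. The following is only a minimal sketch: the sample X is synthetic (an assumed gamma sample, not the data from this thread), and the three candidate families are an arbitrary choice. All three have two parameters, so comparing the negative log-likelihood directly is reasonable here.

```matlab
%% Hypothetical positive-valued sample standing in for the real data
rng(0);                                  % fixed seed so the run is repeatable
X = gamrnd(2, 3, 500, 1);                % assumption: shape 2, scale 3

%% Fit several named two-parameter families by maximum likelihood
candidates = {'gamma', 'weibull', 'lognormal'};
nll = zeros(size(candidates));
for k = 1:numel(candidates)
    pd = fitdist(X, candidates{k});
    nll(k) = negloglik(pd);              % smaller means a better fit
end

%% Report the candidates from best to worst
[~, order] = sort(nll);
for k = order
    fprintf('%-10s negative log-likelihood = %.2f\n', candidates{k}, nll(k));
end
```

A lower negative log-likelihood only says that one candidate fits better than the others, not that it fits well, so it is still worth plotting the fitted PDF over a histogram of the data.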

Answer by Doug
on 29 Mar 2012

Not sure. I need to find the distribution of the sum of n independent, identically distributed random variables. I haven't taken or used statistics in many years, so I tried to read up and found that the distribution of such a sum depends on the distribution family of the individual variables.

From my histogram, it "looks" like a gamma distribution. Is there a relatively straightforward way to verify that?

Thanks very much.
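One way to check the gamma guess might look like the sketch below: fit a gamma distribution, run a chi-square goodness-of-fit test, and compare quantiles visually. The sample X is synthetic and the value n = 10 is arbitrary; both are assumptions for illustration. The sketch also assumes a reasonably recent Statistics Toolbox release in which makedist is available and chi2gof and qqplot accept a fitted distribution object. The last step uses the standard result that the sum of n independent gamma variables with shape a and a common scale b is gamma with shape n*a and scale b.

```matlab
%% Hypothetical sample standing in for the measured data
rng(1);                                  % fixed seed so the run is repeatable
X = gamrnd(2, 3, 500, 1);                % assumption: shape 2, scale 3

%% Fit a gamma distribution and run a chi-square goodness-of-fit test
pdGamma = fitdist(X, 'gamma');
[h, p] = chi2gof(X, 'CDF', pdGamma);     % h = 0: gamma not rejected at 5%

%% Compare sample quantiles with the fitted gamma quantiles
qqplot(X, pdGamma)

%% If gamma holds, the sum of n i.i.d. copies is gamma with shape n*a
n = 10;                                  % assumption: number of summed terms
pdSum = makedist('Gamma', 'a', n * pdGamma.a, 'b', pdGamma.b);
```

Failing to reject the gamma hypothesis does not prove the data are gamma distributed; it only means this test found no strong evidence against it.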


The distribution is the histogram (normalized). That's what you actually got. Now, do you need to figure out what theoretically perfect "named" or "known" distribution (such as Poisson, Rayleigh, Normal, or any of the dozens of others) your actual distribution was generated from?

on 11 Apr 2012

Kye: it's empirical data generated by a solar panel.
Image analyst: yes, that is exactly what I am trying to do.


on 13 Apr 2012


I would also like to know if there is a way to determine the probability distribution of data. I would like to use a built-in function, but it requires the distribution name as well as other input parameters such as shape and scale. It's kind of hard to use the function if you don't have these inputs.

Thank you
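If you are willing to assume a distribution family, the shape and scale inputs can be estimated from the data and then passed to the built-in function. A minimal sketch, assuming a gamma family and using a synthetic sample in place of real data (the evaluation point 5 is arbitrary):

```matlab
%% Hypothetical sample standing in for the real data
rng(2);                                  % fixed seed so the run is repeatable
X = gamrnd(2, 3, 500, 1);                % assumption: shape 2, scale 3

%% Estimate shape and scale, then pass them to the built-in functions
phat = gamfit(X);                        % phat(1) = shape, phat(2) = scale
p = gamcdf(5, phat(1), phat(2));         % estimated P(X <= 5)
```

The same pattern should apply to other families that have a matching fitting function, for example wblfit with wblcdf.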
