How can I determine if my data follows a lognormal distribution?
1 vote
My x-data includes arrival time for cells and my y-data includes their velocities. How can I determine if this data set follows a lognormal distribution?
I've already tried QQ-plots and histograms, but am utterly lost on how to approach this.
Thanks.
Accepted Answer
Star Strider
on 22 Jun 2014
2 votes
If all your data are positive, that’s a good start. I’m not sure what you’re studying, but I always associate ‘arrival times’ with the Poisson distribution (which ‘looks’ a lot like the lognormal distribution). The velocities may well be lognormally distributed. I suggest using the histfit function for both. Another option is to use the chi2gof function to perform a chi-square goodness-of-fit test.
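A minimal sketch of both suggestions, assuming the Statistics Toolbox is installed. The vector v is synthetic stand-in data, not the poster's measurements. Because chi2gof tests against a normal distribution by default, the sketch applies it to log(v): v is lognormal exactly when log(v) is normal.

```matlab
% Hypothetical stand-in data; replace v with your own positive measurements.
rng(0);                                  % make the example reproducible
v = lognrnd(3.7, 1.0, 500, 1);           % synthetic "velocities" (assumption)

% Visual check: histogram with a fitted lognormal density overlaid.
figure
histfit(v, 30, 'lognormal')

% Chi-square goodness-of-fit test on the log-transformed data.
% By default chi2gof tests for a normal distribution with mean and
% variance estimated from the sample.
[h, p] = chi2gof(log(v));
fprintf('h = %d, p = %.3f\n', h, p);     % h = 0 means lognormality is not rejected at the 5% level
```

A result of h = 0 only says the test found no evidence against the lognormal model; it does not prove the data are lognormal.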
10 Comments
Veena
on 22 Jun 2014
Yes, all my data is positive. Using the histfit function that you suggested, it looks like both my x-data and my y-data have lognormal distributions (independently of each other). However, that does not mean that they have the same normal distribution, correct? How can I determine this?
Thank you for your help.
Star Strider
on 22 Jun 2014
Edited: Star Strider
on 22 Jun 2014
My pleasure!
There are a few ways of determining if they have the same distribution. (You have already determined they are not normally distributed, so I don’t know what you mean by ‘the same normal distribution’. I assume you mean ‘the same distribution’.)
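One way is to use fitdist to fit a distribution to each data set and then paramci to get the confidence intervals of the fitted parameters. If the respective parameter confidence intervals don’t overlap, they don’t share the same distribution parameters.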
Another way is to do the Chi-squared goodness-of-fit test on the distributions of the two data sets (arrival time and velocity). That would be my choice.
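The fitdist and paramci approach could look roughly like this. It is a sketch, assuming the Statistics Toolbox and using synthetic vectors t and v in place of the real arrival times and velocities. Non-overlapping intervals are a conservative sign that the parameters differ; overlapping intervals do not by themselves show that the parameters are equal.

```matlab
% Hypothetical stand-in data (column vectors); replace with your own.
rng(1);
t = lognrnd(6.9, 0.9, 400, 1);           % synthetic "arrival times" (assumption)
v = lognrnd(3.7, 1.0, 400, 1);           % synthetic "velocities" (assumption)

% Fit a lognormal distribution to each data set.
pdT = fitdist(t, 'Lognormal');
pdV = fitdist(v, 'Lognormal');

% 95% confidence intervals: rows are [lower; upper], columns are [mu, sigma].
ciT = paramci(pdT);
ciV = paramci(pdV);

% Two intervals overlap when each lower bound is below the other's upper bound.
overlap = ciT(1,:) <= ciV(2,:) & ciV(1,:) <= ciT(2,:);
fprintf('mu intervals overlap: %d, sigma intervals overlap: %d\n', overlap(1), overlap(2));
```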
Image Analyst
on 22 Jun 2014
Can't you just take the log of your x data and the log of your y data? Then if the means and standard deviations of the log of the data are "the same" or pretty close then they have pretty much the same distribution.
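A quick sketch of that comparison, using only base MATLAB and synthetic stand-in data (x and y here are assumptions, not the poster's measurements). The mean and standard deviation of the log-transformed data serve as estimates of the lognormal parameters mu and sigma.

```matlab
% Hypothetical stand-in data; replace x and y with your own positive data.
rng(2);
x = exp(6.9 + 0.9*randn(400, 1));        % synthetic "arrival times" (assumption)
y = exp(3.7 + 1.0*randn(400, 1));        % synthetic "velocities" (assumption)

% Log-transform, then summarise each data set.
lx = log(x);
ly = log(y);

fprintf('log(x): mean = %.3f, std = %.3f\n', mean(lx), std(lx));
fprintf('log(y): mean = %.3f, std = %.3f\n', mean(ly), std(ly));
```

Deciding whether two means are "pretty close" still needs some measure of their uncertainty, which is what the confidence intervals from paramci provide.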
Star Strider
on 22 Jun 2014
@Image Analyst: That’s the idea behind my suggestion to use fitdist and paramci. I would certainly agree with your approach if more statistically robust methods weren’t so easily available in the Statistics Toolbox.
Image Analyst
on 22 Jun 2014
Yes, it's always better to use fully tested, debugged, and validated methods if you have them. I've added the Statistics Toolbox to the products list above.
Star Strider
on 22 Jun 2014
The problem in log-transforming the data and then taking the mean and variance (and computing the standard errors of the estimates from them to use to determine if the two distributions are significantly different) is that it transforms the errors from additive to multiplicative. This skews the distribution of the errors.
The chi-squared test is easy enough to program, although the chi-squared distribution (to estimate the p-value) is somewhat more challenging.
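For illustration, here is one way a hand-coded chi-squared goodness-of-fit test for lognormality might look in base MATLAB. This is a sketch under stated assumptions, not the chi2gof implementation: the data are synthetic, the ten equal-probability bins are an arbitrary choice, and the p-value uses the identity that the upper tail of a chi-squared distribution with k degrees of freedom equals the upper regularised incomplete gamma function evaluated via gammainc(x/2, k/2, 'upper').

```matlab
% Hypothetical stand-in data; replace v with your own positive measurements.
rng(3);
v = exp(3.7 + 1.0*randn(500, 1));        % synthetic data (assumption)

% Work on the log scale, where a lognormal sample should look normal.
z     = log(v);
mu    = mean(z);
sigma = std(z);

% Equal-probability bin edges under the fitted normal, via the normal
% quantile function mu + sigma*sqrt(2)*erfinv(2*p - 1).
nb    = 10;                              % number of bins (a choice, not a rule)
probs = (1:nb-1) / nb;
edges = [-Inf, mu + sigma*sqrt(2)*erfinv(2*probs - 1), Inf];

O = histcounts(z, edges);                % observed counts per bin
E = numel(z) / nb * ones(1, nb);         % expected counts per bin

chi2 = sum((O - E).^2 ./ E);             % Pearson chi-squared statistic
dof  = nb - 1 - 2;                       % two parameters were estimated from the data
p    = gammainc(chi2/2, dof/2, 'upper'); % upper-tail chi-squared probability

fprintf('chi2 = %.3f, dof = %d, p = %.3f\n', chi2, dof, p);
```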
I overlooked adding the product tag. Thanks!
Veena
on 22 Jun 2014
Thank you both for your help!
Star Strider
on 22 Jun 2014
My (our) pleasure!
Veena
on 28 Jun 2014
I gave the fitdist and then paramci method a go.
For the arrival times (my x-data), fitdist gave me:
Lognormal distribution
mu = 6.92077 [6.8758, 6.96574]
sigma = 0.87264 [0.841985, 0.905628]
For the velocities (my y-data), fitdist gave me:
Lognormal distribution
mu = 3.66801 [3.61411, 3.7219]
sigma = 1.04586 [1.00912, 1.08539]
Since the parameter confidence intervals do not overlap, I can assume that my x-data and y-data do not share the same distribution parameters, like Star Strider said, correct?
*paramci gave me the same values as fitdist did in the square brackets, so I did not repeat them here.
Star Strider
on 28 Jun 2014
Correct!
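The overlap check on the intervals quoted above can be written out explicitly. The numbers are copied from the fitdist output in the previous comment; the helper name overlaps is just an illustrative choice.

```matlab
% Confidence intervals copied from the fitdist output above: [lower, upper].
muT  = [6.8758,   6.96574];              % arrival times, mu
muV  = [3.61411,  3.7219];               % velocities, mu
sigT = [0.841985, 0.905628];             % arrival times, sigma
sigV = [1.00912,  1.08539];              % velocities, sigma

% Two intervals overlap when each lower bound is below the other's upper bound.
overlaps = @(a, b) a(1) <= b(2) && b(1) <= a(2);

fprintf('mu intervals overlap:    %d\n', overlaps(muT, muV));    % prints 0
fprintf('sigma intervals overlap: %d\n', overlaps(sigT, sigV));  % prints 0
```

Neither pair of intervals overlaps, which matches the conclusion that the two data sets do not share the same lognormal parameters.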
More Answers (0)