I have two different data sets and fit Zipf's law parameters to each of them. How do I know if the parameter values are statistically different? Do I use a t-test, an F-test, or something else?

Data set 1: frequencies of terms present in a corpus.
Data set 2: frequencies of terms present in a subcorpus (of the corpus above)
y = c/(x^a)
y is the actual frequency of the terms
x is the term rank and ranges from 1 to N (N is the total number of terms)
I get c1 and a1 for data set 1 and c2 and a2 for data set 2 using regression.
How do I test whether c1 and c2 (or a1 and a2) are significantly different?

Accepted Answer

Star Strider on 30 Jun 2014
If you have already done the regression (ideally using nlinfit), you should have the confidence intervals (using nlparci) of the parameters for each dataset.
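As a rough illustration of that first step, here is a minimal sketch of fitting y = c/(x^a) with nlinfit and getting 95% confidence intervals from nlparci. The data are synthetic stand-ins, and the names zipf, x1, y1, b0, b1 and ci1 are made up for the example; substitute your own ranks and frequencies. It requires the Statistics Toolbox.

```matlab
% Synthetic stand-in data (hypothetical): replace x1, y1 with your own
% term ranks and term frequencies.
rng(0);                                  % reproducible noise
x1 = (1:200)';                           % term ranks
y1 = 1000 ./ x1.^1.1 .* exp(0.05*randn(size(x1)));  % Zipf-like frequencies

zipf = @(b, x) b(1) ./ x.^b(2);          % b(1) = c, b(2) = a
b0   = [max(y1); 1];                     % rough starting guess for [c; a]

% Nonlinear least-squares fit; CovB1 is the estimated covariance matrix
% of the parameter estimates.
[b1, R1, J1, CovB1] = nlinfit(x1, y1, zipf, b0);

% 95% confidence intervals; row 1 is c, row 2 is a.
ci1 = nlparci(b1, R1, 'covar', CovB1);

fprintf('c = %.4g  [%.4g, %.4g]\n', b1(1), ci1(1,1), ci1(1,2));
fprintf('a = %.4g  [%.4g, %.4g]\n', b1(2), ci1(2,1), ci1(2,2));
```

Repeating the same calls on the second data set gives b2 and ci2, and the rows of ci1 and ci2 can then be compared parameter by parameter.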
If the confidence intervals for the respective parameters do not overlap between the two regressions and do not include zero, the parameters are significantly different from zero and from each other, and you can probably stop there. If they do overlap, you will likely need to do a t-test on the difference between the two estimates (using the square roots of the diagonals of the covariance matrices as the standard errors) to calculate the t-statistics and determine the probability that they are different.
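One way that test could be sketched, assuming the two fits are independent (which is questionable when one data set is a subcorpus of the other, so treat the p-value as approximate): divide the difference between the two estimates by the square root of the sum of their squared standard errors and compare it with a t distribution. The data below are synthetic, the variable names are made up for the example, and the degrees of freedom are simply the two residual degrees of freedom added together, which is an assumption rather than the only possible choice.

```matlab
% Synthetic stand-in data (hypothetical) for two separate fits.
rng(1);
zipf = @(b, x) b(1) ./ x.^b(2);          % b(1) = c, b(2) = a
x1 = (1:200)';  y1 = 1000 ./ x1.^1.10 .* exp(0.05*randn(size(x1)));
x2 = (1:120)';  y2 =  400 ./ x2.^1.00 .* exp(0.05*randn(size(x2)));

[b1, ~, ~, CovB1] = nlinfit(x1, y1, zipf, [max(y1); 1]);
[b2, ~, ~, CovB2] = nlinfit(x2, y2, zipf, [max(y2); 1]);

se1 = sqrt(diag(CovB1));                 % standard errors of [c1; a1]
se2 = sqrt(diag(CovB2));                 % standard errors of [c2; a2]

k = 2;                                   % 1 compares c, 2 compares a

% t-statistic for the difference between the two estimates,
% treating the two fits as independent (assumption).
t = (b1(k) - b2(k)) / sqrt(se1(k)^2 + se2(k)^2);

% Degrees of freedom: sum of residual degrees of freedom (assumption).
dof = (numel(x1) - numel(b1)) + (numel(x2) - numel(b2));

p = 2 * tcdf(-abs(t), dof);              % two-sided p-value
fprintf('t = %.3f, dof = %d, p = %.3g\n', t, dof, p);
```

A small p suggests the two estimates differ by more than their standard errors would explain; setting k = 1 runs the same comparison for c1 and c2.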
That would be my approach. There may be others.