Hi!
Thanks very much for your suggestions.
Do you mean that when I use the parametric bootstrap, I can test for any
type of distribution?
Actually, I have some measurement data, and I do not know which
distribution describes it best.
So I need to test several distributions (e.g. Lognormal, Rayleigh, Gamma
and Weibull), and I would like to use some kind of hypothesis test, such
as the KS test, to determine which distribution gives the best fit (or
the highest passing rate).
So, this is what I do:
% (1) Gamma distribution
gam_para = gamfit(data);
p_gam = kstest(data,[data gamcdf(data,gam_para(1),gam_para(2))]);
% (2) Lognormal distribution
logn_para = lognfit(data);
p_logn = kstest(data,[data logncdf(data,logn_para(1),logn_para(2))]);
% (3) Rayleigh distribution
rayl_para = raylfit(data);
p_rayl = kstest(data,[data raylcdf(data,rayl_para)]);
% (4) Weibull distribution
wbl_para = wblfit(data);
p_wbl = kstest(data,[data wblcdf(data,wbl_para(1),wbl_para(2))]);
However, I notice that the "p" values from all the distributions come out
as 1, i.e. the distribution is rejected, even though when I plot the
empirical CDF of the data together with each estimated CDF (using the
fitted parameters), they look pretty close.
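For reference, this is roughly how I make that comparison plot (just a
sketch, assuming cdfplot from the Statistics Toolbox and the fitted
parameters from above):

% overlay the empirical CDF and the four fitted CDFs
x = sort(data);
cdfplot(data); hold on                               % empirical CDF
plot(x, gamcdf(x, gam_para(1), gam_para(2)), 'r')    % Gamma fit
plot(x, logncdf(x, logn_para(1), logn_para(2)), 'g') % Lognormal fit
plot(x, raylcdf(x, rayl_para), 'm')                  % Rayleigh fit
plot(x, wblcdf(x, wbl_para(1), wbl_para(2)), 'k')    % Weibull fit
legend('empirical', 'Gamma', 'Lognormal', 'Rayleigh', 'Weibull')
hold off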
Could it be that the technique I am using is wrong?
Thanks.
Linda
"tax" <onemorenoisy@day.today> wrote in message
news:eee08f4.0@webx.raydaftYaTP...
> > chi2 and gamma distribution pass the KS
> > test. what should I do?
> > Linda
>
> First quick solution: look at the p-values, not just at the 0/1 result
> (choose the model that passed the test with the highest p-value)
>
> Pretty often, however, you have to estimate the parameters of the
> distribution family under the null. In this case the usual KS test is
> not reliable and you might wanna look at the following thread:
>
> "KS test statistical test"
> Date: 2003-06-11 11:32:55
>
> Now if, for example, you wanna test against a normal with parameters
> estimated from the data, then the (parametric) bootstrap procedure
> described in the paper cited there goes, more or less, as follows
> (assuming you have the stat TBX):
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % X contains the data
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % perform the usual KS test
> % based on the estimates.
> % We need the ksObs
>
> [foo1, foo2, ksObs] = kstest(X, [X normcdf(X, muH, sigmaH)], alpha);
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % Parametric bootstrap
>
> ksBoot = zeros(nBoot,1);
> for k=1:nBoot,
> % sample from the
> % null distribution
> Xboot = normrnd(muH, sigmaH, [nSample,1]);
>
> % calculate the estimates
> [muHboot, sigmaHboot] = normfit(Xboot);
>
> % KStest on the sample
> [foo1, foo2, ksBoot(k)] = kstest(Xboot, ...
>     [Xboot normcdf(Xboot, muHboot, sigmaHboot)]);
> end
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % calculate the p-value
>
> % smooth out the density
> % of the KS statistics
> ksX = sort([linspace(0,max(ksBoot)+1,250), ksObs]);
> idx = find(ksX == ksObs);
> ksSmooth = ksdensity(ksBoot, ksX, 'support', 'positive');
>
> % approximate the tail
> Pvalue = trapz(ksX(idx:end), ksSmooth(idx:end));
>
> % actual decision rule
> if Pvalue < alpha,
> Reject = 1;
> else
> Reject = 0;
> end
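>
> To adapt this to another family, only the fit/sample/cdf calls change;
> e.g. for the gamma it would go roughly like this (a sketch, using
> gamfit/gamrnd/gamcdf from the stat TBX):
>
> % fit the gamma, get the observed KS statistic
> gam_para = gamfit(X);
> [foo1, foo2, ksObs] = kstest(X, ...
>     [X gamcdf(X, gam_para(1), gam_para(2))]);
>
> ksBoot = zeros(nBoot,1);
> for k=1:nBoot,
>     % sample from the fitted gamma null
>     Xboot = gamrnd(gam_para(1), gam_para(2), [nSample,1]);
>     % re-estimate on the bootstrap sample
>     paraBoot = gamfit(Xboot);
>     % KS statistic on the sample
>     [foo1, foo2, ksBoot(k)] = kstest(Xboot, ...
>         [Xboot gamcdf(Xboot, paraBoot(1), paraBoot(2))]);
> end
> % then compute the p-value from ksBoot exactly as above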
>
> HTH
