I have a rather fundamental question regarding the analysis of my data involving nonlinear fitting and I hope it is appropriate to post it here. For the sake of brevity I will not provide the whole code and only summarize the essential steps, but of course I can add any details you request.
I have some data which represents some response to a stimulus as a function of the distance to the stimulation site. The data shows, as to be expected, a decay in the response variable, which may be best approximated by a sigmoidal fit. So I applied the BOLTZMANN equation to the data and let MATLAB predict confidence bounds for new observations:
% Define model function (BOLTZMANN); f = @(beta0,conds)beta0(1) + ((beta0(2)-beta0(1)) ./ (1+exp((beta0(3) - conds) ./ beta0(4))));
% Find initialization parameters: resp50 = (max(resp) + min(resp))/2; x50 = 5000; %Educated guess
inidat = [0,max(resp),resp50,x50];
% Estimate the fitted function: [beta,res,jac,covb] = nlinfit(conds',fliplr(resp),f,inidat);
% Fit the function: xfit = linspace(min(conds),max(conds),100); [yfit,delta,n,df,varpred] = nlpredci(f,xfit,beta,res,'Covar',covb,'PredOpt','observation'); %Function edited, see below yfit = fliplr(yfit); delta = fliplr(delta'); varpred = fliplr(varpred');
Behold the plotted result following this link: http://imageshack.us/a/img18/114/stimdistwithoxrngalt1.png (Embedding this image did not work.)
I am now adressing the question, how far I can get off the reference site until responses are to be regarded non-maximum. I.e. up from which distance are my (predicted) responses signicantly different to the maximum a 0 mm? I did not find a pre-described solution to such a question, so I developed a little bit naively my own approach, and I would like to ask you to tell me if it is appropriate or if there is some superior method.
My idea was simply to run multiple pairwise t-tests given the statistics from the NLINFIT function (which I edited as to return sample size n, degrees of freedom v, and predicted variance varpred, so I would not have to do the calculations on my own). Thus, I iterate through the predictions unless the tested pair is significantly different:
alpha = 0.05;
for i=2:length(yfit) testdiff = yfit(1) - yfit(i);
%Common MSE is mean of both estimated variances (s. ONLINESTATBOOK p.376): mse = (varpred(1) + varpred(i))/2;
%Common SE: testse = sqrt(2*mse/n); %Correct?
%Compute t-value: t = testdiff/testse;
%Common df: testdf = (n-1) + (n-1); %Correct?
p = tpdf(t,testdf);
if p < (alpha/(i-1)) %With BONFERRONI correction (correct?) m = i; break; end end
As you can see, I also tried to add some BONFERRONI correction of the alpha-level to account for these multiple comparisions. I am aware, that the t-test I used may be inappropriate for correlated pairs (which is evidently the case).
According to my rule of thumb, I would expect a cut-off x-value somewhere were the confidence intervals of the fit do not intersect anymore. Surprisingly, I obtain a way earlier cut-off as you can see in the picture above.
Could you suggest any corrections or improvements to adress this question?
Best regards, Rouven
I suggest doing paired t-tests ( ttest2 ) between your reference (at 0 mm) and data taken from stimuli at various distances, for instance 0 mm and 1 mm, 0 mm and 2 mm, etc. (My guess is that you would see significance at 10 to 15 mm, depending on whether your error bars represent standard errors or 95% confidence limits.) This is relatively common in the literature I am familiar with, and generally does not require the Bonferroni correction, because you are not also comparing 2 mm and 3 mm and others. (I also suggest consulting the statistics guidance for the journal you plan to submit your data to.)
The Boltzmann equation is interesting, but you might consider using a model more appropriate to the system you are measuring (unless you're doing the sort of physics the Boltzmann equation describes). I suspect a reviewer would want to know the reason you chose it and how it describes your experiment. You have not described the system you are investigating, but using a regression model fit may be redundant if you are only interested in the differences between the result of stimulation at various distances from the reference site.
I suggest that a regression model is most appropriate to a discussion of the possible mechanisms explaining the results you are getting. The model parameters can help explain the details of the system you are investigating.
The statistical analyses you use always depend on how you designed your experiment.