|
> Thanks for your reply. There exists a "standard" statistical model for
> this type of fit. It is called generalized linear models. This kind of
> statistical models can deal with a great variety of different problems and
> parameters. Our particular problem has a non-zero parameter (wich you
> correctly discribe as a baseline probability). In the biostatistics field
> (dose-response analysis) is called control mortality or mortality in the
> control group. I know that programs such SPSS or SAS can handle this type
> of problems directly. I would like to be able to do it in MATLAB, but I
> don't know how.
Joe (and Bego), the glmfit function does generalized linear models. For a
binomial distribution it has several built-in "link" functions that describe
how the fitted probability depends on a linear combination of the
predictors. These are the logit, probit, log-log, and complementary log-log
functions. All have the property that they produce fitted probabilities
throughout the range from 0 to 1.
You want a probability ranging upward from some lower limit L to 1.
Fortunately, there's a capability to provide your own link function. You
would write your own functions to compute the link, its derivative, and its
inverse, and invoke glmfit something like this:
b = glmfit(log(dose),y,'binomial','link',{@mylink @mydlink @myilink});
If you want to estimate the lower limit, you can try a range of lower limit
values and pick the one that works best. A problem is that, internally,
glmfit produces some starting values that may not satisfy your lower limit
requirement (because glmfit doesn't know about that lower limit). Below is
some code using an idea suggested by Peter Perkins for dealing with that
issue. Sorry it's so long.
Thanks for this question. You've given us some ideas for improving glmfit
in the future: allowing specification of starting values, supporting link
functions with lower or upper bounds, and maybe estimating link functions
with parameters in them.
I should add that this is going to enable you to compute standard errors for
the coefficients given that the L value is fixed at the value you choose.
That's similar to what's often done in Box-Cox models, where the estimation
is done conditional on the Box-Cox parameter. That may not be what you
want -- you may prefer to measure the uncertainty in L and take that into
account when measuring the uncertainty on the coefficients. I don't have a
way to do the latter right now.
-- Tom
-----------------
function mortality2
% Generate random data
rand('twister',10);
dose = sort(floor(exprnd(10,200,1)));
ilogit = @(L) 1./(1+exp(-L));
p = 0.15 + 0.85 * ilogit(-6 + 2*log(dose));
y = binornd(1,p);
ax1 = subplot(2,1,1);
plot(dose,p,'b-', dose,y,'ro')
% Nested functions defining the link
function eta = mylink(mu) % link function
if init % adjust starting values first time through
mu = min(max(mu,L+(1-L)*.5/2),1-(1-L)*.5/2);
end
eta = log((mu-L) ./ (1-mu));
end
function deta = mydlink(mu) % derivative of link function
if init % adjust starting values first time through
mu = min(max(mu,L+(1-L)*.5/2),1-(1-L)*.5/2);
init = false;
end
deta = (1-L) ./ ((mu-L) .* (1-mu));
end
function mu = myilink(eta,lowerBnd,upperBnd) % inverse of link function
if nargin<2
lowerBnd = log(eps);
upperBnd = -lowerBnd;
end
% keep mu = ilink(eta) in (approx) [L+eps, (1-eps)];
mu = L + (1-L) ./ (1 + exp(-min(max(eta,lowerBnd),upperBnd)));
end
% Negative log likelihood
nll = @(y,p) -(sum(log(p(y==1))) + sum(log(1-p(y==0))));
% Fit using a range of lower limit values using points with dose>0, but
% compute the negative log likelihood for all points
Lvec = (0:30)/100;
bmat = zeros(2,length(Lvec));
nlogl = zeros(size(Lvec));
t = dose>0;
for j=1:length(Lvec)
init = true;
L = Lvec(j);
b = glmfit(log(dose(t)),y(t),'binomial','link',{@mylink @mydlink
@myilink});
nlogl(j) = nll(y,myilink(b(1)+b(2)*log(dose)));
bmat(:,j) = b;
end
% Show the best one
[bestn,bestloc] = min(nlogl);
b = bmat(:,bestloc);
L = Lvec(bestloc);
axes(ax1);
line(dose,myilink(b(1)+b(2)*log(dose),-Inf,Inf),'color','g');
legend('True','Data','Fit','location','SE')
ax2 = subplot(2,1,2);
plot(Lvec,nlogl)
line(Lvec(bestloc),bestn,'marker','o','color','g');
legend('Neg log likelihood','Best','location','SE')
end
|