Binomial Data with 0's in glmfit
I am using glmfit to perform logistic regression on a set of data. If I run the code
b = glmfit(X,[y n],'binomial','link','logit')
where y contains a few 0 values, how does MATLAB handle the 0's? Are they set to very small values? Is it necessary or beneficial to use the empirical log-odds
log((y+0.5)./(n-y+0.5))
to determine y_new and n_new (where y_new does not contain 0's)?
Thank you.
Answers (1)
the cyclist
on 15 Sep 2015
In a logistic regression, the response variable (y) is typically a binary variable (and can be represented as 0's and 1's).
Why do you think 0's would be a problem?
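One way to see why zero counts are harmless is to look at the binomial log-likelihood directly. The sketch below uses made-up grouped data and made-up coefficients (x, n, y and b are all hypothetical, and b is not a fitted result). It only shows that each group's contribution stays finite when y is 0 or y equals n, because the model probability p stays strictly between 0 and 1.

```matlab
% Hypothetical grouped data: y successes out of n trials at each x.
x = [1; 2; 3; 4; 5];
n = [10; 10; 10; 10; 10];
y = [0; 0; 3; 7; 10];          % includes zero counts and an all-success group

% Illustrative (not fitted) logistic coefficients: intercept and slope.
b = [-6; 2];
p = 1 ./ (1 + exp(-(b(1) + b(2)*x)));   % model probabilities, strictly inside (0,1)

% Binomial log-likelihood per group, without the constant nchoosek term.
% When y == 0 the first term vanishes and the contribution is n*log(1-p).
ll = y .* log(p) + (n - y) .* log(1 - p);

disp([y n p ll])
```

The log of the observed proportion y./n is never taken here; only log(p) and log(1-p) appear, so no adjustment of the data is needed for the likelihood to be defined.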
2 Comments
the cyclist
on 15 Sep 2015
In the logistic model, the fitted probability never equals exactly 1 at any finite X; it only approaches 1 as X approaches infinity (assuming a positive slope).
It's true that if for some particular value of X, you happen to see all "successes" (say, 15 out of 15 successes when X = 300), then the code is going to make a starting estimate of the probability (at that value of X) to be just a bit smaller than 1, while it tries to find the best fit across all values of X.
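As a rough sketch of that situation, the example below fits made-up data that contains both a zero count and a 15-out-of-15 group at X = 300. The data values are hypothetical, and glmfit and glmval require the Statistics and Machine Learning Toolbox. The empirical log-odds from the question are computed only for comparison; they are not passed to glmfit.

```matlab
% Hypothetical dose-response style data with 15 trials per X value.
X = [100; 150; 200; 250; 300];
n = [15; 15; 15; 15; 15];
y = [0; 2; 7; 13; 15];          % 0 successes at X = 100, 15 of 15 at X = 300

% Fit the raw counts directly; no adjustment of y or n.
b = glmfit(X, [y n], 'binomial', 'link', 'logit');

% Fitted probabilities at the observed X values.
phat = glmval(b, X, 'logit');

% Empirical log-odds from the question, for comparison with the fitted line.
elogit = log((y + 0.5) ./ (n - y + 0.5));
fitted_logit = b(1) + b(2)*X;

disp([X y n phat elogit fitted_logit])
```

With data like these, where the middle groups have mixed outcomes, the fitted probabilities at X = 100 and X = 300 should come out close to, but not equal to, 0 and 1.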
I think the only time you will have a problem fitting is if you see only successes at all values of X. But, then you don't really need a model, do you? :-)
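To illustrate why the all-successes case has no sensible fit, here is a small sketch with hypothetical counts and an intercept-only model. The log-likelihood keeps increasing as the intercept grows, so there is no finite maximum for an optimizer to converge to.

```matlab
% Hypothetical data: every trial is a success at every X value.
n = [15; 15; 15];
y = n;

% Intercept-only logistic model: p = 1/(1 + exp(-c)) for every group.
% With y == n the log-likelihood reduces to sum(n)*log(p) = -sum(n)*log1p(exp(-c)).
c  = [0 2 5 10 20];                 % trial intercept values
ll = -sum(n) .* log1p(exp(-c));     % log-likelihood at each trial intercept

disp([c; ll])
```

The values in ll rise toward 0 without ever reaching it, which is the likelihood's way of saying the best estimate of the intercept is infinitely large.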