Accounting for groups in GzLM Poisson regression
I am currently trying to perform a regression on data with one predictor and one covariate.
The hypothesis I am trying to test is whether the incidence of pores (a pore is a hole through a cell) increases when I stretch the cells. I have three cell lines and three stretch levels (0, 10 and 20%). At each stretch level and for each cell line I have 4 specimens (in two cases only 3).
Because I am sampling the occurrence of rare events (typical counts between 0 and 20) over an area, we are looking at a Poisson process. This is the reason why I assume the data is Poisson-distributed and not normally distributed.
x1 = 1x34 vector containing the strain level for each specimen
x2 = 1x34 vector containing the cell line for each specimen
y = 1x34 vector containing the pore counts for each specimen
There are 3x3x4-2 = 34 specimens: cell line 1 is missing one specimen at 10%, and cell line 2 is missing one specimen at 20%.
I am doing my statistical analysis in MATLAB. Before I came to the conclusion that my data is Poisson distributed, I used ANCOVA with the aoctool function, fitting separate lines through each cell line. The aoctool then calculates an F-statistic and a corresponding p-value.
[h,atab,ctab,astats]=aoctool2(x1,yT,x2,0.05,'Strain','Pore Count','Cell Line','off','separate lines');
The aoctool fits the data to the following model:
y = (α + αi) + (β + βi)x + ε
where y is the dependent variable, α the general intercept, αi the cell-line-specific intercept, β the general slope, βi the cell-line-specific slope and ε the error term.
Fitting the data to this model regresses the pore count with the stretch-level whilst accounting for possible differences between cell lines (groups).
Now I want to do the same with the fitglm (or glmfit) function in MATLAB, but using a Poisson rather than a normal distribution. However, I have not been able to get fitglm to fit separate lines for each cell line. Instead, it treats the cell line as a covariate.
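For reference, and assuming fitglm's default log link for the Poisson distribution, the Poisson counterpart of the separate-lines model above would put the same linear predictor on the log of the expected count:

```latex
\log \mathbb{E}[y] = (\alpha + \alpha_i) + (\beta + \beta_i)\,x,
\qquad y \sim \mathrm{Poisson}\big(\mathbb{E}[y]\big)
```

There is no additive error term ε in this form; the scatter comes from the Poisson distribution itself.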
My question is twofold:
- Is there a difference between using cell line as a covariate or as grouping variable?
- How do I adjust the fitglm-input to take into account cell line as a grouping variable?
This is what I have tried so far:
tbl=table(yT,x1,x2); % enter my data into a 'table' format for the fitglm function
tbl.x2=nominal(tbl.x2); % x2, cell line, is a nominal variable with values 1, 2 or 3
model=fitglm(tbl,'yT ~ (x1+x2)','Distribution','Poisson')
This results in:
Estimated Coefficients:
                   Estimate        SE         tStat       pValue
                   ________    _________    _______    __________
    (Intercept)      2.5086     0.097237     25.798    9.234e-147
    x1             0.024509    0.0024828     9.8714    5.5397e-23
    x2_2           -0.47372      0.10328    -4.5869     4.499e-06
    x2_3           -0.65656      0.10444    -6.2862    3.2532e-10
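A possible direction, sketched under assumptions rather than confirmed by anyone in this thread: in Wilkinson notation, 'yT ~ x1 + x2' gives each cell line its own intercept but one shared slope (parallel lines on the log scale), whereas 'yT ~ x1*x2' expands to x1 + x2 + x1:x2 and adds a per-cell-line slope, which looks like the analogue of aoctool's 'separate lines'. The sketch below uses synthetic stand-in data (not my measurements); the b0 and b1 values are made up purely for illustration, and categorical is used in place of nominal.

```matlab
% Synthetic stand-in data (NOT the real pore counts):
% 3 cell lines x 3 strain levels x 4 specimens = 36 rows.
rng(0);                                  % reproducible random numbers
x1 = repmat([0 10 20], 1, 12)';          % strain level (%) per specimen
x2 = repelem([1 2 3], 12)';              % cell line label per specimen

% Hypothetical per-cell-line intercepts and slopes on the log scale.
b0 = [2.5; 2.0; 1.8];
b1 = [0.02; 0.03; 0.01];
mu = exp(b0(x2) + b1(x2) .* x1);         % expected count per specimen
yT = poissrnd(mu);                       % simulated Poisson counts

tbl = table(yT, x1, x2);
tbl.x2 = categorical(tbl.x2);            % treat cell line as a grouping factor

% Parallel lines: separate intercepts, one common slope.
mdlAdd = fitglm(tbl, 'yT ~ x1 + x2', 'Distribution', 'poisson');

% Separate lines: the x1:x2 interaction adds a slope offset per cell line.
mdlInt = fitglm(tbl, 'yT ~ x1*x2', 'Distribution', 'poisson');

% Likelihood-ratio (deviance) test of the two nested models:
% do the per-cell-line slopes improve the fit?
dDev = mdlAdd.Deviance - mdlInt.Deviance;
dDF  = mdlAdd.DFE - mdlInt.DFE;          % number of extra slope parameters
p    = chi2cdf(dDev, dDF, 'upper');

disp(mdlInt)
fprintf('Deviance difference = %.3f on %d df, p = %.4g\n', dDev, dDF, p);
```

With three cell lines the interaction model has six coefficients instead of four; the two extra ones are the slope offsets of cell lines 2 and 3 relative to cell line 1, the reference level. A small p-value in the deviance test would suggest the slopes differ between cell lines.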
Thank you so much for your help!