Accounting for groups in GzLM Poisson regression

I am currently trying to perform a regression on data with one predictor and one covariate.
The hypothesis I am trying to test is whether the incidence of pores (holes through a cell) increases when I stretch the cells. I have three cell lines and three stretch levels (0, 10 and 20%). At each stretch level and for each cell line I have 4 specimens (and sometimes 3).
Because I am sampling the occurrence of rare events (typical counts between 0 and 20) over an area, we are looking at a Poisson process. This is the reason why I assume the data is Poisson-distributed and not normally distributed.
x1 = 1x34 vector containing the strain level for each specimen
x2 = 1x34 vector containing the cell line for each specimen
y = 1x34 vector containing the pore counts for each specimen
There are 3x3x4-2=34 specimens: cell line 1 is missing one specimen at 10%, cell line 2 is missing one specimen at 20%.
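To make the layout concrete, here is a minimal sketch of how design vectors of this shape could be built. It only reconstructs the design described above (strain level and cell line per specimen, with the two missing specimens dropped). The helper names strainLevels, nRep, rep, missing, sIdx and counts are hypothetical, and no pore counts are generated.

```matlab
% Hypothetical reconstruction of the design vectors; x1 and x2 follow the post,
% all other names are illustrative.
strainLevels = [0 10 20];   % stretch levels in percent
nLines = 3;                 % number of cell lines
nRep   = 4;                 % nominal number of specimens per cell line and level

% Full 3 x 3 x 4 design: one entry per specimen.
[rep, strainIdx, lineIdx] = ndgrid(1:nRep, 1:numel(strainLevels), 1:nLines);
x1  = strainLevels(strainIdx(:)');   % strain level for each specimen (row vector)
x2  = lineIdx(:)';                   % cell line for each specimen (row vector)
rep = rep(:)';                       % replicate number within each cell

% Drop one specimen for cell line 1 at 10 % and one for cell line 2 at 20 %.
missing = (rep == nRep) & ((x2 == 1 & x1 == 10) | (x2 == 2 & x1 == 20));
x1(missing) = [];
x2(missing) = [];

% Specimens per cell line (rows) and strain level (columns).
[~, sIdx] = ismember(x1, strainLevels);
counts = accumarray([x2' sIdx'], 1);
disp(counts)
```

The cross-tabulation in counts is a quick way to check that the unbalanced cells are where they are expected to be before any model is fitted.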
I am doing my statistical analysis in MATLAB. Before I came to the conclusion that my data is Poisson distributed, I used ANCOVA with the aoctool function, fitting separate lines through each cell line. The aoctool then calculates an F-statistic and a corresponding p-value.
[h,atab,ctab,astats]=aoctool(x1,yT,x2,0.05,'Strain','Pore Count','Cell Line','off','separate lines');
The aoctool fits the data to the following model: y = (α + α_i) + (β + β_i)x + ε, where y is the dependent variable, α the general intercept, α_i the cell-line-specific intercept, β the general slope, β_i the cell-line-specific slope and ε the error term.
Fitting the data to this model regresses pore count on stretch level whilst accounting for possible differences between cell lines (groups).
Now I want to do the same with the fitglm (or glmfit) function in MATLAB, but using a Poisson rather than a normal distribution. However, I have not been able to get the fitglm function to fit separate lines for each cell line. Rather, it treats the cell line as a covariate.
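For reference, the Poisson analogue of the separate-lines model can be written as follows. This is a sketch that assumes the log link, which is the default link for a Poisson fit in fitglm. The symbols are the same as in the ANCOVA model above, except that μ (the expected pore count) is introduced here and the additive error term is replaced by Poisson variation around μ.

```latex
y \sim \mathrm{Poisson}(\mu), \qquad
\log \mu = (\alpha + \alpha_i) + (\beta + \beta_i)\,x
\quad\Longleftrightarrow\quad
\mu = \exp(\alpha + \alpha_i)\,\exp\bigl((\beta + \beta_i)\,x\bigr)
```

Under this form each cell line has its own intercept and slope on the log scale, so stretch acts multiplicatively on the expected count rather than additively.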
My question is twofold:
  1. Is there a difference between using cell line as a covariate and using it as a grouping variable?
  2. How do I adjust the fitglm-input to take into account cell line as a grouping variable?
This is what I have tried so far:
tbl=table(yT,x1,x2); % enter my data into a 'table' format for the fitglm function
tbl.x2=nominal(tbl.x2); % x2, cell line, is a nominal variable with values 1, 2 or 3
model=fitglm(tbl,'yT ~ (x1+x2)','Distribution','Poisson')
This results in:
Estimated Coefficients:
                   Estimate       SE          tStat       pValue
                   ________    _________     _______    __________
    (Intercept)      2.5086     0.097237      25.798    9.234e-147
    x1             0.024509    0.0024828      9.8714    5.5397e-23
    x2_2           -0.47372      0.10328     -4.5869     4.499e-06
    x2_3           -0.65656      0.10444     -6.2862    3.2532e-10
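The formula 'yT ~ (x1+x2)' gives each cell line its own intercept but a single shared slope. One possible way to request separate lines, offered as a sketch and not as a confirmed answer, is to add the interaction term with the Wilkinson notation x1*x2, which expands to x1 + x2 + x1:x2. The example below runs on synthetic placeholder data, not the measured pore counts. The rate parameters b0 and b1 are arbitrary assumptions, and the names mdlAdd, mdlSep, dDev, dDF and pLR are hypothetical.

```matlab
rng(1);  % fixed seed so the synthetic example is repeatable

% Synthetic placeholder data (NOT the measured pore counts):
% 3 cell lines x 3 strain levels x 4 specimens, balanced for simplicity.
[~, s, c] = ndgrid(1:4, [0 10 20], 1:3);
x1 = s(:);                  % strain level per specimen
lineIdx = c(:);             % cell line index per specimen (1, 2 or 3)
x2 = categorical(lineIdx);  % cell line as a categorical grouping variable

% Arbitrary illustrative parameters: per-line intercepts and slopes (log scale).
b0 = [2.5; 2.0; 1.8];
b1 = [0.02; 0.03; 0.01];
mu = exp(b0(lineIdx) + b1(lineIdx) .* x1);
yT = poissrnd(mu);          % simulated pore counts

tbl = table(yT, x1, x2);

% Additive model: separate intercepts per cell line, one common slope.
mdlAdd = fitglm(tbl, 'yT ~ x1 + x2', 'Distribution', 'poisson');

% Separate-lines model: separate intercepts AND separate slopes per cell line.
mdlSep = fitglm(tbl, 'yT ~ x1*x2', 'Distribution', 'poisson');

% Deviance (likelihood-ratio) comparison of the two nested models.
dDev = mdlAdd.Deviance - mdlSep.Deviance;
dDF  = mdlSep.NumEstimatedCoefficients - mdlAdd.NumEstimatedCoefficients;
pLR  = 1 - chi2cdf(dDev, dDF);

disp(mdlSep.Coefficients)
fprintf('Deviance drop = %.3f on %d df, p = %.4f\n', dDev, dDF, pLR);
```

In the separate-lines fit, the slope for the reference cell line is the x1 coefficient, and the slope for each other line is the x1 coefficient plus that line's interaction coefficient. The deviance comparison indicates whether the line-specific slopes improve the fit over a single common slope.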
Thank you so much for your help!
I also posted this question here: StackExchange

Answers (0)
