# Poisson regression using genetic algorithm

dmr on 13 Aug 2020
Commented: Gifari Zulkarnaen on 19 Aug 2020
Hi,
I've seen a few examples in the community on how to use a genetic algorithm to optimize regression models, but I was wondering if I can use a genetic algorithm to optimize a Poisson regression model (especially since I don't think PR uses MSE to estimate the parameters). I have four independent variables and I've generated the parameters using the maximum likelihood method, but I don't know how to apply this to a genetic algorithm. Maybe some ideas on what I should use as an objective function and how to initialize the chromosomes? Thanks in advance.

Gifari Zulkarnaen on 13 Aug 2020
Edited: Gifari Zulkarnaen on 13 Aug 2020
I don't really understand statistics and PR, but I think you can define the objective function as a function of the PR parameters that yields the least regression error. And using the MATLAB toolbox, you don't need to code the GA yourself. Can you give us your attempt at an objective function script?
It's most likely in this form:
global X Y
X = rand(4,3); % x data: 4 input variables, 3 samples (one sample per column)
Y = rand(1,3); % y data: 1 output, 3 samples
[teta,err] = ga(@obj_func,4); % 4 decision variables: [b0 b1 b2 b3]
function err = obj_func(teta)
global X Y
PR = exp(teta*X); % PR prediction (1 x 3)
err = sum((Y - PR).^2); % sum of squared errors (sumsqr needs an extra toolbox)
end

dmr on 13 Aug 2020
Thank you for the response!
So this is the regression model:
y=data(:,1); %y is the response variable
x1=data(:,2); %x1, x2, x3 are the independent variables
x2=data(:,3);
x3=data(:,4);
y=exp(b0+b1*x1+b2*x2+b3*x3); %this is the regression model
But I guess Poisson regression doesn't have a function similar to least squares, since the parameters are estimated numerically with something like maximum likelihood (which is what I'm using), and the error is measured by subtracting successive parameter estimates, like this:
b=[b0;b1;b2;b3]; %the first estimate of the parameters is known
while error>10^-3
bnew=b; %keep the current estimate
%some functions to get the new b
b=%some functions
error=norm(bnew-b); %evaluate the change between iterations
end
The condition is that if the change is small enough, the iteration stops and we get the parameters. That's why I'm a bit confused about what I should use as a fitness function.
And also, since I have the data for both my xs and ys, which one should I randomize to initialize the chromosomes?
Edit:
Okay, so I already initialized the independent variables as binary individuals using this:
new_chrom=randi([0 1],1000,4);
since I have 4 independent variables. But what should I do next?
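A side note on the binary population above: MATLAB's built-in ga solver works on real-valued chromosomes (one gene per coefficient b0..b3), so a 0/1 matrix is not needed. If you still want to supply your own starting population, a minimal sketch (assuming, as an illustration, bounds of roughly ±10 on each coefficient):

```matlab
% Hypothetical real-coded initial population: 1000 individuals, 4 genes each,
% drawn uniformly from [-10, 10] (the bounds here are an assumption)
init_pop = -10 + 20*rand(1000,4);
opts = optimoptions('ga','InitialPopulationMatrix',init_pop);
% obj_func is the fitness function discussed in this thread
teta = ga(@obj_func,4,[],[],[],[],[],[],[],opts);
```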
Gifari Zulkarnaen on 19 Aug 2020
Sorry, I forgot to check this.
First, do you want to code the GA yourself or just use the MATLAB toolbox? If you use the toolbox, you don't need to initialize the variables yourself. See the ga syntax.
Second, I don't see how subtracting successive parameter updates can be an error measurement. According to Wikipedia, the log-likelihood of PR (dropping the log(y!) term, which is constant in teta) is:
L = sum(y.*(teta*x) - exp(teta*x));
But MATLAB's ga minimizes, so the objective function should be the negative log-likelihood:
f = -L;
So the script would be:
global x y_data % make the data global variables
m = size(data,1); % number of samples
x = [ones(m,1) data(:,2:4)]'; % to simplify the coding, input data becomes a 4 x m matrix
y_data = data(:,1)'; % y is the response variable, transposed to 1 x m
% GA optimization
teta = ga(@obj_func,4); % teta is the optimized parameter vector; yes, the GA call is only this line
% Optimized PR model
y_PR = exp(teta*x); % where teta is the parameter vector [b0 b1 b2 b3]
% Objective function
function f = obj_func(teta)
global x y_data
L = sum(y_data.*(teta*x) - exp(teta*x)); % Poisson log-likelihood (constant term dropped)
f = -L; % negate to turn maximization into minimization
end
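As a sanity check on the GA result (assuming the Statistics and Machine Learning Toolbox is available), the same Poisson model can be fitted directly with glmfit, which estimates the coefficients by iteratively reweighted least squares:

```matlab
% data is the same m-by-4 matrix used above (column 1 = response y)
b = glmfit(data(:,2:4), data(:,1), 'poisson'); % log link by default; returns [b0; b1; b2; b3]
teta_glm = b'; % should land close to the teta found by ga
```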