Estimate Parameters for System of ODEs with given Data using a Genetic Algorithm (COVID-19-Model)

60 views (last 30 days)
Dear MATLAB experts! I am from Switzerland, developing a compartmental model for COVID-19 in school, and need some urgent help! I am definitely a beginner when it comes to coding but I managed to program my model. However, I would like to fit the twelve model parameters (here p(1) to p(12)) describing the transfer rates to some given official data in order to increase the model's accuracy. For this purpose, I have created a program that uses a genetic algorithm for the task. The problem is that I only have data for three of the seven states (Susceptible, Infected, Diagnosed, Ailing, Recovered, Dead, Total Population) and I don't quite get reasonable values - I suppose there are some major issues. Also, when setting the initial population (here from Italy) to 60e6, nothing happens (maybe the calculations are to difficult and take too long?). Can anyone update my code so that it works properly and gives me the suiting parameters between 0 and 1? I would be really, really grateful!! 🇨🇭 Here is my code:
function EIDAGG_MODELL_PARAMETER
clear
close all
clc
%7 ODEs to describe the model (E,I,D,A,G1,G2,N)
function M = EIDAGG(p,t)
%Initial conditions with population of 60e6, 200 infected and 20 diagnosed
%people
M0 = [60e6;200;20;0;0;0;60e6];
[~,m] = ode45(@DifEq,t,M0);
function dM = DifEq(~,k)
dMdt = zeros(7,1);
dMdt(1) = -k(1).*(p(1).*k(2)+p(2).*k(3)+p(3).*k(4)+p(12))+p(11).*k(7);
dMdt(2) = k(1).*(p(1).*k(2)+p(2).*k(3)+p(3).*k(4))-(p(4)+p(5)+p(6)+p(12)).*k(2);
dMdt(3) = p(4).*k(2)-(p(7)+p(8)+p(12)).*k(3);
dMdt(4) = p(6).*k(2)+p(8).*k(3)-(p(9)+p(10)+p(12)).*k(4);
dMdt(5) = p(5).*k(2)+p(7).*k(3)+p(9).*k(4)-p(12).*k(5);
dMdt(6) = p(10).*k(4);
dMdt(7) = -p(12).*(k(1)+k(2)+k(3)+k(4)+k(5))+p(11).*k(7);
dM = dMdt;
end
M = m;
end
%From day 1 to day 20
t = [1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20];
%Official data for I, G1 and G2 for the first 20 days
k = [3 0 0
19 0 1
77 0 2
129 1 2
213 1 5
311 1 10
385 3 12
588 45 17
821 46 21
1049 50 29
1577 83 34
1835 149 52
2263 160 79
2706 276 107
3296 414 148
3916 523 197
5061 589 233
6387 622 366
7985 724 463
8514 1004 631];
% [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c);
ftns = @(p) norm(k-EIDAGG2(p,t));
PopSz = 500;
Parameter = 12;
opts = optimoptions('ga','PopulationSize',PopSz,'InitialPopulationMatrix',randi(1,PopSz,Parameter)*1E-3,'MaxGenerations',2E3,'PlotFcn','gaplotbestf');
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n',t0)
% [B,fval,exitflag,output,population,scores] = ga(ftns, 53, [],[],[],[],zeros(53,1),Inf(53,1),[],[],opts)
%Parameters should have values between 0 and 1
[p,fval,exitflag,output] = ga(ftns,Parameter,[],[],[],[],zeros(Parameter,1),[1;1;1;1;1;1;1;1;1;1;1;1],[],[],opts);
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n',t1)
GA_Time = etime(t1,t0);
fprintf('\nElapsed Time: %23.15E\t\t%02d:%02d:%02d.%03d\n',GA_Time)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(p)
fprintf(1,'\t\tp(%d) = %8.5f\n',k1,p(k1))
end
tv = linspace(min(t),max(t));
Mfit = EIDAGG(p,tv);
figure(1)
plot(t,k,'p')
hold on
hlp = plot(tv,Mfit);
hold off
grid
xlabel('Zeit')
ylabel('Anzahl Personen')
legend(hlp,'Empfänglich','Infiziert','Diagnostiziert','Akut gefährdet','Genesen','Gestorben','Location','NE')
function solpts = EIDAGG2(p,t)
solpts = EIDAGG(p,t);
solpts = solpts([2,5,6]);
end
end

Accepted Answer

Star Strider
Star Strider on 29 Sep 2021
You found the code I was intending to direct you to when I just now saw this post. (My apologies for not seeing it earlier. Thanks to Image Analyst for editing it, otherwise I would have missed it.)
The problem is that you are fitting 3 variables, however the code returns 7 variables. It will be necessary to select the 3 variables you want to fit, and then return only those variables. Do that in the ‘M’ variable, or in the ‘ftns’ function. (Doing it in ‘ftns’ would actually be easier, because you can then display and plot the entire output in figure(1)).
Also, estimate the initial conditions as parameters. That generally results in a much more successful fit.
I can probably edit your code to do all this, however I need to know what columns of the output you want to fit to the data. That is not currently obvious to me.
.
  15 Comments
Dylan
Dylan on 1 Oct 2021
Thanks for your answer, I can see the difficulty now! Do you agree with me that the ODEs were wrong before? And is it impossible to set initial conditions for the compartments k(1:7)? And last but not least, does it make sense to set strict boundaries for the parameters and are the ones which are modeled and not guessed those which appear in the modeled ODEs? Thanks and sorry for the disturbance!
Star Strider
Star Strider on 1 Oct 2021
As always, my pleasure!
No worries!
Bounding the parameters is appropriate of the assumption is that in a compartmental model (that all SIR models are), the kinetic parameters must always be on the interval (0,1).
The initial conditions can be whatever you want them to be. The ga algorithm will evolve a suitable set (in my code, since I estimate them with the other parameters) that will produce appropriate results. It is possible to set them as static non-parameters, however that generally leads to an inferior fit, since they are not absolutely certain and so should be estimated parameters.
The problem is not that the unmodeled differential equations cannot be directly estimated. It is instead that the parameters in them that are not parameters of the estimated differential equations cannot be reliably estimated. That will require any optimisation routine to ‘guess’ what they should be, and that will rarely be in any way accurate, much less precise.
.

Sign in to comment.

More Answers (0)

Categories

Find more on Problem-Based Optimization Setup in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!