I have been able to change this: [ones(1,20)*1E-3 ones(1,8)*10]. It seems to work but is taken long time. Please why is it so?
Parameter estimation of SEIHRDBP model
5 views (last 30 days)
Show older comments
% I am back please. I have gotten the data correct data with date. The whole idea is that I want the parameter estimation for parameters for SEI. I.e., run the parameter estimation these classes (SEI). These are only the active human class. The parameters associated with classes are theta16, theta17, theta18, theta19, theta20, theta3, theta7 and theta8.
% I tried to run the new code use the "ga" algorithm but I got the error:
% Arrays have incompatible sizes for this operation.
% Arrays have incompatible sizes for this operation.
%
% Error in untitled10>SEIHRDBP_MODEL_PARAMETER (line 61)
% opts = optimoptions('ga','PopulationSize',PopSz,'InitialPopulationMatrix',randi(1E+4,PopSz,Parameter).*[ones(1,12)*1E-3 ones(1,7)*10],'MaxGenerations',2E3)%,'PlotFcn','gaplotbestf');
%
% Error in untitled10 (line 2)
% SEIHRDBP_MODEL_PARAMETER
% In fact, this is my new corrected codes and new data is attached.
SEIHRDBP_MODEL_PARAMETER
function SEIHRDBP_MODEL_PARAMETER
%7 ODEs to describe the model (S,E,I,H,R,D,B, P)
function M = SEIHRDBP(p,t)
%Initial conditions with population
M0 = p(21:28);
[~,m] = ode15s(@DifEq,t,M0);
function dM = DifEq(~,k)
dMdt = zeros(8,1);
dMdt(1) = p(1) - (p(16)*k(3) + p(17)*k(4) + p(18)*k(6)+ p(19)*k(7) + p(20)*k(8))*k(1)-p(2)*k(1);
dMdt(2) = (p(16)*k(3) + p(17)*k(4) + p(18)*k(6)+ p(19)*k(7) + p(20)*k(8))*k(1) - (p(2) + p(3))*k(2);
dMdt(3) = p(3)*k(2) - (p(2) + p(4) + p(5))*k(1);
dMdt(4) = p(5)*k(3) - (p(6) + p(7) + p(2))*k(4);
dMdt(5) = p(7)*k(4) - p(2)*k(5);
dMdt(6) = p(4)*k(3) + p(6)*k(4) -p(8)*k(6);
dMdt(7) = p(8)*k(6) - p(9)*k(7);
dMdt(8) = p(10) + p(14)*k(3) + p(15)*k(4) + p(11)*k(6) + p(12)*k(7) -p(13)*k(8);
dM = dMdt;
end
M = m;
end
%From day Aug to nov 20
t = [1
2
3
4
5
6
7
8
9
10];
%Official data for I, D for 10 days
k = [24 23
62 35
68 41
70 42
70 42
70 43
68 49
67 49
67 49
66 49];
% [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c);
function fv = ftns(p,t)
SEIHRDBPout = SEIHRDBP(p,t);
fv = norm(k-SEIHRDBPout(:,[3 6]));
end
% ftns = @(p) norm(k-EIDAGG2(p,t));
PopSz = 50;
% Parameter = 12;
Parameter = 28;
opts = optimoptions('ga','PopulationSize',PopSz,'InitialPopulationMatrix',randi(1E+4,PopSz,Parameter).*[ones(1,20)*1E-3 ones(1,8)*10],'MaxGenerations',2E3)%,'PlotFcn','gaplotbestf');
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n',t0)
[p,fval,exitflag,output] = ga(@(p)ftns(p,t),Parameter,[],[],[],[],zeros(Parameter,1),[ones(20,1); Inf(8,1)],[],[],opts);
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n',t1)
GA_Time = etime(t1,t0);
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf(1,'\n\tRate Constants:\n')
for k1 = 1:12
fprintf(1,'\t\tp(%2d) = %11.8f\n',k1,p(k1))
end
fprintf(['\n\tInitial Conditions:\n'])
for k1 = 1:8
fprintf('\t\tdMdt(%d) = %11.8f\n',k1,p(k1))
end
tv = linspace(min(t),max(t));
Mfit = SEIHRDBP(p,tv);
figure(1)
plot(t,k,'p')
hold on
hlp = plot(tv,Mfit);
hold off
grid
xlabel('time')
ylabel('Population')
legend(hlp,'S','E','I','H','R','D','B','P',Location','best')
end
Accepted Answer
Star Strider
on 5 Mar 2024
Edited: Star Strider
on 5 Mar 2024
The time values are incorrect.
The complete table of those values should be —
Date Infected Death
____ ________ _____
0 24 13
16 62 35
30 68 41
34 70 42
37 70 42
43 70 43
50 68 49
60 67 49
66 67 49
80 66 49
When I experimented with a similar initial population matrix to:
'InitialPopulationMatrix',randi(1E+4,PopSz,Parameter).*[ones(1,20)*1E-3 ones(1,8)*10]
the code crashed, even using ode15s, because the differential equaiton system encountered a singularity and stopped.
The code I used —
T0 = readtable('observed.csv')
VN = T0.Properties.VariableNames % Use These Variable Names
T1 = readtable('mydata.csv');
T1{end-1,1} = datetime(2014,10,31) % Correct Typographical Error In 'Date' Entry
% figure
% plot(T1{:,1}, T1{:,2:end})
% grid
% return
VN1 = T1.Properties.VariableNames;
L = size(T1,1);
t = day(T1{:,1},'dayofyear'); % Time Vector (NECESSARY) Units = 'dayofyear'
t = t - t(1);
c = T1{:,2:end};
% Data = table(t, c(:,1),c(:,2), 'VariableNames',{'Date','Infected','Death'}) % Show Once Then Coomment Out
ftns = @(theta) norm(c-SEIHRDBP(theta,t));
PopSz = 500;
Parms = 28;
% optsAns = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-2, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10); % Options Structure For 'Answers' Problems
opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms).*[ones(1,20)*1E+3 ones(1,8)*0.01], 'MaxGenerations',1E2, 'FunctionTolerance',1E-10, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
% optshf = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'HybridFcn',@fmincon, 'PlotFcn',@gaplotbestf, 'PlotInterval',1); % With 'HybridFcn'
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
[theta,fval,exitflag,output,population,scores] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],opts);
t1 = clock;
sprintf(['B = [' repmat('%.6f ',1,28) ']'],theta)
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf('Fitness value at convergence = %.4f\nGenerations \t\t\t\t = %d\n\n',fval,output.generations)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(theta)
fprintf(1, '\t\tTheta(%2d) = %8.5f\n', k1, theta(k1))
end
tv = linspace(min(t), max(t));
tv = t;
Cfit = SEIHRDBP(theta, tv)
figure
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
Ax = gca;
Ax.YScale = 'log';
xlabel('Time')
ylabel('Concentration')
% legend(hlp, compose('C_%d',1:size(c,2)), 'Location','N')
legend(hlp, VN, 'Location','best')
function C=SEIHRDBP(theta,t)
c0=theta(end-7:end);
[T,Cv]=ode15s(@DifEq,t,c0);
%
function dC=DifEq(t,c)
dcdt=zeros(8,1);
dcdt(1) = theta(1) - (theta(16)*c(3) + theta(17)*c(4) + theta(18)*c(6)+ theta(19)*c(7) + theta(20)*c(8))*c(1)-theta(2)*c(1);
dcdt(2) =(theta(16)*c(3) + theta(17)*c(4) + theta(18)*c(6)+ theta(19)*c(7) + theta(20)*c(8))*c(1) - (theta(2) + theta(3))*c(2);
dcdt(3) = theta(3)*c(2) - (theta(2) + theta(4) + theta(5))*c(1);
dcdt(4) = theta(5)*c(3) - (theta(6) + theta(7) + theta(2))*c(4);
dcdt(5) = theta(7)*c(4) - theta(2)*c(5);
dcdt(6) = theta(4)*c(3) + theta(6)*c(4) -theta(8)*c(6);
dcdt(7) = theta(8)*c(6) - theta(9)*c(7);
dcdt(8) = theta(10) + theta(14)*c(3) + theta(15)*c(4) + theta(11)*c(6) + theta(12)*c(7) -theta(13)*c(8);
dC=dcdt;
end
C=Cv(:,[3 6]); % Return Only 'Infected' & 'Deaths'
end
(these differential equations use different variable names, however I believe the system is the same as in the initial question post) produced (after about 70 generations over about 4 minutes with MATLAB Online) what I consider to be insane parameter values for a kinetic system, those being:
Rate Constants:
Theta( 1) = 72614000.00000
Theta( 2) = 96602001.00000
Theta( 3) = 1055000.00010
Theta( 4) = 47392001.01295
Theta( 5) = 19519002.01651
Theta( 6) = 87092000.06967
Theta( 7) = 73074002.00018
Theta( 8) = 48324999.00702
Theta( 9) = 96660001.00649
Theta(10) = 9511000.00048
Theta(11) = 12759002.00011
Theta(12) = 73740000.00532
Theta(13) = 89201001.00000
Theta(14) = 31149000.05865
Theta(15) = 41402000.00567
Theta(16) = 31798000.00065
Theta(17) = 8765000.01877
Theta(18) = 5103001.50109
Theta(19) = 6618002.93761
Theta(20) = 23122001.00077
Theta(21) = 872.77225
Theta(22) = 875.78264
Theta(23) = 60.38500
Theta(24) = 280.22130
Theta(25) = 783.93000
Theta(26) = 44.87735
Theta(27) = 980.88375
Theta(28) = 344.53288
Then with those parameters, throwing this message:
Warning: Failure at t=1.870129e-05. Unable to meet integration tolerances without reducing the step size below the smallest value allowed (5.421011e-20) at time t.
and not going further.
So the code crashed again for essentially the same reason, and the last plot (showing the data and the estimated fit) did not display.
I suspect something is wrong with the system of differential equations, however I cannot determine what that problem is. They could be coded correctly from a published paper and the paper itself is in error. (That would not be the first time I would have encountered that problem, since I spent an entire semester in grad school working to replicate an interesting paper that used neural nets for EKG classification that when coded according to the description in the paper and using the online data provided, resulted in entirely different — and significantly conflicting — results than those published. My professor reviewed the paper and my code and my results and agreed with me. It was a learning experience for both of us.)
.
24 Comments
Star Strider
on 7 Mar 2024
As always, my pleasure!
I need to spend some time with Python. A least squares (gradient descent) approach can work with relatively simple problems, however your system has 20 kinetic parameters (and in my approach 28 including the initial conditions for the differential equations), so gradient descent is less likely to be successful unless the initial parameter estimates are relatively close to the best values. For that reason, I chose the genetic algorithm approach. I am currently running it, and it is converging reasonably well after nearly an hour. (It seems to be stalling just now, so it may be close to converging.) it is experiencing a ‘graphics timeout’ so it is depriving me of progress update information right now. I will post the results when it provides them.
.
More Answers (0)
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!