lsqnonlin providing very poor model fits

I am trying to estimate the values of the parameters k, C, Ib, as well as Q1(0) and Q2(0) that come from the set of equations below:
dQ1/dt=ui(t)-k*Q1(t)
dQ2/dt=k*Q1(t)-Q2(t)
Ip=C*Q2(t)*10^6+Ib
I need to etimate the values of Ip, and by comparing them to experimental data, determine the values of the unknowns listed above. The values for ui(t) are known and also come from experimental data. The experimental values of Ip are missing at random time intervals. To solve this problem I have tried implimenting the code shown below:
A=lsqnonlin(@insulin_prediction,[2 2 2 2],[0 0 0 0]);
function error=insulin_prediction (y)
filename='data.xlsx';
File=readtable(filename);
basal_insulin=File{:,2}/60;
plasma_insulin=File{:,3};
k=y(1);
C=y(2);
Ib=y(3)
Q10=y(4);
Q20=y(4)
i=1;
function dydt=ODE(t,y)
filename='data.xlsx';
File=readtable(filename);
dydt=zeros(2,1);
basal_insulin=File{:,2}/60;
if t>i*10
i=i+1;
end
ui=basal_insulin(i);
dydt(1)=ui-k*y(1);
dydt(2)=k*y(1)-k*y(2);
end
[T, Y]=ode45(@ODE,[0:10:880],[Q10 Q20]);
Ip=C*Y(:,2)*1e6+Ib;
not_nan=[];
for j=1:length(basal_insulin)
if ~isnan(plasma_insulin(j))
not_nan(end+1)=j;
end
end
error=abs(Ip(not_nan)-plasma_insulin(not_nan));
end
I apologise if the format of this post is improper, I'm new to posting in this community.

1 Comment

Please refer to the result below, note: only the data without missing Ip are used
Sum Squared Error (SSE): 991.257443450357
Root of Mean Square Error (RMSE): 4.74642794743957
Correlation Coef. (R): 0.806895419155318
R-Square: 0.651080217453836
Parameter Best Estimate
--------- -------------
k -0.000700762761913323
c -2.56141164517887E-5
ib 91.2295568000767
q1 Initial Value -3548.8889459752
q2 Initial Value 502.327269670698

Sign in to comment.

 Accepted Answer

With a few edits to make it more efficient, this seems to work —
A=lsqnonlin(@insulin_prediction,[2 2 2 2],[0 0 0 0])
Local minimum possible. lsqnonlin stopped because the size of the current step is less than the value of the step size tolerance.
A = 1×4
1.9377 0.0637 0.0373 0.1082
function error=insulin_prediction (y)
% filename='data.xlsx';
filename = 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1304895/data.xlsx';
File=readtable(filename, 'VariableNamingRule','preserve');
basal_insulin=File{:,2}/60;
plasma_insulin=File{:,3};
k=y(1);
C=y(2);
Ib=y(3);
Q10=y(4);
Q20=y(4);
i=1;
function dydt=ODE(t,y,File)
% filename='data.xlsx';
% File=readtable(filename);
dydt=zeros(2,1);
basal_insulin=File{:,2}/60;
if t>i*10
i=i+1;
end
ui=basal_insulin(i);
dydt(1)=ui-k*y(1);
dydt(2)=k*y(1)-k*y(2);
end
[T, Y]=ode45(@(t,y)ODE(t,y,File),File{:,1},[Q10 Q20]);
Ip=C*Y(:,2)*1e6+Ib;
not_nan=[];
for j=1:length(basal_insulin)
if ~isnan(plasma_insulin(j))
not_nan(end+1)=j;
end
end
error=abs(Ip(not_nan)-plasma_insulin(not_nan));
end
I defer to you to determine if those parameter estimates are appropriate. I use the time variable in the table as the ‘tspan’ argiment to ode45. This makes the calculated values match the data.
.

More Answers (2)

John D'Errico
John D'Errico on 23 Feb 2023
Edited: John D'Errico on 23 Feb 2023
You have not posted the data, so it is impossible to test your code.
Regardless, there are several common reasons why a solver like lsqnonlin will fail.
  1. You have an error in how you formulated the code. I can't even try to check that, as we lack your data to test it. (This seems a fairly simple ODE, so that is probably not the issue.)
  2. Your model simply does not fit your data. Is your data poor? Is it highly noisy? (Nonlinear least squares has a problem with high residuals (so in the presence of outliers and serious noise.) Is it not even consistent with the model you have posed? We cannot know.
  3. You have provided poor starting values for the solver. This is a common error, and i would be terribly surprised if the vector [2 2 2 2] is even close to being good starting values.
  4. Is ODE45 not convergent for this system? It is not at all uncommon to see a solver like ODE45 fail for biological or biochemical problems, since they are often stiff systems. (I would note the presence of a constant like 10^6 in one of the equations. That to me would strongly suggest the possibility of a stiff system.) ODE45 may easily be diverging for some sets of parameter values. You MIGHT try one of the stiff solvers. Let me say it differently - I would NEVER suggest using ODE45 as the default solver for bio(anything) problems.
With some time, I could probably suggest at least a couple of other failure modes.

1 Comment

I have updated the original post to include the data referenced. Thank you for your suggestions.

Sign in to comment.

Your code is incorrect. You have a shared variable "i" that you increment each time t>10*u and you use i to select coefficients.
Your assumption in that code is that t never decreases. That is an incorrect assumption for ode45. ode45 deliberately probes to longer and longer time frames until it encounters a place where the slope is too steep to match the time frame, and then it retreats backwards to shorter time frames a bit at a time until it gets to a time frame that the approximations are good enough for. This is not an exception condition that "seldom" happens: ode45 deliberately runs up the time increment until it fails. It is like seeing how fast you can go in Mario cart but deliberately speeding up on the curve until you run off the track and then restarting using a slower maximum speed, only to deliberately speed up again after you pass that point.

1 Comment

You are reading the file once for each call to insulin_prediction and once for each call to ODE within that. That is a lot of unnecessary reading of the file. See http://www.mathworks.com/help/matlab/math/parameterizing-functions.html
If your basal_insulin is sampling a continuous function that can be usefully approximated by a piecewise quadratic, then you should use interp1() with 'spline' . However, if basal_insulin is sampling impulses (such as injections) then you need to stop the ode*() call at each abrupt change and restart it. The ode*() functions rely on the second derivatives of your provided function being continuous during any one call to ode*(), and will give wrong answers (or an error message) if they are not.

Sign in to comment.

Products

Release

R2022b

Asked:

on 23 Feb 2023

Commented:

on 2 Mar 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!