how to use all the data for a nonlinear regression?

2 views (last 30 days)
Hi everyone,
I'm modeling some of my data. As you can see from the attached file, there are 4 columns: time, biomass, hydrogen, acetate. In reality I have more data than hydrogen, but I don't know how to do the integration, since biomass and acetate have less data.
Attached you will find the model and complete data. The model works if you remove the additional data from the third column.
Thanks everyone for the advice!

Accepted Answer

Star Strider
Star Strider on 12 Oct 2023
You can only do the regression on complete data rows, so I used the rmmissing function to accomplish that.
Try something like this —
type('A.m')
clear all load data1.txt ld=size(data1,1); y=[data1;data1]; [t,ix]=sort(y(:,1)); y=y(ix,:); t=y(:,1); x=y(:,2); s=y(:,3); p=y(:,4); c = [x s p]; options = optimset('MaxFunEvals',100000,'MaxIter',100000); theta0=[0.03 0.1 0.3 0.1]'; % il simbolo ' significa trasposto options = optimoptions('lsqcurvefit','Algorithm','levenberg-marquardt'); [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c,zeros(1,numel(theta0))); ci = nlparci(theta,Rsd,'Jacobian',Jmat); format shortE ParametersCI = table(ci(:,1), theta, ci(:,2), 'VariableNames',{'Lower 95% CI','Parmaeter Estimate','Upper 95% CI'}) C = kinetics(theta,t); hold on yyaxis right set(gca, 'ycolor', 'k'); % Imposta il colore dell'asse sinistro a nero plot(t, c(:,1), 'ok', 'MarkerFaceColor', 'k') % Pallini neri ripieni xlabel('Time (days)') ylabel('Concentration (mmol/L)') yyaxis left set(gca, 'ycolor', 'k'); % Imposta il colore dell'asse destro a nero plot(t,c(:,2),'ok') % Pallini neri yyaxis right set(gca, 'ycolor', 'k'); % Imposta il colore dell'asse destro a nero plot(t,c(:,3),'^k', 'MarkerFaceColor', [0.5 0.5 0.5], 'MarkerEdgeColor', 'k') yyaxis right set(gca, 'ycolor', 'k'); % Imposta il colore dell'asse destro a nero plot(t, C(:,1), '--k', 'LineWidth', 2) % Linea nera tratteggiata più spessa yyaxis left set(gca, 'ycolor', 'k'); % Imposta il colore dell'asse sinistro a nero plot(t, C(:,2), '-.k', 'LineWidth', 1) % Linea nera a puntini più spessa yyaxis right set(gca, 'ycolor', 'k'); % Imposta il colore dell'asse destro a nero plot(t, C(:,3), '-k', 'LineWidth', 1) % Linea grigia con contorno nero hold off legend('biomassa','idrogeno','acetato') hold off function C=kinetics(theta,t) c0 = [0.77;25.1;0.0]; u=theta(1); YH=theta(2); YAC=theta(3); Ks = theta(4) lt=length(t); TF=zeros(lt,1); YF=zeros(lt,3); YF(1,:)=c0; for i=1:lt-1 if t(i+1)>t(i) [T,Cv]=ode45(@DifEq,t(i:i+1),c0); TF(i+1)=t(i+1); YF(i+1,:)=Cv(end,:); c0=YF(i+1,:); else TF(i+1)=t(i+1); YF(i+1,:)=YF(i,:); c0=YF(i+1,:); end end function dC=DifEq(t,c) dcdt=zeros(3,1); Kh=0.000072; %mmol/L atm Vg= 0.1; Vl = 0.15; Te=298.15; R=0.082; %[(l ∙ atm) / (mmol ∙ K)] K = Kh*R*Te; a=(Vg+Vl*K)/Vl; ua=u*(c(2)/(c(2)+(Ks/Kh))); dcdt(1)=ua*c(1); dcdt(2)= -1/YH*ua*c(1)/a; dcdt(3)= 1/YAC*ua*c(1); dC=dcdt; end C=YF; end
T1 = readtable('data1.txt')
T1 = 15×4 table
Var1 Var2 Var3 Var4 ____ ____ _____ ____ 0 0.77 20.03 0 3 NaN 19.89 NaN 4 0.89 24.2 0.55 6 NaN 21.6 NaN 7 1 20.8 0.61 10 0.95 18.9 0.84 11 NaN 17.51 NaN 12 1.32 15.87 1.22 13 NaN 15 NaN 14 1.27 11.72 1.39 17 NaN 10.76 NaN 18 1.35 7.77 1.42 19 NaN 7.64 NaN 20 NaN 6.69 NaN 21 1.4 5.47 1.48
T1rm = rmmissing(T1)
T1rm = 8×4 table
Var1 Var2 Var3 Var4 ____ ____ _____ ____ 0 0.77 20.03 0 4 0.89 24.2 0.55 7 1 20.8 0.61 10 0.95 18.9 0.84 12 1.32 15.87 1.22 14 1.27 11.72 1.39 18 1.35 7.77 1.42 21 1.4 5.47 1.48
writematrix(table2array(T1rm),'data1.txt')
run('A.m')
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.1000
Ks = 0.0491
Ks = 0.0491
Ks = 0.0491
Ks = 0.0491
Ks = 0.0491
Ks = 0.0922
Ks = 0.0922
Ks = 0.0922
Ks = 0.0922
Ks = 0.0922
Ks = 0.0724
Ks = 0.0724
Ks = 0.0724
Ks = 0.0724
Ks = 0.0724
Ks = 0.0882
Ks = 0.0882
Ks = 0.0882
Ks = 0.0882
Ks = 0.0882
Ks = 0.0781
Ks = 0.0781
Ks = 0.0781
Ks = 0.0781
Ks = 0.0781
Ks = 0.0624
Ks = 0.0624
Ks = 0.0624
Ks = 0.0624
Ks = 0.0624
Ks = 0.0744
Ks = 0.0744
Ks = 0.0744
Ks = 0.0744
Ks = 0.0744
Ks = 0.0674
Ks = 0.0674
Ks = 0.0674
Ks = 0.0674
Ks = 0.0674
Ks = 0.0586
Ks = 0.0586
Ks = 0.0586
Ks = 0.0586
Ks = 0.0586
Ks = 0.0573
Ks = 0.0573
Ks = 0.0573
Ks = 0.0573
Ks = 0.0573
Ks = 0.0520
Ks = 0.0520
Ks = 0.0520
Ks = 0.0520
Ks = 0.0520
Ks = 0.0344
Ks = 0.0344
Ks = 0.0344
Ks = 0.0344
Ks = 0.0344
Ks = 0.0058
Ks = 0.0058
Ks = 0.0058
Ks = 0.0058
Ks = 0.0058
Ks = 0.0054
Ks = 0.0054
Ks = 0.0054
Ks = 0.0054
Ks = 0.0054
Ks = 2.7015e-04
Ks = 2.7015e-04
Ks = 2.7015e-04
Ks = 2.7015e-04
Ks = 2.7017e-04
Ks = 2.7792e-04
Ks = 2.7792e-04
Ks = 2.7792e-04
Ks = 2.7792e-04
Ks = 2.7794e-04
Ks = 2.7794e-04
Ks = 2.7794e-04
Ks = 2.7794e-04
Ks = 2.7794e-04
Ks = 2.7795e-04
Ks = 2.1860e-04
Ks = 2.1860e-04
Ks = 2.1860e-04
Ks = 2.1860e-04
Ks = 2.1862e-04
Ks = 1.8626e-04
Ks = 1.8626e-04
Ks = 1.8626e-04
Ks = 1.8626e-04
Ks = 1.8627e-04
Ks = 1.5710e-04
Ks = 1.5710e-04
Ks = 1.5710e-04
Ks = 1.5710e-04
Ks = 1.5712e-04
Ks = 1.6609e-04
Ks = 1.6609e-04
Ks = 1.6609e-04
Ks = 1.6609e-04
Ks = 1.6611e-04
Ks = 1.6240e-04
Ks = 1.6240e-04
Ks = 1.6240e-04
Ks = 1.6240e-04
Ks = 1.6241e-04
Ks = 1.6397e-04
Ks = 1.6397e-04
Ks = 1.6397e-04
Ks = 1.6397e-04
Ks =
1.6399e-04
Local minimum possible. lsqcurvefit stopped because the final change in the sum of squares relative to its initial value is less than the value of the function tolerance.
ParametersCI = 4×3 table
Lower 95% CI Parmaeter Estimate Upper 95% CI ____________ __________________ ____________ 1.6521e-03 5.5335e-02 1.0902e-01 1.6049e-02 9.7913e-02 1.7978e-01 -3.1424e-02 7.0951e-01 1.4504e+00 -4.8476e-04 1.6397e-04 8.1271e-04
Ks =
1.6397e-04
for sorting the matrix (that actually does not need to be sorted), the sortrows function may be more efficient here.
Thank you for quotimg my code!
.
  6 Comments
Torsten
Torsten on 13 Oct 2023
From ode45, you get a nx3 matrix of values Cv of simulated values. From the measurements, you also have a nx3 matrix c of values, but some of them are NaN. Now form c-Cv for all values c(i,j) for which c(i,j) is not NaN, make these values a vector and return it to lsqnonlin. This way, you consider all values in the c-matrix that are not NaN, not only those where all three values in a row are not NaN.
Star Strider
Star Strider on 13 Oct 2023
To me, that means using the entire time vector to simulate the original matrix, and plotting that. That is what I usually do to create the fitted data for the plot. I believe that was done here as well, since the plotted fit lines appear smooth.

Sign in to comment.

More Answers (1)

Pavl M.
Pavl M. on 12 Oct 2023
Edited: Pavl M. on 12 Oct 2023
Hi,
I carefully attain further requirements/demands:
It's okay, normal right, you do there curve-fitting by l.-m. algorithm, 3 values in c matrix as a functions of time,I checked the data file timestamps to values consistency ( number of quantities in each column must be equal number of timestamps (sampling points in time - length of the first column)), then you solve custom differential equation.
My answer: remove NaN and Inf values from input by:
c(isnan(c))=0; % (or value to impute) and input can be your c matrix or y(:,i) vectors,
%c(isinf(c)) = max_num; % if there are Inf in inputs
and run, what d'you get?

Products


Release

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!