# Fit multiple data sets with one mathematical model/function

AHMED FAKHRI on 21 Feb 2020
Commented: the cyclist on 24 Feb 2020
Hi
I have the following Excel file, which contains my data.
The input data is the number of cycles (N), and the output data is the capacity at four temperatures (278.15, 298.15, 308.15, and 318.15).
I want to fit the input and output data with a single function or mathematical formula, preferably the following equation:
C=a1*(a2-a3*N^0.5) *exp(((-a4/8.314)*(1/T-1/298.15))-((a5/8.314)*(1/T-1/298.15)^2))
Therefore, what I want is to get the C that corresponds to each (T, N) pair. a1-a5 are free fitting parameters.
For T, I only want it to vary from 278.15 to 318.15.
Can I solve this using lsqcurvefit, or is there a better approach?
Thanks

the cyclist on 21 Feb 2020
I would restructure the data (see attached) and use the fitnlm function from the Statistics and Machine Learning Toolbox.
This way of structuring the data is known as tidy data.
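% Model function: a = coefficients, X = predictor matrix with columns [N T]
C = @(a,X) a(1).*(a(2)-a(3).*X(:,1).^0.5).*exp(((-a(4)/8.314)*(1./X(:,2)-1/298.15))-((a(5)/8.314).*(1./X(:,2)-1/298.15).^2));
% Define initial fit coefficients
beta0 = [1 100 1 -50 100];
% Fit the model; tbl is the tidy table from the attachment (predictors N and T
% first, response last, as the model's X(:,1) and X(:,2) require)
mdl = fitnlm(tbl,C,beta0);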
There are some subtleties here regarding whether there are correlated errors among these measurements. The method here assumes that the errors in each row are independent from each other.
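The restructuring step can be sketched as follows. The attached file is not reproduced in this thread, so the wide-format matrix below is synthetic placeholder data, and the names capWide, Tvals, and the table columns N, T, and Capacity are assumptions made for illustration only.

```matlab
% Synthetic stand-in for the spreadsheet: one row per cycle count,
% one capacity column per temperature (placeholder values, not real data).
N       = (0:100:900)';                      % cycle counts, 10x1
Tvals   = [278.15 298.15 308.15 318.15];     % temperatures, 1x4
capWide = 100 - 1.1*sqrt(N) + [-2 0 1 2];    % 10x4 via implicit expansion

% Wide -> tidy: one row per (N, T) observation.
[Ngrid, Tgrid] = ndgrid(N, Tvals);           % both 10x4, matching capWide
tbl = table(Ngrid(:), Tgrid(:), capWide(:), ...
    'VariableNames', {'N', 'T', 'Capacity'});

% The tidy table has numel(N)*numel(Tvals) rows. fitnlm treats the last
% variable (Capacity) as the response and passes [N T] to the model as X.
disp(size(tbl))
```

With tbl in this shape, the fitnlm(tbl,C,beta0) call from the post applies directly. Since lsqcurvefit expects a model of the form fun(params, xdata), the same handle should also work as lsqcurvefit(C, beta0, [tbl.N tbl.T], tbl.Capacity) if the Optimization Toolbox is preferred.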
the cyclist on 21 Feb 2020
This is the warning:
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be
estimated well (they are not identifiable). Use caution in making predictions.
> In nlinfit (line 384)
In NonLinearModel/fitter (line 1127)
In classreg.regr/FitObject/doFit (line 94)
In NonLinearModel.fit (line 1434)
In fitnlm (line 99)
I don't think I can really comment on the error question. It's too complex a decision about whether a model is "good enough" for a particular application.

Alex Sha on 22 Feb 2020
Hello, AHMED, your fit function is overparameterized, which leads to multiple solutions; the parameter a1 is redundant:
C=a1*(a2-a3*N^0.5)*exp(((-a4/8.314)*(1/T-1/298.15))-((a5/8.314)*(1/T-1/298.15)^2));
Because a1 only multiplies (a2-a3*N^0.5), it can be absorbed into a2 and a3, so the function above has exactly the same effect as the one below:
C=(a2-a3*N^0.5)*exp(((-a4/8.314)*(1/T-1/298.15))-((a5/8.314)*(1/T-1/298.15)^2));
Root of Mean Square Error (RMSE): 2.01307596538931
Sum of Squared Residual: 405.247484242812
Correlation Coef. (R): 0.962621756201578
R-Square: 0.926640645512611
Determination Coef. (DC): 0.762037250139209
F-Statistic: 19.6453235739073
Parameter    Best Estimate
---------    ----------------
a2           102.235245663151
a3           1.13255810875212
a4           844.097603208569
a5           13460420.8019471
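The redundancy of a1 can be checked numerically with a minimal sketch like the one below. The (N, T) points are illustrative and are not the thread's data, and the names C5, aA, and aB are hypothetical. Any two parameter vectors with the same products a1*a2 and a1*a3 should give the same predictions.

```matlab
% Model as posted, with a(1) included; X = [N T].
R  = 8.314;
C5 = @(a,X) a(1).*(a(2)-a(3).*sqrt(X(:,1))) ...
    .*exp((-a(4)/R).*(1./X(:,2)-1/298.15) - (a(5)/R).*(1./X(:,2)-1/298.15).^2);

% A few illustrative (N, T) points; not the thread's data.
X = [0 278.15; 100 298.15; 400 308.15; 900 318.15];

% Two different parameter vectors with equal a1*a2 and a1*a3.
aA = [1 100 1   -50 100];    % the beta0 used earlier in the thread
aB = [2  50 0.5 -50 100];    % a1 doubled, a2 and a3 halved

predA = C5(aA, X);
predB = C5(aB, X);
maxRelDiff = max(abs(predA - predB) ./ abs(predA));
fprintf('max relative difference: %.3g\n', maxRelDiff);
```

Since different parameter vectors reproduce the same curve, the optimizer cannot pin down a1, a2, and a3 individually, which is consistent with the ill-conditioned Jacobian warning quoted earlier.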

AHMED FAKHRI on 22 Feb 2020
Thank you both Alex Sha and the cyclist
I have removed the a(1) parameter, and the warning quoted in the cyclist's comment has been mitigated.
I will try to enhance the model further and reduce the RMSE by manipulating the function. After that I will test it.
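When comparing RMSE values across tools while modifying the function, it helps to know which definition each tool uses. The sketch below only reuses the statistics reported above; the interpretation that RMSE = sqrt(SSE/n) is an assumption about how those numbers were produced, tested here by computing the number of points it would imply.

```matlab
% Values copied from the statistics reported above.
sse  = 405.247484242812;    % Sum of Squared Residual
rmse = 2.01307596538931;    % Root of Mean Square Error
r    = 0.962621756201578;   % Correlation Coef. (R)

% If RMSE = sqrt(SSE/n), the implied number of data points is:
nImplied = sse / rmse^2;
fprintf('implied n: %.4f\n', nImplied);

% R-Square recomputed as the square of the correlation coefficient.
fprintf('R^2 from R: %.6f\n', r^2);
```

The implied n comes out very close to a whole number, which suggests the reported RMSE divides by the number of observations. As far as I know, the RMSE property of a fitnlm model divides by the error degrees of freedom instead (observations minus coefficients), so the two values should not be expected to match exactly for the same fit.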

AHMED FAKHRI on 24 Feb 2020
Thanks guys, it worked. I only need to reduce the RMSE by modifying the function.
Is there any way to give the data to a 'tool' or 'algorithm' that suggests a mathematical model?
the cyclist on 24 Feb 2020
Usually one chooses the fitting function based on understanding of the underlying process. So, for example, is there some other factor besides number of cycles and temperature, that you are not accounting for? Is there something else that happens at lower temperature, that you can capture with another term?
Of course, it is possible that you just have noise that you cannot really account for. Then your RMSE is simply the best you can do.