Fit multiple data sets with one mathematical model/function

Hi
I have the following Excel file, which contains my data.
The input data is the number of cycles (N), and the output data is the capacity at different temperatures (278.15, 298.15, 308.15, 318.15 K).
I want to fit the input and output data with a single function or mathematical formula, preferably the following equation:
C=a1*(a2-a3*N^0.5) *exp(((-a4/8.314)*(1/T-1/298.15))-((a5/8.314)*(1/T-1/298.15)^2))
Therefore, I want to get the C that corresponds to each (T, N) pair. a1-a5 are free fitting parameters.
For T, I only want it to range from 278.15 to 318.15.
Can I solve this using lsqcurvefit? Or is there a better approach?
Thanks

 Accepted Answer

I would restructure my data (see attached), and use the fitnlm function (which calls nlinfit internally) from the Statistics and Machine Learning Toolbox.
This way of structuring the data, with one row per observation and one column per variable, is known as tidy data.
tbl = readtable('mydata.xlsx');
C = @(a,X) a(1).*(a(2)-a(3).*X(:,1).^0.5).*exp(((-a(4)/8.314)*(1./X(:,2)-1/298.15))-((a(5)/8.314).*(1./X(:,2)-1/298.15).^2));
% Define initial fit coefficients
beta0 = [1 100 1 -50 100];
% Fit the model
mdl = fitnlm(tbl,C,beta0);
There are some subtleties here regarding whether there are correlated errors among these measurements. The method here assumes that the errors in each row are independent from each other.
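The restructuring step can be sketched as follows. The attached spreadsheet is not shown in this thread, so the wide layout, the variable names, and every number here are made up for illustration only.

```matlab
% Hypothetical wide-format data: one row per cycle count, one capacity
% column per temperature. All values are placeholders, not the real data.
N = [0; 100; 200; 300];              % number of cycles
T = [278.15 298.15 308.15 318.15];   % temperatures in kelvin
Cwide = [ 98 100 101 102;
          92  95  96  97;
          89  93  94  95;
          87  91  92  93];           % capacity: rows follow N, columns follow T

% Build every (N, T) combination; ndgrid gives the same layout as Cwide.
[Ngrid, Tgrid] = ndgrid(N, T);

% Tidy table: predictors first, response last, which is the default
% layout fitnlm assumes when the model is given as a function handle.
tidy = table(Ngrid(:), Tgrid(:), Cwide(:), ...
    'VariableNames', {'N', 'T', 'C'});

disp(height(tidy))   % number of observations: numel(N) * numel(T)
```

A table built this way could be passed to fitnlm in place of the one read from the spreadsheet, assuming the real file holds the same three columns in the same order.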

4 Comments

Thanks @the cyclist for your response.
I will give it a go and change the temperature in mydata.xlsx to Kelvin (i.e., add 273.15) and see if it gives me a good model for the data.
Just one question, please: how did you choose the initial values of beta0?
Thanks again
Yeah, I should have mentioned that. The first time I tried it, I just did
beta0 = [1 1 1 1 1]
The model issued a warning, but did produce coefficient estimates. So then I just picked initial values that were close to those estimates.
The model is still issuing the warning, though. Hopefully when you do the temperature transform, those will go away.
Thanks. What warning does the model produce?
Also, the RMSE is now 2. Do you consider this acceptable?
Many thanks
This is the warning:
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be
estimated well (they are not identifiable). Use caution in making predictions.
> In nlinfit (line 384)
In NonLinearModel/fitter (line 1127)
In classreg.regr/FitObject/doFit (line 94)
In NonLinearModel.fit (line 1434)
In fitnlm (line 99)
I don't think I can really comment on the error question. It's too complex a decision about whether a model is "good enough" for a particular application.


More Answers (3)

Hello AHMED, your fit function is overparameterized, which leads to multiple equivalent solutions. The parameter "a1" is redundant because it can be absorbed into a2 and a3:
C=a1*(a2-a3*N^0.5)*exp(((-a4/8.314)*(1/T-1/298.15))-((a5/8.314)*(1/T-1/298.15)^2));
The function above is exactly equivalent to the one below:
C=(a2-a3*N^0.5)*exp(((-a4/8.314)*(1/T-1/298.15))-((a5/8.314)*(1/T-1/298.15)^2));
Root of Mean Square Error (RMSE): 2.01307596538931
Sum of Squared Residual: 405.247484242812
Correlation Coef. (R): 0.962621756201578
R-Square: 0.926640645512611
Adjusted R-Square: 0.920394018937605
Determination Coef. (DC): 0.762037250139209
F-Statistic: 19.6453235739073
Parameter    Best Estimate
---------    ----------------
a2           102.235245663151
a3           1.13255810875212
a4           844.097603208569
a5           13460420.8019471
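The redundancy of a1 can be checked numerically. The sketch below evaluates the original five-parameter model for two different parameter sets in which a1 is scaled by a factor k and a2 and a3 are scaled by 1/k. The (N, T) points, the parameter values, and k are arbitrary choices for illustration, not fitted results.

```matlab
% Original five-parameter model; X(:,1) is N and X(:,2) is T in kelvin.
model = @(a, X) a(1) .* (a(2) - a(3) .* sqrt(X(:,1))) ...
    .* exp((-a(4) / 8.314) .* (1 ./ X(:,2) - 1 / 298.15) ...
         - ( a(5) / 8.314) .* (1 ./ X(:,2) - 1 / 298.15) .^ 2);

% A few arbitrary (N, T) points, chosen only for illustration.
X = [0 278.15; 100 298.15; 400 308.15; 900 318.15];

aA = [1, 100, 1, 800, 1e7];           % one parameter set
k  = 2.5;                             % any nonzero scale factor
aB = [k, 100 / k, 1 / k, 800, 1e7];   % a1 scaled up, a2 and a3 scaled down

% Largest difference between the two prediction vectors.
maxDiff = max(abs(model(aA, X) - model(aB, X)));
fprintf('max difference: %g\n', maxDiff);
```

Because the two parameter sets give the same predictions up to floating-point rounding, the data cannot distinguish between them, which is consistent with the ill-conditioned Jacobian warning.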
Thank you both, Alex Sha and the cyclist.
I have removed the a(1) parameter, and the warning quoted in the cyclist's comment has been mitigated.
I will try to enhance the model further and reduce the RMSE by manipulating the function. After that I will test it.
Thanks guys, it worked. I only need to reduce the RMSE by modifying the function.
Is there any way to supply the data and have a 'tool' or 'algorithm' suggest a mathematical model?

3 Comments

If you mean that the tool can find a function, from the entire universe of functions, then no. You have to specify the functional form of the model(s), and then fit for parameters.
Here is a simple example showing why a tool cannot simply find "the best model". Suppose your data were
x = [1 2 3];
y = [2 3 5];
Now, realize that there are an infinite number of functions that will fit those data exactly. An automated tool that finds the "best" function will simply choose one of those. But that does not mean that the model will fit a new sample of data (which is presumably what you care about). This is the problem known as overfitting.
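A minimal sketch of this point: two different functions that both pass exactly through the three data points yet disagree away from them. The second function and its coefficient of 10 are arbitrary choices made for illustration.

```matlab
x = [1 2 3];
y = [2 3 5];

% Model 1: the unique quadratic through the three points.
p  = polyfit(x, y, 2);
f1 = @(t) polyval(p, t);

% Model 2: the same quadratic plus a term that vanishes at x = 1, 2, 3.
% The coefficient 10 is arbitrary; any value gives another exact fit.
f2 = @(t) polyval(p, t) + 10 .* (t - 1) .* (t - 2) .* (t - 3);

% Both models reproduce the data (up to rounding)...
disp([f1(x); f2(x)])

% ...but they disagree at a new point such as x = 4.
disp([f1(4), f2(4)])
```

At x = 4 the first model predicts 8 and the second predicts 68, even though both have zero error on the original data.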
If you don't care about finding a functional form, then there are machine learning methods that will find a fit, and there are techniques to limit overfitting. That's beyond the scope of a simple MATLAB answer. :-)
I can recommend the book and online course Learning from Data to get a better understanding of these ideas.
Thanks. Any suggestions for improving the RMSE, such as making nlinfit robust? Currently the RMSE is 2.18. I am not convinced by the fit of the 5 °C (278.15 K) data in particular.
Usually one chooses the fitting function based on understanding of the underlying process. So, for example, is there some other factor besides number of cycles and temperature, that you are not accounting for? Is there something else that happens at lower temperature, that you can capture with another term?
Of course, it is possible that you just have noise that you cannot really account for. Then your RMSE is simply the best you can do.
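On the robust-fitting part of the question, one option is the RobustWgtFun setting in the nlinfit options structure. This is only a sketch: it needs the Statistics and Machine Learning Toolbox, it uses synthetic data with one artificial outlier rather than the data from this thread, and the starting values are guesses. Robust weighting reduces the influence of outliers; it will not help if the model form itself is wrong for the low-temperature data.

```matlab
% Reduced four-parameter model (a1 removed); X(:,1) is N, X(:,2) is T in kelvin.
model = @(b, X) (b(1) - b(2) .* sqrt(X(:,1))) ...
    .* exp((-b(3) / 8.314) .* (1 ./ X(:,2) - 1 / 298.15) ...
         - ( b(4) / 8.314) .* (1 ./ X(:,2) - 1 / 298.15) .^ 2);

% Synthetic data for illustration only (not the data from this thread).
rng(0);
[Ngrid, Tgrid] = ndgrid(0:100:900, [278.15 298.15 308.15 318.15]);
X = [Ngrid(:), Tgrid(:)];
bTrue = [100 1 800 1e7];              % made-up "true" parameters
y = model(bTrue, X) + 0.5 .* randn(size(X, 1), 1);
y(3) = y(3) - 15;                     % one artificial outlier

% Ask nlinfit for iteratively reweighted (robust) least squares.
opts = statset('nlinfit');
opts.RobustWgtFun = 'bisquare';

beta0 = [90 0.5 500 5e6];             % guessed starting values
bRobust = nlinfit(X, y, model, beta0, opts);
disp(bRobust)
```

Comparing these estimates with an ordinary fit (the same call without opts) shows how much individual points are pulling the result.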
