Optimizing a Regression Learner App for an Electrochemical NO2 Sensor: Dealing with Drift and Input Variations
Hello,
I am currently using the Regression Learner App to develop a GPR Exponential model for my Electrochemical NO2 Sensor. This sensor outputs a voltage, and I use reference data alongside temperature and humidity measurements to train my model.
Initially, after creating a model with the App, I find that the GPR Exponential model aligns reasonably well with the sensor data. However, over time, I have noticed a slight drift in the data. I don't believe that this drift is a result of the sensor itself. Instead, it may be influenced by new combinations of sensor output voltage, temperature, humidity, and reference data values, which the model might not have encountered during the training process.
If I rerun the Regression Learner App to update or create a new GPR Exponential model, the sensor output appears to be accurate again. This leads me to believe that the need to retrain the model might be due to changes in the combination of the input parameters.
Considering the potential for a wide array of different parameter combinations, how can I optimize my model to predict more accurately?
Moreover, could the nature of my temperature input impact the prediction? Specifically, would there be a noticeable difference in the accuracy of predictions if I input the absolute temperature compared to inputting the temperature segmented into smaller blocks?
I'm curious to know if anyone else has had similar experiences with their models? Any insights or suggestions to enhance the performance of my GPR Exponential model would be greatly appreciated.
6 Comments
dpb
on 8 Sep 2023
Edited: dpb
on 9 Sep 2023
A. You can always compute a prediction outside the range the model was fitted on; how accurate it will be clearly depends on how accurate the model is to begin with, plus how well it predicts the response outside that range.
Clearly, if a sensor's response were purely linear over the entire range, it wouldn't matter; a straight line is a straight line. That is never the case in practice; how nonlinear the response is, and how well the fitted model holds up, depends entirely on how the predictions from the particular data and model compare with what the sensor output actually is for a given input. Higher-degree polynomials are particularly notorious for "blowing up" as the range gets larger. A quadratic term alone roughly doubles for every 1.4X increase in input; in other words, a 40% increase in T would about double that term's contribution to the predicted sensor output. Even a smaller step adds up: (38/32)^2 ≈ 1.41, so a 19% increase in input raises the quadratic term by about 41%. Remember that the slope magnitude of a parabola always increases as you move away from the vertex, whether it opens up or down.
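To make the arithmetic above concrete, here is a small Python sketch. The helper name `quadratic_growth` is purely illustrative, and it looks at a bare x^2 term only, not at any fitted sensor model.

```python
def quadratic_growth(x_old, x_new):
    """Factor by which a pure x**2 term grows when the input moves from x_old to x_new."""
    return (x_new / x_old) ** 2

# 38/32 is roughly a 19% increase in the input
print(round(quadratic_growth(32, 38), 2))    # 1.41

# a 40% increase in the input
print(round(quadratic_growth(1.0, 1.4), 2))  # 1.96
```

The second case is the "2X for every 1.4X" rule of thumb: 1.4 is close to the square root of 2, so the quadratic term nearly doubles.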
B. You clearly can't measure every single combination of all parameters; that's not what experiment design is about. You should, however, cover the RANGE of every parameter, over the combinations that can actually exist jointly. Picking that set of points is the subject of experiment design; one method that has generally been found helpful in fitting quadratic response surface models is the central composite design. Again, I recommend Box, Hunter and Hunter as essential background reading to get an idea of the issues and of the techniques designed to avoid the pitfalls.
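As a rough illustration of what a central composite design looks like, the Python sketch below builds the coded points (corner, axial, and center runs) for three factors and scales them to physical units. Everything specific here is an assumption for the example: the three factors (temperature, humidity, reference NO2 concentration), their ranges, the rotatable choice of alpha, and the helper names. The scaling puts the axial points on the range limits so that no run falls outside the stated range.

```python
from itertools import product

def rotatable_alpha(k):
    """Axial distance that makes a full-factorial CCD in k factors rotatable."""
    return (2 ** k) ** 0.25

def central_composite(k, alpha):
    """Coded design points: 2**k corners, 2*k axial points, and one center point."""
    corners = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)
    return corners + axial + [[0.0] * k]

def to_physical(point, ranges, alpha):
    """Scale a coded point so that +/-alpha lands on the low/high limit of each factor."""
    scaled = []
    for coded, (low, high) in zip(point, ranges):
        mid = (low + high) / 2
        half = (high - low) / 2
        scaled.append(mid + coded / alpha * half)
    return scaled

# Assumed ranges, for illustration only: temperature (deg C), humidity (%RH), NO2 (ppb)
ranges = [(0.0, 40.0), (20.0, 90.0), (0.0, 200.0)]
alpha = rotatable_alpha(3)
design = [to_physical(p, ranges, alpha) for p in central_composite(3, alpha)]

print(len(design))  # 15 runs: 8 corner + 6 axial + 1 center
for run in design:
    print([round(v, 1) for v in run])
```

Any run that describes a combination that cannot physically occur together would have to be dropped or moved, which is the "exist jointly" caveat above.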
Answers (1)
Kaustab Pal
on 19 Aug 2024
For the model to work well, it needs to see inputs that are similar to what it saw during training. For example, if the input and output had a linear relationship during training, the model will do well if this relationship stays the same. But if the relationship changes to something like exponential during testing, the model might not perform well, and you'll need to retrain it.
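One simple way to spot inputs the model has not seen is to compare each new sample against the per-feature range of the training data. The Python sketch below does that; the column order (voltage, temperature, humidity), the sample values, and the function names are all made up for illustration.

```python
def training_ranges(rows):
    """Per-column (min, max) over the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def out_of_range(sample, ranges):
    """Indices of the features in `sample` that fall outside the training range."""
    return [
        i
        for i, (value, (low, high)) in enumerate(zip(sample, ranges))
        if value < low or value > high
    ]

# Hypothetical training rows: [voltage (V), temperature (deg C), humidity (%RH)]
train = [
    [0.42, 18.0, 45.0],
    [0.47, 22.5, 50.0],
    [0.51, 27.0, 62.0],
    [0.44, 20.0, 38.0],
]
ranges = training_ranges(train)

print(out_of_range([0.45, 21.0, 48.0], ranges))  # [] -> every feature inside the training range
print(out_of_range([0.45, 33.0, 70.0], ranges))  # [1, 2] -> temperature and humidity are new territory
```

Note that passing this check only means each feature is individually in range; a new combination of in-range values can still be something the model never saw.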
To make your model more accurate, try to gather a large dataset that covers the different input-output relationships and input combinations the sensor is likely to encounter. You can also improve the model by updating it regularly with new data it hasn't seen before.
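If a reference instrument stays available, one way to decide when an update is due is to track the recent bias between the model's predictions and the reference. The Python sketch below is only an assumption about how that could look: the class name `DriftMonitor`, the window length, and the threshold are illustrative and would need tuning for a real sensor.

```python
from collections import deque

class DriftMonitor:
    """Flags a retrain when the mean residual over a full window exceeds a threshold."""

    def __init__(self, window=24, threshold=5.0):
        self.residuals = deque(maxlen=window)  # keeps only the latest `window` residuals
        self.threshold = threshold             # same units as the prediction, e.g. ppb

    def update(self, predicted, reference):
        """Record one prediction/reference pair; return True if retraining looks due."""
        self.residuals.append(predicted - reference)
        if len(self.residuals) < self.residuals.maxlen:
            return False  # not enough history yet
        mean_bias = sum(self.residuals) / len(self.residuals)
        return abs(mean_bias) > self.threshold

# Hypothetical readings in ppb: (model prediction, reference value)
monitor = DriftMonitor(window=3, threshold=2.0)
readings = [(10.0, 10.0), (11.0, 10.0), (12.0, 10.0), (15.0, 10.0)]
flags = [monitor.update(pred, ref) for pred, ref in readings]
print(flags)  # [False, False, False, True]
```

Using the mean of the residuals rather than a single reading keeps one noisy sample from triggering a retrain.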
The way you input temperature data can also affect how well the model works. It's important to use the same method of representing temperature both when training the model and when using it to make predictions. This consistency is key to maintaining accuracy.
I hope this helps clear things up!