How to fit a curve to the first part of some data points using the normal distribution function?

Hello
I was wondering how to fit a curve to some data points using the normal distribution function. Assume that we already know that:
1- We only have the first part of the data points.
2- The rest of the data points will follow the pattern of a normal distribution function.
So for example:
X = [1 2 3 4 5 6 7 8 9];
Y = [0 0.5 0.9 3 3.1 5.2 5.8 5.9 6.5];
Curve_Estimation = smooth (X, Y); % Just an estimate to show the possible shape of the distribution
plot (X, Y, X, Curve_Estimation, 'linewidth' , 2);
In other words, the question is how to predict the changes in data points?
Thank you
  3 Comments
behzad on 18 Mar 2020
Thank you for the response.
Since I am not familiar with the "regression" method, I do not know its capabilities. I chose the normal distribution idea because I am certain that the incoming data and the whole "change pattern" will fit a "Cumulative Distribution Function". I should say that I want to "predict" the change in the incoming data in the Y array. (I have at least half of the data.)
I don't know whether I am wrong, but I imagine that, for predicting deviations, the regression method acts as a near-linear tool. So I started with the normal distribution function, which was the first idea that came to mind, based on the fact that the change pattern would be very similar to a CDF.
To clarify the question, I added some lines to the code:
X = [1 2 3 4 5 6 7 8 9];
Y = [0 0.5 0.9 3 3.1 5.2 5.8 5.9 6.5];
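mu = 8.5;      % guessed mean (named mu so it does not shadow the built-in mean function)
sigma = 3.6;   % guessed standard deviation
XX = 1:2*mu;
CDF = normcdf(XX,mu,sigma);   % normcdf needs the Statistics and Machine Learning Toolbox
plot(X,Y,'bo',XX,15*CDF,'k','linewidth',2);grid;xlabel('X');ylabel('CDF');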
legend('Real Value','Predicted Curve?')
As you can see, I changed the values of the mean, the standard deviation, and a multiplier coefficient on the CDF to best fit the curve to the data. I am certain that by using the CDF concept I can get an acceptable prediction, but I am not sure that I am asking the right question.
Thanks again
J. Alex Lee on 18 Mar 2020
If I understand correctly, you indeed have a fitting (regression) problem. You want to fit the normal CDF
$F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right]$
to your (x,y) data. However, your y data are not a properly normalized CDF (you have values > 1), so you will need an additional constant in front.
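As a rough sketch of what such a fit could look like (not necessarily the best approach), the scaled normal CDF can be fitted by least squares with fminsearch. The names model, sse, p0 and pFit are made up for this illustration, the CDF is written with erf so that no toolbox is required, and the starting guess simply reuses the hand-tuned values from the previous comment:

```matlab
% Sample data from the question
X = [1 2 3 4 5 6 7 8 9];
Y = [0 0.5 0.9 3 3.1 5.2 5.8 5.9 6.5];

% Scaled normal CDF written with erf, so no toolbox is needed.
% p = [A, mu, sigma]; abs() keeps the effective sigma positive during the search.
model = @(p, x) p(1) * 0.5 * (1 + erf((x - p(2)) ./ (abs(p(3)) * sqrt(2))));

% Sum of squared residuals between the model and the data
sse = @(p) sum((Y - model(p, X)).^2);

% Starting guess: the hand-tuned values from the comment above (an assumption)
p0 = [15, 8.5, 3.6];
pFit = fminsearch(sse, p0);

% Plot the data and the fitted curve, extended past the observed range
XX = linspace(1, 17, 200);
plot(X, Y, 'bo', XX, model(pFit, XX), 'k', 'linewidth', 2); grid on
xlabel('X'); ylabel('Y');
legend('Data', 'Fitted scaled CDF', 'Location', 'northwest')
fprintf('A = %.3g, mu = %.3g, sigma = %.3g, SSE = %.3g\n', ...
    pFit(1), pFit(2), abs(pFit(3)), sse(pFit));
```

fminsearch is a local, derivative-free search, so the result can depend on the starting guess, and it may stop at its default evaluation limit before fully converging.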
Based on your sample data, I don't think this strategy will give you very good results. Since your data don't reach the plateau and look noisy, there will probably be many combinations of the scaling constant and the other parameters that give roughly the same quality of fit. In general, I'm not sure it makes sense to pose a distribution fitting problem in terms of an only partially supplied distribution. It's asking to infer statistics from an insanely biased sample set (find the actual mean of this normally distributed data, but I'm only going to give you data where the values are all less than the mean: impossible?).
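One way to check this concern on the sample data is to hold the scaling constant fixed at a few trial values and refit only the mean and standard deviation each time. This is only a sketch: the trial values in Atrial and the starting guess [6, 3] are arbitrary choices made for illustration, and model, sse and results are names invented here.

```matlab
% Sample data from the question
X = [1 2 3 4 5 6 7 8 9];
Y = [0 0.5 0.9 3 3.1 5.2 5.8 5.9 6.5];

% Scaled normal CDF via erf; A is the plateau height, q = [mu, sigma]
model = @(A, q, x) A * 0.5 * (1 + erf((x - q(1)) ./ (abs(q(2)) * sqrt(2))));

% Trial plateau heights (arbitrary illustrative values)
Atrial = [8 10 15 20];
results = zeros(numel(Atrial), 4);   % columns: A, mu, sigma, SSE

for k = 1:numel(Atrial)
    A = Atrial(k);
    % Sum of squared residuals with the plateau held fixed at A
    sse = @(q) sum((Y - model(A, q, X)).^2);
    % Refit only mu and sigma
    [q, fval] = fminsearch(sse, [6, 3]);
    results(k, :) = [A, q(1), abs(q(2)), fval];
end

disp('       A        mu     sigma       SSE')
disp(results)
```

If the SSE column barely changes while A and mu change a lot, the data do not pin down the plateau, and any prediction beyond the observed range is mostly determined by the assumed constant.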


Answers (0)
