Finding fit parameters for x,y data of a lognormal cdf

Hi,
I have x, y vector data where x = some independent variable of interest and y = cumulative probability. I know the resulting curve represents a lognormal cdf but I'm having trouble finding a way to find the location and scale parameters that correspond to it.
My initial thought was to simply take the cdf, convert it to a pdf by taking p(ii) = y(ii+1) - y(ii), and then use the frequency option of lognfit to find the parameters. I do not get the correct result from this though and was wondering if anyone else had any ideas. Example code is below. Thanks!
X = 1:200;
Y = logncdf(X,4.5,0.1); %4.5 and 0.1 are just for illustration, in reality I don't know these parameters.
for ii = 1:length(X)-1
P(ii) = Y(ii+1)-Y(ii);
end
P(200) = 1 - Y(end);
fit = lognfit(X,[],[],P)
The location parameter I get from this example is correct, but the scale parameter is 0.
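One likely reason the frequency route misbehaves: the freq argument of lognfit is documented as a vector of integer counts, whereas P here holds fractions that sum to about 1. A minimal sketch of a workaround, assuming the y values really are samples of a lognormal cdf on a reasonably fine grid, is to skip lognfit and compute the probability-weighted mean and standard deviation of log(x) directly. The bin midpoints (Xmid) and normalized weights (w) are illustrative choices, not part of the original code.

```matlab
% Sketch: recover mu and sigma from a sampled lognormal cdf by
% differencing it into bin probabilities and taking weighted moments
% of log(x). Xmid and w are illustrative names (assumptions).
X = 1:200;
Y = logncdf(X, 4.5, 0.1);                 % stand-in for the measured cdf

P    = diff(Y);                           % probability mass in each (X(i), X(i+1)]
Xmid = (X(1:end-1) + X(2:end)) / 2;       % represent each bin by its midpoint
w    = P / sum(P);                        % normalize in case the tails are cut off

mu    = sum(w .* log(Xmid));              % weighted mean of log(x)
sigma = sqrt(sum(w .* (log(Xmid) - mu).^2));  % weighted std of log(x)

fprintf('mu = %.4f, sigma = %.4f\n', mu, sigma);
```

This only works well when the grid covers nearly all of the probability mass and the bins are narrow compared with the spread of the distribution. For coarse or truncated data, a least-squares fit of the cdf is the safer choice.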

 Accepted Answer

I think I answered my own question by going a different route. See code below.
X = 1:200;
Y = logncdf(X,4.5,0.1);
func = @(fit,xdata)logncdf(xdata,fit(1),fit(2));
fit = lsqcurvefit(func,[4 0.3],X,Y)
This gives me fit parameters of
fit =
4.5000 0.1000
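The starting point [4 0.3] above was picked by hand. As a hedged sketch of how start values could be taken from the data instead: for a lognormal cdf, norminv(y) = (log(x) - mu)/sigma, so a straight-line fit of norminv(y) against log(x) gives rough estimates of both parameters. The variable names (ok, z, c, mu0, sigma0), the 1e-6 cutoffs, and the lower bound on sigma are assumptions made for illustration.

```matlab
% Sketch: derive start values from a probit linearization, then refine
% with lsqcurvefit. Names and cutoffs here are illustrative assumptions.
X = 1:200;
Y = logncdf(X, 4.5, 0.1);                 % stand-in for the measured cdf

ok = Y > 1e-6 & Y < 1 - 1e-6;             % norminv is infinite at 0 and 1
z  = norminv(Y(ok));                      % z = (log(x) - mu) / sigma
c  = polyfit(log(X(ok)), z, 1);           % c(1) = 1/sigma, c(2) = -mu/sigma

sigma0 = 1 / c(1);
mu0    = -c(2) / c(1);

func = @(b, xdata) logncdf(xdata, b(1), b(2));
opts = optimoptions('lsqcurvefit', 'Display', 'off');
% Keep sigma strictly positive during the search.
fit  = lsqcurvefit(func, [mu0 sigma0], X, Y, [-Inf 1e-6], [Inf Inf], opts);
```

On noisy data the straight-line fit is dominated by the tails, so it is best treated as a source of start values only, with the nonlinear fit providing the final estimates.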

4 Comments

I don’t have the Curve Fitting Toolbox (redundant for my purposes with the Statistics and Optimization Toolboxes), so I can’t run your code and don’t understand what ‘func’ does.
It seems to me though that you’re fitting a function to itself, an operation guaranteed to succeed, especially if you choose initial parameters close to what you know them to be.
I’m not certain you’re actually any farther ahead, but then I don’t have your data, only your description of it.
func just defines a custom model function. Since I know the data follow a lognormal cdf, that model is simply logncdf itself. The initial guesses happen to be close in the example I used, but I can always take the log of the median value as a reasonable estimate for the location. If you look at the figure I posted above, hopefully this makes more sense now. I tried it on my actual data and it worked perfectly. I still feel like this is a very trivial problem that I'm making harder than it should be, but either way I have something that works for me.
You already accepted your own answer, so I’ll delete mine.
It isn’t as difficult a problem as you’re making it. In fact, you’re using the wrong approach. You need to understand the lognormal distribution, then the solution is straightforward:
D = load('David SampleData.mat');
x = D.X;
p = D.P;
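prms = @(b,x) logncdf(x,b(1),b(2));                      % model: lognormal cdf, b = [mu sigma]
init_prms = interp1(p, x, [0.5 0.025]);                  % x at the median and at the 2.5th percentile
B0 = [log(init_prms(1)) -diff(init_prms)/init_prms(1)];  % start values: log(median), relative spread
B = nlinfit(x,p,prms,B0);                                % fitted parameters: B(1) = mu, B(2) = sigma
lncdfit = prms(B,x);                                     % fitted cdf evaluated at the data points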
figure(1)
plot(x, p, 'bp')
hold on
plot(x, lncdfit, '-r')
hold off
grid
produces a plot of the data (blue pentagrams) with the fitted lognormal cdf overlaid (red line).
I clearly haven't understood this well enough, since I have to ask: where do I get the goodness of fit and the fit parameters from this?
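In the code above, the fit parameters are the vector B returned by nlinfit (B(1) is the location, B(2) the scale). As a hedged sketch of one way to get goodness-of-fit numbers, nlinfit can also return the residuals, the parameter covariance and the mean squared error, and nlparci turns those into confidence intervals. The sample file is not available here, so the data below are synthetic, and the noise level, grid, start values and the names model, R2 and RMSE are all assumptions.

```matlab
% Sketch: fit parameters and simple goodness-of-fit measures from nlinfit.
% Synthetic data stands in for the (unavailable) sample file.
rng(0);                                        % reproducible noise
x = (40:2:160)';
p = logncdf(x, 4.5, 0.1) + 0.005*randn(size(x));

model = @(b, x) logncdf(x, b(1), b(2));        % b = [mu sigma]
B0    = [log(median(x)) 0.2];                  % rough, hand-picked start values

% Extra outputs: residuals R, Jacobian J, covariance CovB, mean squared error MSE.
[B, R, J, CovB, MSE] = nlinfit(x, p, model, B0);

ci   = nlparci(B, R, 'covar', CovB);           % 95% confidence intervals, one row per parameter
SSE  = sum(R.^2);                              % residual sum of squares
SST  = sum((p - mean(p)).^2);                  % total sum of squares
R2   = 1 - SSE/SST;                            % coefficient of determination
RMSE = sqrt(MSE);                              % root mean squared error

fprintf('mu    = %.4f  [%.4f, %.4f]\n', B(1), ci(1,1), ci(1,2));
fprintf('sigma = %.4f  [%.4f, %.4f]\n', B(2), ci(2,1), ci(2,2));
fprintf('R^2 = %.5f, RMSE = %.5f\n', R2, RMSE);
```

R^2 for a nonlinear fit is only a rough descriptive measure. The confidence intervals and a plot of the residuals against x usually say more about whether the lognormal model is adequate.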


Asked: 26 Mar 2015

Commented: 10 Oct 2019
