Thread Subject: fitting the norm CDF

Subject: fitting the norm CDF

From: Pete sherer

Date: 7 Jan, 2008 21:30:21

Message: 1 of 5

I tried using the glmfit to estimate the parameters for the
normal distribution. However, the fit didn't look right. I
presume that I set up the GLMFIT wrongly. Any suggestions
would be appreciate it.
x =
[0.0246;0.04;0.0585;0.0801;0.0926;0.1019;0.1083;0.1178;0.1243;0.1277;0.1342;0.1376;0.1441;0.1509;0.1544;0.1642;0.1709;0.1806;0.1874;0.194;0.1975;0.204;0.2075;0.217;0.2234;0.2452;0.2577;0.2701;0.2887;0.3103;0.3318;0.3534;0.3749;];
y =
[0.0045;0.0045;0.009;0.0224;0.0404;0.0583;0.0852;0.1166;0.148;0.1883;0.2242;0.2601;0.3004;0.3677;0.4126;0.4843;0.5381;0.5964;0.6592;0.7175;0.7623;0.8027;0.843;0.8744;0.9013;0.9372;0.9552;0.9686;0.9821;0.991;1;1;1;];

b = glmfit( x, y, 'normal');
refx = linspace( min(x), max(x), 50);
Xb = b(1)+ b(2).*refx;
yfit = normcdf( Xb, 0, 1); % Standardized Normal CDF

figure
hold on
plot( x, y, 'ob')
th = plot( refx, yfit, '--r', 'linewidth', 2.5);

box on
hold off

Subject: fitting the norm CDF

From: Tom Lane

Date: 7 Jan, 2008 22:00:57

Message: 2 of 5

>I tried using the glmfit to estimate the parameters for the
> normal distribution. However, the fit didn't look right. I
> presume that I set up the GLMFIT wrongly. Any suggestions
> would be appreciate it.
> x =
> [0.0246;0.04;0.0585;0.0801;0.0926;0.1019;0.1083;0.1178;0.1243;0.1277;0.1342;0.1376;0.1441;0.1509;0.1544;0.1642;0.1709;0.1806;0.1874;0.194;0.1975;0.204;0.2075;0.217;0.2234;0.2452;0.2577;0.2701;0.2887;0.3103;0.3318;0.3534;0.3749;];
> y =
> [0.0045;0.0045;0.009;0.0224;0.0404;0.0583;0.0852;0.1166;0.148;0.1883;0.2242;0.2601;0.3004;0.3677;0.4126;0.4843;0.5381;0.5964;0.6592;0.7175;0.7623;0.8027;0.843;0.8744;0.9013;0.9372;0.9552;0.9686;0.9821;0.991;1;1;1;];

Pete, the problem as you set it up estimates the parameters for a linear fit
of y as a function of x, under the assumption that the errors in y come from
a normal distribution.

When I look at a plot of your raw data, the relationship is not linear. It
appears to resemble the situation where y is the proportion of responses of
some sort (call these "successes") for various x values. That proportion
generally seems to follow the shape of a normal cdf. Is that accurate?

If it is, you need to enter your response Y as a two-column matrix with each
row giving the number of successes and the number of trials for one
observation. For the purposes of the following, I set the number of trials
to 100 for each observation, and I used your y to compute the number of
responses. It turns out glmfit doesn't force this number to be an integer;
maybe non-integer values make sense in some situations. The "probit" option
requests a curve in the form of a normal cdf.

Try this and see if it does what you need:

Y = [y*100, repmat(100,size(y))];
b = glmfit( x, Y, 'binomial','link','probit');
refx = linspace( min(x), max(x), 50);
Xb = b(1)+ b(2).*refx;
yfit = glmval(b,refx,'probit');
figure
hold on
plot( x, y, 'ob')
th = plot( refx, yfit, '--r', 'linewidth', 2.5);
box on
hold off

-- Tom

Subject: fitting the norm CDF

From: Pete sherer

Date: 7 Jan, 2008 22:39:02

Message: 3 of 5

"Tom Lane" <tlane@mathworks.com> wrote in message
<flu7ep$hpl$1@fred.mathworks.com>...
> >I tried using the glmfit to estimate the parameters for the
> > normal distribution. However, the fit didn't look right. I
> > presume that I set up the GLMFIT wrongly. Any suggestions
> > would be appreciate it.
> > x =
> >
[0.0246;0.04;0.0585;0.0801;0.0926;0.1019;0.1083;0.1178;0.1243;0.1277;0.1342;0.1376;0.1441;0.1509;0.1544;0.1642;0.1709;0.1806;0.1874;0.194;0.1975;0.204;0.2075;0.217;0.2234;0.2452;0.2577;0.2701;0.2887;0.3103;0.3318;0.3534;0.3749;];
> > y =
> >
[0.0045;0.0045;0.009;0.0224;0.0404;0.0583;0.0852;0.1166;0.148;0.1883;0.2242;0.2601;0.3004;0.3677;0.4126;0.4843;0.5381;0.5964;0.6592;0.7175;0.7623;0.8027;0.843;0.8744;0.9013;0.9372;0.9552;0.9686;0.9821;0.991;1;1;1;];
>
> Pete, the problem as you set it up estimates the
parameters for a linear fit
> of y as a function of x, under the assumption that the
errors in y come from
> a normal distribution.
>
> When I look at a plot of your raw data, the relationship
is not linear. It
> appears to resemble the situation where y is the
proportion of responses of
> some sort (call these "successes") for various x values.
That proportion
> generally seems to follow the shape of a normal cdf. Is
that accurate?
>
> If it is, you need to enter your response Y as a
two-column matrix with each
> row giving the number of successes and the number of
trials for one
> observation. For the purposes of the following, I set the
number of trials
> to 100 for each observation, and I used your y to compute
the number of
> responses. It turns out glmfit doesn't force this number
to be an integer;
> maybe non-integer values make sense in some situations.
The "probit" option
> requests a curve in the form of a normal cdf.
>
> Try this and see if it does what you need:
>
> Y = [y*100, repmat(100,size(y))];
> b = glmfit( x, Y, 'binomial','link','probit');
> refx = linspace( min(x), max(x), 50);
> Xb = b(1)+ b(2).*refx;
> yfit = glmval(b,refx,'probit');
> figure
> hold on
> plot( x, y, 'ob')
> th = plot( refx, yfit, '--r', 'linewidth', 2.5);
> box on
> hold off
>
> -- Tom
>
>

Hi Tom,

I tried the code you sent. However, I received the following
error message:

??? Error using ==> glmfit at 154
Y must contain values in the interval [0,N] for the binomial
distribution.

I would appreciate your suggestion.

P

Subject: fitting the norm CDF

From: Pete sherer

Date: 7 Jan, 2008 22:48:02

Message: 4 of 5

It seems to work now by using
Y = round(Y);

One more question is how I can can transform the b (fitted
parameters) into the estimated mean and dispersion of the
normal distribution. From glmfit I received
b(1) = -3.6396 and
b(2) = 21.6186;

Thanks a lot.

P

Subject: fitting the norm CDF

From: Tom Lane

Date: 9 Jan, 2008 20:08:41

Message: 5 of 5

> One more question is how I can can transform the b (fitted
> parameters) into the estimated mean and dispersion of the
> normal distribution. From glmfit I received
> b(1) = -3.6396 and
> b(2) = 21.6186;

Pete, the model being fit is

   p = normcdf( b(1) + b(2)*x )
      = normcdf( (x - (-b(1)/b(2)) )/ (1/b(2)) )

so -b(1)/b(2) and 1/b(2) are I think what you are trying to get. Try this
after the code you have so far:

line(refx,normcdf(refx, -b(1)/b(2), 1/b(2)),'color','g')

-- Tom

Tags for this Thread

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

rssFeed for this Thread

Contact us at files@mathworks.com