Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
fitting a generalized pareto

Subject: fitting a generalized pareto

From: Bart

Date: 14 Jul, 2009 14:38:03

Message: 1 of 6

Hello,
I am looking for a function in order to fit a generalized pareto distribution with censored information.
The matlab statistics tollbox does not offer this option in the gpfit function.
Can anyone suggest another toolbox which provides this functionality?

Thanks,

Bart Hamers

Subject: fitting a generalized pareto

From: Peter Perkins

Date: 14 Jul, 2009 17:38:12

Message: 2 of 6

Bart wrote:
> Hello,
> I am looking for a function in order to fit a generalized pareto distribution with censored information.
> The matlab statistics tollbox does not offer this option in the gpfit function.
> Can anyone suggest another toolbox which provides this functionality?

In theory, you can do this using the GPPDF, GPCDF, and MLE functions, similarly to what's described in this demo:

<http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/customdist2demo.html>

In practice, though, I have no idea what the properties of such a maximum likelihood estimate would be, or the numerical challenges in maximizing this particular likelihood.

Hope this helps.

Subject: fitting a generalized pareto

From: TideMan

Date: 14 Jul, 2009 21:33:09

Message: 3 of 6

On Jul 15, 2:38 am, "Bart" <bart.ham...@dexiaREMOVETHIS.com> wrote:
> Hello,
> I am looking for a function in order to fit a generalized pareto distribution with censored information.
> The matlab statistics tollbox does not offer this option in the gpfit function.
> Can anyone suggest another toolbox which provides this functionality?
>
> Thanks,
>
> Bart Hamers

Have you tried the (free) EVIM toolbox? EVIM = Extreme Value analysis
In Matlab. Google it.

Subject: fitting a generalized pareto

From: Bart

Date: 15 Jul, 2009 11:29:01

Message: 4 of 6

I tried to follow the advice as given in the link. Similar is in for the extreme value distribution, I made a new logpdf and logsurvival for the generalized pareto.:

gplogpdf = @(x,shape,sigma) (-1-1/shape).*(log(sigma+shape*(x))-2*log(sigma));
gplogsf = @(x,shape,sigma) (-1/shape).*(log(sigma+shape*(x))-log(sigma));
try
[paramEsts,paramCIs] = mle(TV, 'censoring',TC, 'logpdf',gplogpdf, 'logsf',gplogsf,'start', parmhat, 'lower',[-Inf,0])
catch ME
disp(ME.message)
end


I get the error:
"The LOGSF function returned positive values."

I have no idea how to avoid it? It occurs since the optimization algorithm want to put the 'shape' parameter below zero. This however gives an error.
Limiting the search area for the shape parameter only above zero, gives a non-optimal convergences and therefor a bad fit.

Any idea how to get aroud this problem?

Thanks,

Bart

Subject: fitting a generalized pareto

From: Bart

Date: 15 Jul, 2009 11:42:01

Message: 5 of 6

TideMan <mulgor@gmail.com> wrote in message <4413d616-cb6a-4ac7-a748-3a14e71ce050@h11g2000yqb.googlegroups.com>...
> On Jul 15, 2:38?am, "Bart" <bart.ham...@dexiaREMOVETHIS.com> wrote:
> > Hello,
> > I am looking for a function in order to fit a generalized pareto distribution with censored information.
> > The matlab statistics tollbox does not offer this option in the gpfit function.
> > Can anyone suggest another toolbox which provides this functionality?
> >
> > Thanks,
> >
> > Bart Hamers
>
> Have you tried the (free) EVIM toolbox? EVIM = Extreme Value analysis
> In Matlab. Google it.


EVIM also does nothave the functionality to have censored data.

Bart

Subject: fitting a generalized pareto

From: Peter Perkins

Date: 16 Jul, 2009 15:27:57

Message: 6 of 6

Bart wrote:
> I tried to follow the advice as given in the link. Similar is in for the extreme value distribution, I made a new logpdf and logsurvival for the generalized pareto.:
>
> gplogpdf = @(x,shape,sigma) (-1-1/shape).*(log(sigma+shape*(x))-2*log(sigma));
> gplogsf = @(x,shape,sigma) (-1/shape).*(log(sigma+shape*(x))-log(sigma));
> try
> [paramEsts,paramCIs] = mle(TV, 'censoring',TC, 'logpdf',gplogpdf, 'logsf',gplogsf,'start', parmhat, 'lower',[-Inf,0])
> catch ME
> disp(ME.message)
> end
>
>
> I get the error:
> "The LOGSF function returned positive values."

Your logpdf function seems to be wrong, but didn't any time figuring out why. It's unclear to me why your logSF function would be returning a positive value, obviously if written safely, it should not ever do that.

I recommend using the GPPDF and GPCDF functions in the Statistics Toolbox as starting points to get your two functions right. Given that, this:

k0 = -.1; sigma0 = 2;

x = gprnd(k0,sigma0,0,100,1);
t = gprnd(k0,sigma0,0,100,1); % censoring times
c = (t <= x); x(c) = t(c); % 50% censoring

mle(x,'logpdf',@gplogpdf, 'logsf',@gplogsf, ...
      'start',[k0,sigma0], 'lowerBound',[-1,0], 'upperBound',[1,Inf], ...
      'censoring',c)

seems to work with these simple data and good starting value, for a range of k0 between -1 and 1. Of course, these are simple data with an easy way to get a good starting point.

There are potential numerical issues in this optimization, including the fact that the support of this distribution changes with k and sigma. In practice, that may or may not be a problem, depending on the data. But the symptom would presumably be that the logPDF or the logSF returns a -Inf because the optimization has stepped outside of the possible values of sigma/k. To check that, write functions that check the support condition (i.e., use tghe guts of GPPDF and GPCDF), and set a break point in them to see what's happening.

Since MLE only handles box constraints, you might do one of these two things to deal with the varying support:

1) Reparameterize from (sigma, k) to (sigma, -sigma/k), and enforce the conditions sigma>0, -sigma/k>max(x). A potential problem might be that the lower bound on -sigma/k is only needed when k < 0, and using it prevents a positive estimate for k. There may be some other parameterization that gives box constraints but doesn't have this probelm, I don't know.

2) Use FMINCON from the Optimization Toolbox instead of MLE, and provide a nonlinear constraint to enforce the support conditions directly on k and sigma.

Hope this helps.

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us