MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

### Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

# Problem in plotting confidence interval in a probability plot

Asked by arif rabbani on 5 May 2013

i have performed extreme distribution to a data set.I intended to draw 95% confidence interval in the probability plot.The following codding was used:

WL= [ 6.79 6.89 6.38 5.67 6.91 7.66 6.41 6.04 6.41 5.55 6.93 5.67 6.69 6.51 6.32 7.33 6.43 6.52 6.13 6.72 6.7 7.78 6.18 5.41 6.94 6.51 6.02 5.94 6.08 ]; % observed water level data pd = fitdist(WL','ev'); % Creates a ProbDistUnivParam object by fitting the data to a extreme value distribution. ci = paramci(pd); % This function calculates the values of the parameters based on a certain confidence interval. Here the by default the confidence interval is 95 percent probplot(WL); h=probplot(gca,@(WL,x,y)evcdf(WL,x,y),[ci(1,1),ci(1,2)]); set(h,'color','r','linestyle','-') t= probplot(gca,@(WL,x,y)evcdf(WL,x,y),[ci(2,1),ci(2,2)]); set(t,'color','g','linestyle','-.')

But there is a problem in confidence interval line(green one) which deviated in a wrong way.there might be some error in ci(confidence interval) boundary but i could not fix it.please help me out.

## Products

No products are associated with this question.

Answer by José-Luis on 6 May 2013
Edited by José-Luis on 7 May 2013

I thing there are two problems with your approach. One is that you could simplify your code like this:

``` WL= [ 6.79 6.89 6.38 5.67 6.91 7.66 6.41 6.04 6.41 5.55 6.93 5.67 6.69 6.51 6.32 7.33 6.43 6.52 6.13 6.72 6.7 7.78 6.18 5.41 6.94 6.51 6.02 5.94 6.08 ];
% observed water level data
pd = fitdist(WL','ev');
% Creates a ProbDistUnivParam object by fitting the data to a extreme value distribution.
ci = paramci(pd);
% This function calculates the values of the parameters based on a certain confidence interval. Here the by default the confidence interval is 95 percent
probplot('extreme value' , WL);
h=probplot(gca,@evcdf,[ci(1,1),ci(1,2)]);
set(h,'color','r','linestyle','-')
t= probplot(gca,@evcdf,[ci(2,1),ci(2,2)]);
set(t,'color','g','linestyle','-.')```

Note that probplot is by default a normal distribution. You should specify whenever you use a different distribution. Secondly, what makes you think that there is an error with what you did? This is not curve fitting, you are comparing cdfs. There is no reason why the parameters you obtained from ci will give you two cdf's "above" and "below" the best fit cdf. You probably expect that from vanilla curve fitting. But these are not the "curves" you fitted to, these are cdf's.

Please take a look at the cdf's, as they might help you understand better:

``` WL= [ 6.79 6.89 6.38 5.67 6.91 7.66 6.41 6.04 6.41 5.55 6.93 5.67 6.69 6.51 6.32 7.33 6.43 6.52 6.13 6.72 6.7 7.78 6.18 5.41 6.94 6.51 6.02 5.94 6.08 ];
% observed water level data
pd = fitdist(WL','ev');
% Creates a ProbDistUnivParam object by fitting the data to a extreme value distribution.
ci = paramci(pd);
% This function calculates the values of the parameters based on a certain confidence interval. Here the by default the confidence interval is 95 percent
evBest = @(x) evcdf(x,pd.Params(1),pd.Params(2));
evLow =  @(x) evcdf(x,ci(1,1),ci(1,2));
evHigh =  @(x) evcdf(x,ci(2,1),ci(2,2));```
``` fplot(evBest,[0 10],'k--');
hold on
fplot(evLow,[0 10],'r-');
fplot(evHigh,[0 10],'b-');```

Tom Lane on 10 May 2013

This approach, while a good idea, requires that the lower and upper limits of the cdf be calculated from the paired lower and upper limits of the parameters. I don't believe that would work in general. It might take some figuring out to determine how to convert the parameter bounds into cdf bounds.

Answer by Tom Lane on 10 May 2013

You are right that probplot has no feature for including confidence bounds.

The cdf method for the probability distribution object, I believe, can return lower and upper bounds as its second and third outputs for some distributions. It appears not to be documented, but is visible in generated code from dfittool so I would expect it to be supported.

Try this:

```x = evrnd(10,2,40,1);
p = fitdist(x,'ev')
xx = linspace(icdf(p,.001),icdf(p,.999));
n = length(x);
[f,flo,fhi] = cdf(p,xx);
plot(sort(x),icdf(p,(1:n)/(n+1)),'rx',xx,icdf(p,f),...
'b-',xx,icdf(p,flo),'r:',xx,icdf(p,fhi),'r:')
```

This code computes bounds on the cdf, then converts the cdf using the inverse cdf so the fit plots as a straight line. The resulting plot shows the observed points as red X's and the fitted quantiles as the blue line. This is basically the probability plot, except that the y axis is labeled with quantile values instead of probability values. Perhaps this will be adequate. You could try to label the y axis with the probability values if that is important.