## Normalizing a histogram

on 31 Mar 2012

### Wayne King

Hello,

I've plotted a histogram of some data. Here it is http://dl.dropbox.com/u/54057365/All/departure%20time.JPG

How can I remove the gaps between the bars? Should I be using a histogram at all? And how can you normalize the measurements on the y-axis in a histogram?

Many thanks

```
DepartureTimes = load('Departure Times (hr).txt');
h = hist(DepartureTimes,24);   % counts in 24 equally spaced bins
h = h/sum(h);                  % relative frequencies (heights sum to 1)
bar(h, 'DisplayName', 'Departure Times');
legend('show');
xlim([5 25])
```

### Jan Simon

on 11 Jul 2013

[lost image] As expected, the image has been deleted from the server, so the question has lost its meaning.

Please, TMW, add a service to host images on the Answers servers soon. Otherwise the quality of this forum as a database of solutions will suffer from the implicit expiration of the linked images.

### Wayne King

on 31 Mar 2012

Hi John, if you type "help hist", you'll find information about specifying the bar centers. This implicitly controls the width of the bins that the bars cover.

If you want to change the gap between the bars, see "help hist" for information about returning the bar heights instead of plotting them, and "help bar" for information about drawing bars and controlling the space between them.

For example:

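```
X = randn(1e3,1);   % made-up data
N = hist(X,22);     % return the counts in 22 bins instead of plotting
bar(N,1);           % width 1: adjacent bars touch, so no gaps
```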

As for whether a histogram is appropriate, or how to "normalize" it: can you be more specific? People generally plot a histogram in one of two ways:

1.) the raw frequency or count histogram

2.) a probability histogram (as you have almost done), so that they can overlay a PDF for comparison.

Here's an example of that (requires Statistics Toolbox):

```
Data = randn(1000,1);            % just making up some junk data
binWidth = 0.7;                  % this is the bin width
binCtrs = -3:binWidth:3;         % bin centers, depends on your data
n = length(Data);
counts = hist(Data,binCtrs);
prob = counts / (n * binWidth);  % density: the bar areas sum to 1
H = bar(binCtrs,prob,'hist');
set(H,'facecolor',[0.5 0.5 0.5]);
% get the N(0,1) pdf on a finer grid
hold on;
x = -3:.1:3;
y = normpdf(x,0,1);              % requires Statistics Toolbox
plot(x,y,'k','linewidth',2);
```
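To make the difference between the two normalizations concrete, here is a small sketch on made-up data. The bin range and width are assumptions chosen for illustration. Dividing by `n` alone gives bar heights that sum to 1, while dividing by `n * binWidth` gives bars whose total area is 1, which is what a PDF overlay needs.

```matlab
% Sketch: compare relative-frequency and density normalization.
rng(0);                                % fixed seed so the run is repeatable
data     = randn(1000,1);              % made-up data
binWidth = 0.5;                        % assumed bin width
binCtrs  = -4:binWidth:4;              % assumed range; adjust to your data
counts   = hist(data, binCtrs);        % values outside the range land in the end bins
n        = numel(data);

relFreq = counts / n;                  % bar heights sum to 1
density = counts / (n * binWidth);     % bar areas sum to 1

fprintf('sum of relative frequencies: %g\n', sum(relFreq));
fprintf('area under the density bars: %g\n', sum(density) * binWidth);
```

In MATLAB R2014b and later, `histogram(data,'Normalization','pdf')` should produce the density form directly.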

### John

on 31 Mar 2012

Hello Wayne,

Thank you for taking the time to respond in such detail. I will look up this help documentation.

To answer your question, I am trying to plot a probability histogram and overlay a PDF on it, probably a chi-squared distribution. This is the reason why I was trying to normalize the data, if that is even the correct thing to do?

Your example is for the probability histogram I believe. I'll try it now.

Thanks

John

### John

on 31 Mar 2012

I have determined that I require 22 bins, using Scott's rule. Is it necessary to specify the bin centers? How would you convert a number of bins to bin centers?
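On the bin-center question, one option is to let `hist` return the centers as its second output when it is given a number of bins. The sketch below uses made-up data, not the departure times. The manual calculation reflects my understanding of how `hist` spaces its bins between the minimum and maximum of the data.

```matlab
% Sketch: get bin centers from a bin count.
rng(1);
X     = sum(randn(1000,5).^2, 2);        % made-up data (chi-square, 5 dof)
nBins = 22;

[counts, binCtrs] = hist(X, nBins);      % second output holds the bin centers
binWidth = binCtrs(2) - binCtrs(1);      % bins are equally spaced

% The same centers computed by hand from equally spaced edges
edges      = linspace(min(X), max(X), nBins + 1);
manualCtrs = edges(1:end-1) + diff(edges)/2;

% Probability (density) histogram with touching bars
prob = counts / (numel(X) * binWidth);
bar(binCtrs, prob, 1);
```

With `binCtrs` and `binWidth` in hand, the rest of the probability-histogram example applies unchanged.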

### Wayne King

on 31 Mar 2012

Yes, you are doing the correct thing. You can use dfittool to fit a gamma distribution, which you can then use to estimate the parameters of a chi-square PDF.

dfittool will also overlay the fitted pdf on the data.

The alpha parameter in a gamma is dof/2 and the beta parameter is 2.

You can generate code for your fit from dfittool and export the fit to the workspace.
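The relationship between the two distributions can be checked numerically. This is a small sketch assuming the Statistics Toolbox is available: a chi-square PDF with `dof` degrees of freedom should coincide with a gamma PDF with shape `dof/2` and scale 2. The value of `dof` and the grid are arbitrary choices for illustration.

```matlab
% Sketch: chi-square(dof) versus gamma(shape = dof/2, scale = 2).
dof = 5;                               % arbitrary choice for illustration
x   = linspace(0.01, 20, 200);

yChi2  = chi2pdf(x, dof);              % requires Statistics Toolbox
yGamma = gampdf(x, dof/2, 2);          % shape dof/2, scale 2

maxDiff = max(abs(yChi2 - yGamma));
fprintf('largest difference between the two PDFs: %g\n', maxDiff);
```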

### John

on 31 Mar 2012

Oh, I tried using the dfittool earlier but there was no chi squared pdf.

I have done what you suggested. http://dl.dropbox.com/u/54057365/All/difit.JPG

I also generated an M-file: http://dl.dropbox.com/u/54057365/All/1.JPG

But I don't understand what you mean by "The alpha parameter in a gamma is dof/2 and the beta parameter is 2^(dof/2)".

Do you mean multiply the alpha by 2 to get the dof?

And what would I change in the code?

### Wayne King

on 31 Mar 2012

I'm saying that if you fit a gamma, you get an alpha and a beta parameter. You can use that to see if a chi-square is appropriate (and not a more general gamma) and if so, get the dof parameter.

For example:

```
R = chi2rnd(5,1e3,1); %chi-square 5 dof
phat = mle(R,'distribution','Gamma');
```

phat(1) is the alpha parameter, which is dof/2 for a chi-square.

Therefore

`round(2*phat(1))`

gives you an estimate of the dof parameter for a chi-square.

phat(2) should always be close to 2 (if it isn't, that is an indication that a chi-square is not a good fit).

You can also use fitdist()

`    pd = fitdist(R,'gamma');`

For this example, I get:

gamma distribution

```
a = 2.50903
b = 1.99592
```

which indicates a chi-square PDF with 5 dof.

### John

on 31 Mar 2012

OK thanks, can I ask you one more question?

For the gamma fit I got a = 7.44, b = 1.42. Does this mean that it's a chi-square PDF with 15 dof? But because b is close to one, is it then not a good fit? What would you suggest would be a good fit?

http://dl.dropbox.com/u/54057365/All/difit.JPG

### Wayne King

on 31 Mar 2012

Do you have some a priori reason that it must be chi-square and not a more general gamma? At first glance, the beta value indicates that a more general gamma is more appropriate here.
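One way to judge this visually is to overlay both candidates on the probability histogram. The sketch below uses made-up gamma data as a stand-in for the departure times (the shape 7.5 and scale 1.4 are assumptions picked to resemble the fit reported above) and requires the Statistics Toolbox.

```matlab
% Sketch: overlay a fitted gamma and the nearest chi-square on a density histogram.
rng(2);
data = gamrnd(7.5, 1.4, 1000, 1);          % made-up stand-in data
phat = gamfit(data);                       % phat(1) = shape a, phat(2) = scale b
dof  = round(2 * phat(1));                 % dof a chi-square would need

[counts, binCtrs] = hist(data, 22);
binWidth = binCtrs(2) - binCtrs(1);
prob = counts / (numel(data) * binWidth);  % density normalization

bar(binCtrs, prob, 1, 'FaceColor', [0.5 0.5 0.5]);
hold on;
x = linspace(0, max(data), 200);
plot(x, gampdf(x, phat(1), phat(2)), 'k', 'LineWidth', 2);
plot(x, chi2pdf(x, dof), 'r--', 'LineWidth', 2);
legend('data', 'fitted gamma', sprintf('chi-square, %d dof', dof));
hold off;
```

If the fitted scale `phat(2)` is far from 2, the chi-square curve should visibly miss the bars while the gamma curve follows them.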

### Image Analyst

on 31 Mar 2012

To remove the gaps between the bars, you set the 'BarWidth' property to 1:

```
bar(binsNumbers, CountData, 'BarWidth', 1);
```

By setting the BarWidth to between 0 and 1 you can go from having huge gaps between the bars (very skinny bars) to having full width bars (bars touch each other with no gap at all).
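A quick way to see the effect is to draw the same bars with several widths side by side. The counts below are made up for illustration.

```matlab
% Sketch: the same bars drawn with three BarWidth settings.
counts = [2 5 9 6 3];                  % made-up counts
x      = 1:numel(counts);
widths = [0.3 0.8 1];                  % skinny, default-like, touching

for k = 1:numel(widths)
    subplot(1, numel(widths), k);
    hBar = bar(x, counts, 'BarWidth', widths(k));
    title(sprintf('BarWidth = %.1f', widths(k)));
end
```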

### Harish Chandra

on 4 Sep 2012

I have a question. I know it has been some time since the last post in this thread, but I am posting it here since it is relevant. How do you obtain the goodness of fit of a gamma distribution fitted to some data? For example the chi^2 or the R^2 value, maybe using chi2gof or something similar?
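One possible approach, sketched here on made-up data and assuming the Statistics Toolbox, is to fit the gamma with `fitdist` and pass the fitted distribution object to `chi2gof`. The `'NParams', 2` option tells the test that two parameters were estimated from the data, which reduces the degrees of freedom accordingly.

```matlab
% Sketch: chi-square goodness-of-fit test for a fitted gamma distribution.
rng(3);
data = gamrnd(7.5, 1.4, 1000, 1);      % made-up data; replace with your own

pd = fitdist(data, 'Gamma');           % fitted gamma distribution object
[h, p, stats] = chi2gof(data, 'CDF', pd, 'NParams', 2);

% h = 0 means the gamma fit is not rejected at the default 5% level
fprintf('reject = %d, p = %.3f, chi2 = %.2f on %d df\n', ...
        h, p, stats.chi2stat, stats.df);
```

The outcome depends on how the data are binned, so it is worth treating the p-value as a rough guide and checking it alongside a plot of the fit.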