Normalizing a histogram

John on 31 Mar 2012
Hello,
I've plotted a histogram of some data. Here it is http://dl.dropbox.com/u/54057365/All/departure%20time.JPG
How can I remove the gaps between the bars? Should I be using a histogram at all? And how do you normalize the measurements on the y-axis of a histogram?
Many thanks
DepartureTimes = load('Departure Times (hr).txt');
h = hist(DepartureTimes,24);
h = h/sum(h);
bar(h, 'DisplayName', 'Departure Times');
legend('show');
xlim([5 25])
Jan on 11 Jul 2013
[lost image] Here the expected effect appears: The image was deleted from the server, such that the question lost its meaning.
Please, TMW, add the service to host images on the Answers servers soon. Otherwise the quality of this forum as a database of solutions will suffer from the implicit expiration of the linked images.

Accepted Answer

Wayne King on 31 Mar 2012
Hi John, if you type "help hist", you'll find information about specifying the bar centers. This implicitly controls the width of the bins that the bars cover.
If you want to change the gap between the bars, see "help hist" for information about returning the bar heights instead of plotting them, and "help bar" for information about drawing bars and controlling the space between them.
For example:
X = randn(1e3,1);
N = hist(X,22);
bar(N,1);
As for whether a histogram is appropriate, or how to "normalize" it: can you be more specific? People generally plot a histogram in one of two ways:
1.) the raw frequency or count histogram
2.) a probability histogram (as you have almost done), so that they can overlay a PDF for comparison.
Here's an example of the second kind (requires Statistics Toolbox):
Data = randn(1000,1);           % made-up data for illustration
binWidth = 0.7;                 % bin width
binCtrs = -3:binWidth:3;        % bin centers, one bin width apart; depends on your data
n = length(Data);
counts = hist(Data,binCtrs);
prob = counts / (n * binWidth); % scale so that the total bar area is 1
H = bar(binCtrs,prob,'hist');
set(H,'facecolor',[0.5 0.5 0.5]);
% get the N(0,1) pdf on a finer grid
hold on;
x = -3:.1:3;
y = normpdf(x,0,1);             % requires Statistics Toolbox
plot(x,y,'k','linewidth',2);
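The difference between the two normalizations can be checked numerically. The following is only a sketch with made-up data and illustrative variable names: dividing the counts by n gives bar heights that sum to 1 (relative frequencies), while dividing by n*binWidth gives bars whose total area is 1, which is what makes them comparable to a PDF.

```matlab
rng(0);                              % fixed seed so the run is repeatable
Data     = randn(1000,1);            % made-up data, as in the example above
binWidth = 0.5;                      % arbitrary choice for this sketch
binCtrs  = -4:binWidth:4;            % equally spaced bin centers
counts   = hist(Data, binCtrs);      % the end bins also collect out-of-range values
n        = numel(Data);

relFreq = counts / n;                % heights sum to 1
density = counts / (n * binWidth);   % bar areas sum to 1

fprintf('sum of relative frequencies: %.4f\n', sum(relFreq));
fprintf('total area of density bars:  %.4f\n', sum(density) * binWidth);
```

Only the second scaling puts the bars on the same vertical scale as a curve from normpdf or a similar PDF function.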
John on 31 Mar 2012
Hello Wayne,
Thank you for taking the time to respond in such detail. I will look up this help documentation.
To answer your question, I am trying to plot a probability histogram and overlay a PDF on it, probably a chi-squared distribution. That is why I was trying to normalize the data, if that is even the correct thing to do.
Your example is for the probability histogram I believe. I'll try it now.
Thanks
John on 31 Mar 2012
I have determined that I require 22 bins, using Scott's rule. Is it necessary to specify the bin centers? How would you convert a number of bins to bin centers?
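One possible way to get from a bin count to bin centers, offered as a sketch rather than a definitive answer: when hist is given a number of bins, its second output returns the centers it used, and for equally spaced bins the bin width is the spacing between adjacent centers. The data below are made up, the variable names are illustrative, and 22 is the bin count mentioned in this comment.

```matlab
rng(0);                                   % fixed seed so the run is repeatable
Data  = randn(1000,1);                    % made-up data; substitute your own vector
nBins = 22;                               % bin count from Scott's rule in this thread

[counts, binCtrs] = hist(Data, nBins);    % second output: the bin centers hist used
binWidth = binCtrs(2) - binCtrs(1);       % bins are equally spaced between min and max

prob = counts / (numel(Data) * binWidth); % probability (density) scaling
bar(binCtrs, prob, 1);                    % width 1 draws the bars with no gaps
```

With the centers and width in hand, the rest of the probability-histogram example applies unchanged.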

More Answers (4)

Wayne King on 31 Mar 2012
Yes, you are doing the correct thing. You can use dfittool to fit a gamma distribution, which you can then use to estimate the parameter of a chi-square PDF.
dfittool will also overlay the fitted pdf on the data.
The alpha parameter in a gamma is dof/2 and the beta parameter is 2.
You can generate code for your fit from dfittool and export the fit to the workspace.
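The relation stated in this answer can be checked numerically: a chi-square distribution with k degrees of freedom is a gamma distribution with shape k/2 and scale 2, so the two PDFs should agree to rounding error. A minimal sketch, assuming Statistics Toolbox is available; k = 5 and the grid are arbitrary choices.

```matlab
k = 5;                               % degrees of freedom (arbitrary for this check)
x = 0:0.1:20;                        % evaluation grid

yChi2  = chi2pdf(x, k);              % chi-square PDF with k dof
yGamma = gampdf(x, k/2, 2);          % gamma PDF with shape k/2 and scale 2

maxDiff = max(abs(yChi2 - yGamma));  % should be zero up to rounding
fprintf('largest difference between the two PDFs: %g\n', maxDiff);
```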
John on 31 Mar 2012
Oh, I tried using the dfittool earlier but there was no chi squared pdf.
I have done what you suggested. http://dl.dropbox.com/u/54057365/All/difit.JPG
I also generated an M-file http://dl.dropbox.com/u/54057365/All/1.JPG
But I don't understand what you mean by "The alpha parameter in a gamma is dof/2 and the beta parameter is 2^(dof/2)"
Do you mean multiply the alpha by 2 to get the dof?
And what would I change in the code?
Thank you for your help

Wayne King on 31 Mar 2012
I'm saying that if you fit a gamma, you get an alpha and a beta parameter. You can use that to see if a chi-square is appropriate (and not a more general gamma) and if so, get the dof parameter.
For example:
R = chi2rnd(5,1e3,1); %chi-square 5 dof
phat = mle(R,'distribution','Gamma');
phat(1) is the alpha parameter, which equals dof/2 for a chi-square.
Therefore
round(2*phat(1))
Gives you an estimate of the dof parameter for a chi-square.
phat(2) should always be close to 2 (if it isn't, that is an indication that a chi-square is not a good fit).
You can also use fitdist()
pd = fitdist(R,'gamma');
For this example, I get:
gamma distribution
a = 2.50903
b = 1.99592
which indicates a chi-square PDF with 5 dof.
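To tie the fit back to the original plotting question, here is a rough sketch that overlays both the fitted gamma PDF and the implied chi-square PDF on a probability histogram. It assumes Statistics Toolbox; the data are simulated, the variable names are illustrative, and gamfit is used here as one alternative to the mle and fitdist calls above.

```matlab
rng(0);                                   % fixed seed so the run is repeatable
R = chi2rnd(5, 1e3, 1);                   % simulated chi-square data, 5 dof

phat   = gamfit(R);                       % [shape, scale] maximum-likelihood estimates
dofEst = round(2 * phat(1));              % dof estimate, as described above

[counts, binCtrs] = hist(R, 22);          % 22 bins; second output gives the centers
binWidth = binCtrs(2) - binCtrs(1);
prob = counts / (numel(R) * binWidth);    % scale so that the total bar area is 1

bar(binCtrs, prob, 1, 'FaceColor', [0.5 0.5 0.5]);
hold on;
x = linspace(0, max(R), 200);
plot(x, gampdf(x, phat(1), phat(2)), 'k', 'LineWidth', 2);   % fitted gamma
plot(x, chi2pdf(x, dofEst), 'r--', 'LineWidth', 2);          % implied chi-square
hold off;
legend('data', 'fitted gamma', sprintf('chi-square, %d dof', dofEst));
```

If the two curves differ visibly, that is another hint that the scale parameter is not near 2 and a general gamma describes the data better.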
John on 31 Mar 2012
OK thanks, can I ask you one more question?
For the gamma fit I got a = 7.44, b = 1.42. Does this mean that it's a chi-square PDF with 15 dof? But because b is close to one (rather than 2), it is not a good fit? What would you suggest would be a good fit?
http://dl.dropbox.com/u/54057365/All/difit.JPG
Wayne King on 31 Mar 2012
Do you have some a priori reason that it must be chi-square and not a more general gamma? At first glance, the beta value indicates that a more general gamma is more appropriate here.

Image Analyst on 31 Mar 2012
To remove the gaps between the bars, set the 'BarWidth' property to 1:
bar(binsNumbers, CountData, 'BarWidth', 1);
By setting the BarWidth to between 0 and 1 you can go from having huge gaps between the bars (very skinny bars) to having full width bars (bars touch each other with no gap at all).
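A small side-by-side sketch of the effect, using made-up data and illustrative variable names. The bin centers are passed as the x values so that the bars sit at the right positions on the axis.

```matlab
rng(0);                                   % fixed seed so the run is repeatable
X = randn(1e3,1);                         % made-up data
[N, ctrs] = hist(X, 22);                  % counts and bin centers

subplot(1,2,1);
bar(ctrs, N, 'BarWidth', 0.5);            % narrow bars with visible gaps
title('BarWidth = 0.5');

subplot(1,2,2);
hBar = bar(ctrs, N, 'BarWidth', 1);       % full-width bars, no gaps
title('BarWidth = 1');
```

The default bar width is 0.8, which is why a plain bar call leaves small gaps.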

Harish Chandra on 4 Sep 2012
I have a question. I know it has been some time since the last post in this thread, but I am posting here since it is relevant. How do you obtain the goodness of fit of a gamma distribution fitted to some data? For example the chi^2 or the R^2 value, maybe using chi2gof or something similar?
Thanks for your help. Harish
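One way this might be approached, as a sketch only and not an answer given in this thread: chi2gof accepts a fitted distribution object through its 'CDF' option and returns the test decision, the p-value, and a structure holding the chi-square statistic. It assumes Statistics Toolbox; the data are simulated from a gamma distribution with arbitrary parameters purely for illustration.

```matlab
rng(0);                                   % fixed seed so the run is repeatable
x  = gamrnd(2.5, 2, 1e3, 1);              % simulated data; substitute your own vector
pd = fitdist(x, 'Gamma');                 % fit a gamma distribution to the data

% Chi-square goodness-of-fit test against the fitted distribution.
% h = 0 means the fit is not rejected at the default 5% significance level.
[h, p, stats] = chi2gof(x, 'CDF', pd);

fprintf('reject fit: %d, p-value: %.3f\n', h, p);
fprintf('chi-square statistic: %.3f on %d degrees of freedom\n', ...
        stats.chi2stat, stats.df);
```

The test measures agreement between observed and expected bin counts, so its result depends on how the data are binned; it does not produce an R^2 value.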
