Normalizing a histogram

Asked by John on 31 Mar 2012

Hello,

I've plotted a histogram of some data. Here it is http://dl.dropbox.com/u/54057365/All/departure%20time.JPG

How can I remove the gaps between the bars? Should I be using a histogram at all? And how can you normalize the measurements on the y axis in a histogram?

Many thanks

DepartureTimes = load('Departure Times (hr).txt');
h = hist(DepartureTimes,24);
h = h/sum(h);
bar(h, 'DisplayName', 'Departure Times');
legend('show');
xlim([5 25])

Jan Simon on 11 Jul 2013

[lost image] Here the expected effect appears: The image was deleted from the server, such that the question lost its meaning.

Please, TMW, add the service to host images on the Answers servers soon. Otherwise the quality of this forum as a database of solutions will suffer from the implicit expiration of the linked images.

Answer by Wayne King on 31 Mar 2012

Hi John, if you type "help hist", you'll find information about specifying the bar centers. This implicitly controls the width of the bins that the bars cover.

If you want to change the gap between the bars, see "help hist" for information about returning the bar heights instead of plotting them, and "help bar" for information about drawing bars and controlling the space between them.

For example:

X = randn(1e3,1);
N = hist(X,22);
bar(N,1);

As for whether a histogram is appropriate, or how to "normalize" it, can you be more specific? People generally plot a histogram in one of two ways:

1.) the raw frequency or count histogram
2.) a probability histogram (as you have almost done), so that they can overlay a PDF for comparison.

Here's an example of that (requires Statistics Toolbox):

Data = randn(1000,1); %just making up some junk data
binWidth = 0.7; %This is the bin width
binCtrs = -3:binWidth:3; %Bin centers, depends on your data
n=length(Data);
counts = hist(Data,binCtrs);
prob = counts / (n * binWidth);
H = bar(binCtrs,prob,'hist');
set(H,'facecolor',[0.5 0.5 0.5]);
% get the N(0,1) pdf on a finer grid
hold on;
x = -3:.1:3;
y = normpdf(x,0,1); %requires Statistics toolbox
plot(x,y,'k','linewidth',2);
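
A quick check of the normalization used above may help. Dividing the counts by n * binWidth turns the bar heights into densities, so the bar areas should sum to 1, which is what makes the histogram comparable to a PDF. The sketch below reuses the same made-up data setup; the variable name totalArea is introduced here for illustration only and is not part of the original answer.

```matlab
Data = randn(1000,1);              % made-up data, as in the answer above
binWidth = 0.7;
binCtrs = -3:binWidth:3;           % bin centers
n = numel(Data);
counts = hist(Data, binCtrs);      % values beyond the end centers land in the end bins
prob = counts / (n * binWidth);    % density: count / (sample size * bin width)
totalArea = sum(prob) * binWidth;  % area under all bars, should be 1 up to rounding
fprintf('Total bar area: %.4f\n', totalArea);
```

Dividing by sum(counts) alone, as in the question, gives bar heights that sum to 1 but not bar areas that sum to 1, so a PDF overlaid on it would be off by a factor of the bin width. In MATLAB R2014b and later, histogram(Data,'Normalization','pdf') applies the same scaling directly.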

John on 31 Mar 2012

Hello Wayne,

Thank you for taking the time to respond in such detail. I will look up this help documentation.

To answer your question, I am trying to plot a probability histogram and overlay a PDF on it, probably a chi-squared distribution. This is the reason why I was trying to normalize the data, if that is even the correct thing to do.

Your example is for the probability histogram I believe. I'll try it now.

Thanks

John on 31 Mar 2012

I have determined that I require 22 bins, using Scott's rule. Is it necessary to specify the bin centers? How would you convert a number of bins to bin centers?
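
One possible way to get bin centers from a bin count, sketched here under assumptions: when hist is called with a number of bins, it can return the centers of the equally spaced bins it used as a second output, so they do not have to be specified by hand. The random data X is a stand-in for the departure times, and ctrsManual is a hypothetical name used only to show the equivalent manual calculation.

```matlab
X = randn(1e3,1);                    % stand-in data (assumption, not the real data)
nBins = 22;                          % bin count from the question

% hist returns the bin centers as a second output
[counts, ctrs] = hist(X, nBins);
binWidth = ctrs(2) - ctrs(1);        % bins are equally spaced

% Equivalent manual calculation: nBins equal bins between min and max
edges = linspace(min(X), max(X), nBins + 1);
ctrsManual = edges(1:end-1) + diff(edges)/2;

% Probability (density) histogram with touching bars
prob = counts / (numel(X) * binWidth);
bar(ctrs, prob, 1);
```

Plotting against ctrs rather than the bin index also puts the x axis in the units of the data.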

Answer by Wayne King on 31 Mar 2012

Yes, you are doing the correct thing. You can use dfittool to fit a gamma distribution, which you can use to estimate the parameters of a chi-square PDF.

dfittool will also overlay the fitted pdf on the data.

The alpha parameter in a gamma is dof/2 and the beta parameter is 2.

You can generate code for your fit from dfittool and export the fit to the workspace.
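
The relationship between the two distributions can be checked numerically. This is a small sketch, assuming the Statistics Toolbox functions chi2pdf and gampdf (with gampdf taking shape alpha and scale beta); the value dof = 5 and the variable names are chosen here only for illustration.

```matlab
dof = 5;                           % example degrees of freedom (assumption)
x = 0.1:0.1:20;                    % evaluation grid
yChi = chi2pdf(x, dof);            % chi-square PDF
yGam = gampdf(x, dof/2, 2);        % gamma PDF with alpha = dof/2, beta = 2
maxDiff = max(abs(yChi - yGam));   % should be zero up to rounding
fprintf('Largest difference: %g\n', maxDiff);
```

So a fitted gamma with beta near 2 is consistent with a chi-square, and 2*alpha then estimates the dof.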

John on 31 Mar 2012

Oh, I tried using the dfittool earlier but there was no chi squared pdf.

I have done what you suggested. http://dl.dropbox.com/u/54057365/All/difit.JPG

I also generated an M-file http://dl.dropbox.com/u/54057365/All/1.JPG

but I don't understand what you mean by "The alpha parameter in a gamma is dof/2 and the beta parameter is 2"

Do you mean multiply the alpha by 2 to get the dof?

but what would I change in the code?

Thank you for your help

Answer by Wayne King on 31 Mar 2012

I'm saying that if you fit a gamma, you get an alpha and a beta parameter. You can use that to see if a chi-square is appropriate (and not a more general gamma) and if so, get the dof parameter.

For example:

R = chi2rnd(5,1e3,1); %chi-square 5 dof
phat = mle(R,'distribution','Gamma');

phat(1) is the alpha parameter, which equals dof/2 for a chi-square distribution.

Therefore

round(2*phat(1))

Gives you an estimate of the dof parameter for a chi-square.

phat(2) should always be close to 2 (if it isn't, that is an indication that a chi-square is not a good fit).

You can also use fitdist()

pd = fitdist(R,'gamma');

For this example, I get:

gamma distribution

a = 2.50903
b = 1.99592

which indicates a chi-square PDF with 5 dof.
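
To tie this back to the original plot, the fitted parameters can be used to overlay both candidate PDFs on a probability histogram. This is only a sketch under assumptions: simulated chi-square data stands in for the real measurements, the 22 bins are taken from the question, and the variable names are illustrative. It requires the Statistics Toolbox.

```matlab
R = chi2rnd(5, 1e3, 1);                      % simulated data, chi-square with 5 dof
phat = mle(R, 'distribution', 'Gamma');      % phat(1) = alpha, phat(2) = beta
dofEst = round(2 * phat(1));                 % dof estimate if a chi-square is plausible

% Probability (density) histogram with touching bars
[counts, ctrs] = hist(R, 22);
binWidth = ctrs(2) - ctrs(1);
prob = counts / (numel(R) * binWidth);
bar(ctrs, prob, 1);
hold on;

% Overlay the fitted gamma and the implied chi-square
x = linspace(0.01, max(R), 200);
plot(x, gampdf(x, phat(1), phat(2)), 'k', 'LineWidth', 2);
plot(x, chi2pdf(x, dofEst), 'r--', 'LineWidth', 2);
hold off;
legend('data', 'fitted gamma', 'chi-square with estimated dof');
```

If the two curves nearly coincide, the chi-square is a reasonable simplification; if they differ visibly, the more general gamma is probably the better description.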

John on 31 Mar 2012

OK thanks, can I ask you one more question?

For the gamma fit I got a = 7.44, b = 1.42. Does this mean that it's a chi-square PDF with 15 dof? But because b is close to one, is it not a good fit? What would you suggest would be a good fit?

http://dl.dropbox.com/u/54057365/All/difit.JPG

Wayne King on 31 Mar 2012

Do you have some a priori reason that it must be a chi-square and not a more general gamma? At first glance, the beta value indicates that a more general gamma is more appropriate here.

Answer by Image Analyst on 31 Mar 2012

To remove the gaps between the bars, set the 'BarWidth' property to 1:

bar(binsNumbers, CountData, 'BarWidth', 1);

By setting the BarWidth to between 0 and 1 you can go from having huge gaps between the bars (very skinny bars) to having full width bars (bars touch each other with no gap at all).