"shinchan " <shinchan75034@gmail.com> wrote in message <gro6rh$9gq$1@fred.mathworks.com>...
> I am a statistics newbie, so this may sound elementary, but I just don't know it. What's the proper way to determine the bin size when using hist() to show a histogram? As the following code shows, bin size 5 gives one zero-mean, 'Gaussian distributed' population. But at bin size 30, the same data looks completely different, and it appears that there are multiple populations.
>
> r=randn([40, 1]); %Generate Gaussian distribution made of 40 random numbers.
> figure(1); hist(r, 5);
> figure(2); hist(r, 30);
>
> So if r is my experimental data, and I don't know a priori how many different populations exist, what should I do?

That's because with 40 numbers and 30 bins, you don't have enough counts in each bin to give a good shape. Look at the example below, where I used 40,000 numbers in 30 bins instead of only 40 numbers. The shape is now nice. With only 40 numbers, the 30 bins average about 1.3 counts each, so many bins have only 0, 1, or 2 counts in them, nowhere near enough to visualize the true shape of the distribution.
If r is your data and you only have 40 observations, whether you can tell one population from two really depends on how much spread there is in your observations. A standard college statistics textbook covers this.
Regards,
ImageAnalyst
clc;
close all;
% Open one large figure (90% of screen width, 80% of height) for the three plots.
figure('Units', 'normalized', 'OuterPosition', [0.05, 0.1, 0.9, 0.8]);
% Draw 40 samples from a standard normal (Gaussian) distribution.
randomNumbers1 = randn([40, 1]);
% The 40 samples in 5 bins.
counts1 = hist(randomNumbers1, 5);
subplot(1, 3, 1);
bar(counts1);
title('40 numbers in 5 bins');
% The same 40 samples in 30 bins.
counts2 = hist(randomNumbers1, 30);
subplot(1, 3, 2);
bar(counts2);
title('40 numbers in 30 bins');
% Draw 40000 samples from the same distribution and use 30 bins.
randomNumbers = randn([40000, 1]);
counts3 = hist(randomNumbers, 30);
subplot(1, 3, 3);
bar(counts3);
title('40000 numbers in 30 bins');
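
To put numbers on the "not enough counts per bin" point, here is a small sketch that works out the average count per bin for the three cases plotted above. The last line uses Sturges' rule, which is just one common rule of thumb for choosing a bin count; it is my assumption for illustration and not something the original question or code used.

```matlab
% Average number of observations per bin for the three plotted cases.
n    = [40, 40, 40000];     % sample sizes
bins = [5, 30, 30];         % number of bins
avgPerBin = n ./ bins;      % 8, about 1.33, about 1333.33
fprintf('%6d numbers in %2d bins: about %.1f counts per bin\n', [n; bins; avgPerBin]);

% Sturges' rule (an assumed rule of thumb, not from the original post):
% suggested bin count = ceil(log2(n)) + 1.
kSturges = ceil(log2(40)) + 1;   % 7 bins for n = 40
```

With 40 samples that rule suggests roughly 7 bins, much closer to the 5-bin plot than the 30-bin one, but treat it as a starting point rather than a definitive answer.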
