The normalization of histcounts
I would like to get the probability density function (PDF) from an array of data A (contained in the attached "a.mat" file).
If I understood correctly, if I use the normalization option called "probability", I would get the "relative frequency histogram".
Instead, if I use the normalization option called "pdf", I would get an "empirical estimate of the Probability Density Function".
However, when I check the sum of the probabilities, I get "1" if I use the "probability" option, but I do not get "1" if I use the "pdf" option:
load('a.mat', 'A')
num_bins = 70;
B = histcounts(A,num_bins,'Normalization','probability');
sum(B)
C = histcounts(A,num_bins,'Normalization','pdf');
sum(C)
Shouldn't "sum(B)" give the sum of the relative frequencies, and "sum(C)" the sum of the blocks' areas representing percentages?
What did I do wrong?
Answer by the cyclist, 4 Aug 2023 (edited 4 Aug 2023)
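PDF is the probability density, not the probability. With the "pdf" option, each value of C is the bin count divided by (number of elements × bin width), so C holds the heights of the blocks, not their areas. To get the probability for a given bin, you need to multiply its density by the bin width.
Your sum of C does not take that into account, so it is not expected to equal 1. The "probability" normalization (your B calculation) returns the bin count divided by the number of elements, which is already the area of each block, and that is why sum(B) gives 1.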
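As a minimal sketch of that point: the snippet below asks histcounts for the bin edges as a second output, multiplies each density by its bin width, and sums the result. Since "a.mat" is not available here, the data A is a stand-in drawn with randn (an assumption made purely for illustration), and the names edges, bin_widths, area_under_pdf and max_diff are my own.

```matlab
% Stand-in data (assumption): replace with load('a.mat', 'A') for the real array.
rng(0);
A = randn(1000, 1);
num_bins = 70;

% Request the bin edges as well, so the bin widths are known.
[C, edges] = histcounts(A, num_bins, 'Normalization', 'pdf');

% Reuse the same edges so B and C refer to identical bins.
B = histcounts(A, edges, 'Normalization', 'probability');

bin_widths = diff(edges);                  % width of each bin
area_under_pdf = sum(C .* bin_widths);     % density x width, summed over bins

% Per-bin check: density x width should match the "probability" values.
max_diff = max(abs(C .* bin_widths - B));

fprintf('sum(B)              = %.6f\n', sum(B));
fprintf('sum(C)              = %.6f\n', sum(C));
fprintf('sum(C .* diff(edges)) = %.6f\n', area_under_pdf);
```

Here sum(B) and sum(C .* diff(edges)) should both come out as 1 up to floating-point rounding, while sum(C) alone depends on the bin width and generally does not.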