probability density function normalization

Question


I would like to illustrate the probability density function and the histogram of a data set. This is the code I used so far:

clc;
xValues = 0:0.001:0.5;
for i = [21,24]
    figure;
    grid on;
    hold on;
%     newcolors = [0 0 0; 1 0 0; 0.3010 0.7450 0.9330; 0.9290 0.6940 0.1250];
%     colororder(newcolors);
    for j = 0:c:(3*c) % compare all measurement types
%         histfit(T_mean{i+j},20,'kernel')
        histogram(T_mean{i+j},20,'Normalization','pdf','DisplayStyle','stairs');
        pd = fitdist(T_mean{i+j},'Kernel');
        y = pdf(pd,xValues);
        plot(xValues,y)
%         ksdensity(T_mean{i+j})
    end
    hold off;
end

where c is 24. T_mean is a table composed of 4 tables of length 24, which hold 24 different data sets. In this case I only need sets 21 and 24, each of which contains a vector. With this code, the probability density function and the histogram have the same normalization, but the values on the y-axis are too large. I expected the area under the pdf to be no larger than 1, so that the y-axis could be read as a percentage. Perhaps I don't understand the pdf function correctly. Here is a picture of one of the graph outputs:

The pdf seems to have different definitions in Matlab:

https://de.mathworks.com/help/stats/prob.normaldistribution.pdf.html

and

https://de.mathworks.com/help/stats/gmdistribution.pdf.html

Matlab seems to use the second one in this case.

How can I normalize the histogram as 'probability' but also normalize the pdf the same way?


Answer 1

Jeff Miller on 10 Mar 2021



The pdf values are defined so that the total area under the pdf curve equals 1. The values themselves can exceed 1 (and hence not look like probabilities) when the range of X is less than one unit. To put the pdf on the same scale as a histogram normalized as 'probability', multiply it by the bin width, which is roughly the data range divided by the number of bins:

% Generate some example values that look a little like yours:
mu = 0.25;
sigma = 0.1;
data = randn(500,1)*sigma + mu;
dataRange = max(data) - min(data);  % named dataRange to avoid shadowing the range function
figure;
hold on;
nbins = 20;  % for the histogram
histogram(data,nbins,'Normalization','probability','DisplayStyle','stairs');
pd = fitdist(data,'Kernel');
xValues = 0:0.001:0.5;  % for computing the pdf
y = pdf(pd,xValues) * dataRange / nbins;   % scale by the approximate bin width (dataRange/nbins)
plot(xValues,y)
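
As a hedged variation on the snippet above: the data range divided by the number of bins only approximates the bin width, since histogram may not place its bin edges exactly at the minimum and maximum of the data. The sketch below reads the width from the histogram object's BinWidth property instead, and checks numerically that the unscaled density integrates to about 1 even though its peak exceeds 1. The fixed random seed, the padding of 0.2 around the data, and the variable names are assumptions made for illustration, not part of the original answer.

```matlab
% Sketch: scale a kernel pdf by the histogram's actual bin width.
rng(0);                                   % fixed seed (assumption, for reproducibility)
data = randn(500,1)*0.1 + 0.25;           % synthetic data similar to the answer's example

figure;
hold on;
h = histogram(data,20,'Normalization','probability','DisplayStyle','stairs');

pd = fitdist(data,'Kernel');              % kernel density estimate
xValues = linspace(min(data)-0.2, max(data)+0.2, 2000);  % padded grid (assumption)
density = pdf(pd,xValues);                % density values; these may exceed 1

% The probability of falling into a bin is approximately density * bin width
plot(xValues, density * h.BinWidth);
hold off;

areaUnderPdf = trapz(xValues,density);    % numerical integral of the unscaled density
totalBinProb = sum(h.Values);             % sum of the bin probabilities
peakDensity  = max(density);              % largest density value on the grid
```

With the histogram normalized as 'probability', the bar heights sum to 1, while the density has unit area rather than unit height. Multiplying by the bin width converts one into the scale of the other.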

4 Comments

Tamara Szecsey on 11 Mar 2021

Now I understand, thank you.

azhar albaaj on 5 Nov 2021

@Jeff Miller

Hello,

I have some data consisting of three parameters, and I would like to find the probability density function of this data in MATLAB.

I hope you can help me.

Thank you so much.
