How to calculate R^2 using 1 - (SSR/SST)? For normal fit distribution.

Question

practice3.xlsx

Hello, I have used the fitlm function to find R^2 (see below), to see how well the normal distribution fits the actual data. The answer is 0.9172.

How can I manually calculate R^2?

R^2 = 1 - (SSR/SST), or in other words 1 - sum((predicted - actual).^2) / sum((actual - mean(actual)).^2). I am having a hard time getting the correct answer.

Table = readtable("practice3.xlsx");

actual_values = Table.values;

actual_values = sort(actual_values);

normalfit = fitdist(actual_values,'Normal'); % fit the normal distribution to the data

cdfplot(actual_values); % Plot the empirical CDF

x = 0:2310;

hold on

plot(x, cdf(normalfit, x), 'Color', 'r') % plot the normal distribution

hold off

grid on

nonExceedanceProb = sum(actual_values'<=actual_values,2)/numel(actual_values);

Table.nonExceedanceProb=nonExceedanceProb;

mdl=fitlm(cdf(normalfit, actual_values),Table.nonExceedanceProb);

mdl.Rsquared.Ordinary % R^2

ans = 0.9172

mdl.SSR

ans = 0.7567

mdl.SST

ans = 0.8250

% How can I manually calculate R^2 (or SSR and SST)?

% SSR = sum(((predicted data - actual data).^2))

% TSS = sum((actual data - mean(actual data)).^2)

% Rsquared = 1 - SSR/TSS
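A note on naming that may explain the mismatch: in a `fitlm` model, `mdl.SSR` is the regression (explained) sum of squares, while the residual sum of squares is stored in `mdl.SSE`. With the values printed above, 0.7567/0.8250 ≈ 0.9172, so the reported R^2 equals SSR/SST, which is the same as 1 - SSE/SST. The following is a minimal sketch of the manual calculation. It uses synthetic data as a stand-in, since practice3.xlsx is not reproduced here, so the numbers will differ from those above.

```matlab
% Synthetic stand-in for Table.values (assumption: the real data are in practice3.xlsx)
rng(0);
actual_values = sort(1000 + 400*randn(50,1));

normalfit = fitdist(actual_values,'Normal');
nonExceedanceProb = sum(actual_values' <= actual_values, 2)/numel(actual_values);

x = cdf(normalfit, actual_values);   % predictor passed to fitlm
y = nonExceedanceProb;               % response passed to fitlm
mdl = fitlm(x, y);

yhat = mdl.Fitted;                   % fitted values of the linear model
SSE = sum((y - yhat).^2);            % residual sum of squares,   compare mdl.SSE
SSR = sum((yhat - mean(y)).^2);      % regression sum of squares, compare mdl.SSR
SST = sum((y - mean(y)).^2);         % total sum of squares,      compare mdl.SST

Rsquared_manual = 1 - SSE/SST        % equals SSR/SST for a model with an intercept
```

Note that "predicted" here means the fitted values of the linear model (`mdl.Fitted`), not `cdf(normalfit, actual_values)` itself.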

Answer 1

Torsten on 15 Feb 2023

Edited: Torsten on 15 Feb 2023

In my opinion, it does not make sense to fit a linear function to the value pairs (cdf(normalfit, actual_values),Table.nonExceedanceProb) as you do above.

In principle, the blue points below should lie on the red line. This would mean that the empirical cdf is perfectly reproduced by the normal distribution.

So if you really want to compare the two distributions, you should consider the distance of the blue points (achieved quality of fit) to the red line (perfect fit).

Table = readtable("practice3.xlsx");

actual_values = Table.values;

actual_values = sort(actual_values);

normalfit = fitdist(actual_values,'Normal'); % fit the normal distribution to the data

nonExceedanceProb = sum(actual_values'<=actual_values,2)/numel(actual_values);

hold on

plot(nonExceedanceProb,cdf(normalfit, actual_values),'o')

plot([0 1],[0 1])

xlabel('P(empirical)')

ylabel('P(normal)')

hold off

grid on
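One way to put a number on the distance between the blue points and the red line is to look at the vertical deviations from the line y = x. The sketch below is only an illustration: the synthetic data and the names `p_emp`, `p_norm`, `d`, `rmse` and `maxdev` are assumptions, not part of the original code.

```matlab
% Synthetic stand-in for Table.values (assumption: the real data are in practice3.xlsx)
rng(0);
actual_values = sort(1000 + 400*randn(50,1));

normalfit = fitdist(actual_values,'Normal');
p_emp  = sum(actual_values' <= actual_values, 2)/numel(actual_values); % empirical probabilities
p_norm = cdf(normalfit, actual_values);                                % fitted normal probabilities

d = p_norm - p_emp;          % vertical deviation of each point from the line y = x
rmse   = sqrt(mean(d.^2))    % root-mean-square deviation
maxdev = max(abs(d))         % largest absolute deviation
```

Both values are 0 for a perfect fit and grow as the points move away from the line.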

12 Comments

Macy on 15 Feb 2023

Edited: Torsten on 15 Feb 2023

Ok, that makes sense. A linear function should not be fitted to the value pairs because they do not follow a linear pattern. So how would I compare the blue points to the red line in the original plot? I am able to do it for the linear plot shown here, and I got 0.8408. But I think it would be more appropriate to compare the blue points to the red line in the second plot (the last one). I am stuck on what to input for "predicted".

Table = readtable("practice3.xlsx");

actual_values = Table.values;

actual_values = sort(actual_values);

normalfit = fitdist(actual_values,'Normal'); % fit the normal distribution to the data

nonExceedanceProb = sum(actual_values'<=actual_values,2)/numel(actual_values);

hold on

plot(nonExceedanceProb,cdf(normalfit, actual_values),'o')

plot([0 1],[0 1])

xlabel('P(empirical)')

ylabel('P(normal)')

hold off

grid on

predicted = cdf(normalfit, actual_values);

SSR = sum(((predicted - nonExceedanceProb).^2));

TSS = sum((nonExceedanceProb - mean(nonExceedanceProb)).^2);

Rsquared = 1 - SSR/TSS

Rsquared = 0.8408

cdfplot(actual_values);

x = 0:2310;

hold on

plot(x, cdf(normalfit, x), 'Color', 'r')

hold off

grid on

%So how would I input the red data into my R^2 formula?

%predicted_2 = ?????

%SSR = sum(((predicted_2 - nonExceedanceProb).^2));

%TSS = sum((nonExceedanceProb - mean(nonExceedanceProb)).^2);

%Rsquared = 1 - SSR/TSS
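One possible reading of this question: the red curve in the CDF plot is `cdf(normalfit, x)` drawn on the grid `x = 0:2310`. To compare it with `nonExceedanceProb`, it has to be evaluated at the data points themselves, which gives `cdf(normalfit, actual_values)`, the same vector already called `predicted`. Under that reading both plots lead to the same R^2. The sketch below uses synthetic stand-in data (an assumption), so it will not reproduce 0.8408.

```matlab
% Synthetic stand-in for Table.values (assumption: the real data are in practice3.xlsx)
rng(0);
actual_values = sort(1000 + 400*randn(50,1));

normalfit = fitdist(actual_values,'Normal');
nonExceedanceProb = sum(actual_values' <= actual_values, 2)/numel(actual_values);

% Red curve evaluated at the data points instead of on the plotting grid
predicted_2 = cdf(normalfit, actual_values);

SSR = sum((predicted_2 - nonExceedanceProb).^2);              % sum of squared residuals (thread's naming)
TSS = sum((nonExceedanceProb - mean(nonExceedanceProb)).^2);  % total sum of squares
Rsquared = 1 - SSR/TSS
```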

Macy on 21 Feb 2023

Yes, he said Rsquared1 is the "Pearson correlation coefficient" and Rsquared2 is the "coefficient of determination".

Torsten on 21 Feb 2023

corr(yi,fi) is the Pearson correlation coefficient - I don't know why he wanted to square it.

Anyway: congratulations that you finished your assignment successfully.
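For readers comparing the two quantities, here is a hedged sketch. The comments that defined `Rsquared1` and `Rsquared2` are not shown here, so it assumes `Rsquared1 = corr(yi,fi)^2` and `Rsquared2 = 1 - sum((yi-fi).^2)/sum((yi-mean(yi)).^2)`, with `yi` the empirical probabilities and `fi` the fitted normal probabilities. The data are synthetic stand-ins.

```matlab
% Synthetic stand-in for Table.values (assumption: the real data are in practice3.xlsx)
rng(0);
actual_values = sort(1000 + 400*randn(50,1));

normalfit = fitdist(actual_values,'Normal');
yi = sum(actual_values' <= actual_values, 2)/numel(actual_values); % observed (empirical) probabilities
fi = cdf(normalfit, actual_values);                                % model (normal) probabilities

Rsquared1 = corr(yi, fi)^2                                   % squared Pearson correlation
Rsquared2 = 1 - sum((yi - fi).^2)/sum((yi - mean(yi)).^2)    % coefficient of determination w.r.t. the line y = x

mdl = fitlm(fi, yi);                                         % straight-line fit of yi on fi, as in the question
Rsquared_fitlm = mdl.Rsquared.Ordinary                       % matches Rsquared1
```

Squaring the Pearson coefficient gives the R^2 of a least-squares line through the points, which is what `fitlm` reports. `Rsquared2` measures the deviation from the fixed line y = x instead, so it can never exceed `Rsquared1`. This is consistent with the 0.9172 and 0.8408 reported earlier in the thread.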
