Short script that calculates the root mean square error (RMSE) from a data vector or matrix and the corresponding estimates.
It checks for NaNs in the data and estimates, deletes them, and then simply computes:
r = sqrt( sum( (data(:)-estimate(:)).^2) / numel(data) );
I always use the mean function instead of sum and divide:
rms = sqrt(mean((data(:)-estimate(:)).^2));
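For anyone comparing the two forms, here is a minimal sketch showing that they agree. The input values are made up purely for illustration, and the variable names r1, r2 and err are not from the original script.

```matlab
% Hypothetical example values, chosen only for illustration.
data     = [1 2; 3 4];
estimate = [1 2; 3 6];

err = data(:) - estimate(:);            % column vector of residuals

r1 = sqrt(sum(err.^2) / numel(data));   % sum-and-divide form
r2 = sqrt(mean(err.^2));                % mean form

disp([r1 r2])                           % both values are 1 for this input
```

Only one residual is non-zero (4 - 6 = -2), so the squared errors sum to 4 over 4 elements and both expressions give 1.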
Hi Felix and Gary,
yes, the two sums could be avoided by simply writing
The computation time is about the same but readability might be enhanced by using the colon operator.
@Gary: no, you need two sums if you process matrices: the first sums down each column, and the second then sums the resulting row vector. If you process vectors, the second sum simply sums a scalar, which is faster than checking for dimensions first.
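A small sketch of the behaviour described here, assuming nothing beyond the built-in sum and magic functions. The variable names are chosen for illustration only.

```matlab
A = magic(3);              % 3x3 matrix containing the integers 1..9
v = [1 2 3 4];             % row vector

colSums = sum(A);          % 1x3 row vector, one sum per column
total1  = sum(colSums);    % equivalent to sum(sum(A))
total2  = sum(A(:));       % colon operator flattens A before summing

vecTotal1 = sum(sum(v));   % inner sum is already a scalar
vecTotal2 = sum(v(:));     % same result via the colon operator

disp([total1 total2 vecTotal1 vecTotal2])   % 45 45 10 10
```

Both forms give the same totals for matrices and vectors, so the choice between them comes down to readability.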
You have one too many SUM() in the equation, although it appears to be harmless. Am I correct? RMS Error is then:
Thanks for the feedback Wolfgang, I completely forgot that nansum needs the statistical toolbox, and of course you are right that it becomes incorrect with nans. I should have divided by numel(~isnan(data)), but deleting all NaNs in this case _is_ better! Your version actually would extract all NaNs and discard the values, so I used
I = ~isnan(data) & ~isnan(estimate); instead, which works a treat!
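Put together, the NaN handling discussed in this thread could look roughly like the sketch below. This is not necessarily the exact code of the submission, and the example values are invented for illustration.

```matlab
% Hypothetical example values; NaN marks a missing sample.
data     = [1 NaN 3 4];
estimate = [1 2 NaN 6];

% Keep only the positions where both data and estimate are valid.
I = ~isnan(data) & ~isnan(estimate);
data     = data(I);
estimate = estimate(I);

% With the NaNs removed, plain sum is enough and nansum is not needed.
r = sqrt(sum((data(:) - estimate(:)).^2) / numel(data));

disp(r)   % sqrt(2) for this input
```

Two pairs survive the mask, (1, 1) and (4, 6), so the squared errors sum to 4 over 2 elements. One detail worth noting: numel(~isnan(data)) returns the total element count, because the logical mask has the same size as data; nnz(~isnan(data)) would count only the valid entries.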
Durga, it's great you advertise your script on my page ;-) I see no point in input argument checking for this oneliner though - in my case I would have to reshape my matrices to use your script, not sure if that is better...
Anyway, once your script takes care of NaNs as suggested by Wolfgang, it is surely great as it calculates more than one goodness of fit.
The formula becomes incorrect as soon as you have NaNs in your arrays. You should first remove the NaNs from both arrays:
I = isnan(data) | isnan(estimate);
data = data(I);
estimate = estimate(I);
and then apply the formula. That even allows you to use sum instead of nansum, thereby avoiding dependence on the statistical toolbox.
This code is without input argument checking.
To compute more types of goodness of fit (including RMSE, coefficient of determination, mean absolute relative error etc.) please have a look
Updated description and code for better readability and
By popular demand: using sum(data(:)) instead of sum(sum(data)). Thanks!
- delete NaNs and use sum instead of nansum, eliminating the need for the statistical toolbox
include NaN checking
Inspired: rmse(true_values, prediction)