RMSE of model with standardized input

Question

Hadi Hadi on 13 Apr 2015

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/202621-rmse-of-model-with-standardized-input

Commented: Star Strider on 13 Apr 2015

Hi all,

I am running a model that requires standardizing the input predictor and response variables. I want to calculate the RMSE in the original units of the response, as in the following script. I standardize my training and test data with the mean and standard deviation of the training set. However, the magnitude of the RMSE is not what I expected. I would appreciate it if anyone could point out where I made a mistake in my script. Many thanks :)

% Calculate mean and sd of data_train
% data_train and data_test are matrices with the response in the 1st column
mean_train=mean(data_train); sd_train=std(data_train);
% Standardize data_train and data_test with mean and sd of data_train
zdata_train=(data_train-repmat(mean_train,[size(data_train,1) 1]))./ ...
                repmat(sd_train, [size(data_train,1) 1]);
zdata_test=(data_test-repmat(mean_train,[size(data_test,1) 1]))./ ...
                repmat(sd_train, [size(data_test,1) 1]);
xtrain=zdata_train(:,2:end); ytrain=zdata_train(:,1);
xtest=zdata_test(:,2:end); ytest=zdata_test(:,1);
% Run model with output test set predicted response (standardized) ymu_te
% Calculate RMSE in original y unit
ymu_te = ymu_te.*sd_train(:,1) + mean_train(:,1); % response in 1st column
RMSE_test=sqrt(mean((ytrain-ymu_te).^2));

Answer 1

Star Strider on 13 Apr 2015

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/202621-rmse-of-model-with-standardized-input#answer_175049

I’m not following what you’re doing. Your ‘zdata_test’ seems to use your training data in its calculation, so that may be giving you anomalous results. If you have the Statistics Toolbox, use the zscore function. If not, this works as well (tested against zscore):

z_score = @(data) bsxfun(@rdivide,bsxfun(@minus,data,mean(data)),std(data));

Your data must be arranged so that each variable is in a column and each observation is in a row.
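As a rough illustration of the helper above, the following sketch applies it to a small made-up matrix and then maps the z-scores back to the original units. The matrix `data` and the names `mu`, `sigma`, `data_back`, and `max_err` are assumptions introduced here, not part of the original answer. With the Statistics Toolbox, `[Z,mu,sigma] = zscore(data)` returns the same three quantities in one call.

```matlab
% Hypothetical data: 5 observations (rows) of 3 variables (columns)
data = [1 10 100; 2 20 300; 3 30 200; 4 40 500; 5 50 400];

% The anonymous function from the answer
z_score = @(data) bsxfun(@rdivide, bsxfun(@minus, data, mean(data)), std(data));

Z = z_score(data);

% Each column of Z should have mean close to 0 and standard deviation close to 1
col_means = mean(Z);
col_sds   = std(Z);

% Keep the column statistics so values can be mapped back to original units
mu    = mean(data);
sigma = std(data);
data_back = bsxfun(@plus, bsxfun(@times, Z, sigma), mu);

% Largest round-trip error; expected to be at the level of rounding error
max_err = max(abs(data_back(:) - data(:)));
```

Storing `mu` and `sigma` alongside `Z` is what makes a later conversion of predictions back to the original units possible.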

2 Comments
Hadi Hadi on 13 Apr 2015

Hi Star, thanks for the reply. I standardize the test set with the mean and std of the training set because I read somewhere that this is the correct procedure for independent validation (the test set should not be 'seen' in any way!). Also, since the test set in my 10-fold CV procedure contains very few cases, its mean and std may not be representative of the population, as far as I understand. By the way, I previously did the same analysis by standardizing the test set with its own mean and std, and the transformation from the z-score of the predicted response back to its original unit still gave me an unexpected magnitude. The problem is that I cannot compare the RMSE in z units with the RMSE from other models I tested, which use the original unit. Since the training set is different in each fold of the CV, I feel I should use the mean and std of each fold's training set to transform the standardized predicted response back to the original unit. Is that the right procedure? Thanks.

Star Strider on 13 Apr 2015

My impression is that the training data and test data should be individually standardised.
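The per-fold back-transformation asked about in the comment above can be sketched as follows. This is a minimal sketch, not a statement of which standardization choice is correct: it follows the question's approach of using training-set statistics, the simulated data are made up, and the least-squares fit is only a placeholder for the real model. One thing that stands out in the posted script is that the final line compares the back-transformed test predictions with `ytrain`, which is the standardized training response; the sketch compares them with the original-unit test response `data_test(:,1)` instead.

```matlab
% Hypothetical data: response in column 1, two predictors in columns 2-3
rng(0);                                   % fixed seed, for repeatability only
X = randn(60, 2);
y = 50 + 8*X(:,1) - 3*X(:,2) + randn(60, 1);
data = [y X];
data_train = data(1:50, :);               % one hypothetical CV fold
data_test  = data(51:60, :);

% Standardize both sets with the training-set statistics
mean_train = mean(data_train);
sd_train   = std(data_train);
zdata_train = bsxfun(@rdivide, bsxfun(@minus, data_train, mean_train), sd_train);
zdata_test  = bsxfun(@rdivide, bsxfun(@minus, data_test,  mean_train), sd_train);
xtrain = zdata_train(:, 2:end);  ytrain = zdata_train(:, 1);
xtest  = zdata_test(:, 2:end);   ytest  = zdata_test(:, 1);

% Placeholder model (ordinary least squares); substitute the real model here
beta   = xtrain \ ytrain;
ymu_te = xtest * beta;                    % predictions in z units

% Back-transform predictions and compare with the ORIGINAL test response
ymu_te_orig = ymu_te .* sd_train(1) + mean_train(1);
RMSE_test   = sqrt(mean((data_test(:, 1) - ymu_te_orig).^2));

% Equivalent shortcut: scale the z-unit RMSE by the training sd of the response
RMSE_z     = sqrt(mean((ytest - ymu_te).^2));
RMSE_check = sd_train(1) * RMSE_z;
```

Because the same mean and sd are used to standardize the test response and to back-transform the predictions, `RMSE_test` and `RMSE_check` agree up to rounding. In a 10-fold CV, `mean_train` and `sd_train` would be recomputed from each fold's training set.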
