How to calculate the hat matrix for a linear model?

Question

Matthew Mendonca on 26 Feb 2018

Answered: fred ssemwogerere on 7 Jan 2020

I am interested in calculating the hat matrix (H) for a linear regression model, so that I can get the leverage values from the diagonal of H. My independent variables are contained in X, which is a 101x5 matrix where values range from 0 to 1. I tried calculating H two different ways, and got different answers. First, I just manually calculated H using the definition of the hat matrix:

H = X*inv(transpose(X)*X)*transpose(X)

Next, I obtained the hat matrix by creating the linear regression model using fitlm for my X (101x5) and Y (101x1) data:

mdl = fitlm(X,Y)

After fitting, I looked at mdl.Diagnostics.HatMatrix and found that the generated hat matrix values were different compared to when I calculated them manually using the formula above. Does fitlm perform some special scaling of the X matrix during the fitting process that causes the discrepancy? I would like to know why there is a difference, and which hat matrix is actually the one I want. I will be writing a script to calculate leverages for many different models and would like to know which hat matrix calculation method to use.

Answer 1

fred ssemwogerere on 7 Jan 2020

Hello, the two results most likely differ because fitlm adds an intercept term (a column of ones) to the design matrix by default, so mdl.Diagnostics.HatMatrix is computed from [ones(101,1) X] rather than from X alone. Applying the formula directly to X gives the hat matrix of a model without an intercept, and fitlm does not rescale X. Since you are fitting with fitlm, I think it is preferable to use the hat matrix from the model (mdl.Diagnostics.HatMatrix), or mdl.Diagnostics.Leverage for its diagonal, in subsequent computations.
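To check whether the intercept term accounts for the difference, a minimal sketch along these lines may help. It uses synthetic data in place of the original X and Y, and all variable names (H_raw, H_int, mdl0, and so on) are illustrative assumptions. It relies on fitlm including an intercept unless 'Intercept' is set to false.

```matlab
% Sketch with synthetic data; X and Y stand in for the asker's variables.
rng(0);                                   % reproducible random numbers
n = 101;
X = rand(n, 5);                           % predictors in [0, 1], as in the question
Y = X*[1; -2; 0.5; 3; 0] + 0.1*randn(n, 1);

% Hat matrix built from the raw predictors (no intercept column)
H_raw = X*((X'*X)\X');

% Hat matrix built after prepending a column of ones (intercept)
X1 = [ones(n, 1), X];
H_int = X1*((X1'*X1)\X1');

% fitlm adds the intercept by default; mdl0 is fitted without it
mdl  = fitlm(X, Y);
mdl0 = fitlm(X, Y, 'Intercept', false);
H_mdl  = mdl.Diagnostics.HatMatrix;
H_mdl0 = mdl0.Diagnostics.HatMatrix;

% Largest absolute differences between the manual and fitlm hat matrices
d_int  = max(abs(H_int(:) - H_mdl(:)));   % manual with intercept vs default fitlm
d_raw  = max(abs(H_raw(:) - H_mdl(:)));   % manual without intercept vs default fitlm
d_raw0 = max(abs(H_raw(:) - H_mdl0(:)));  % manual without intercept vs no-intercept fitlm
fprintf('with intercept: %g, raw vs default: %g, raw vs no-intercept: %g\n', ...
        d_int, d_raw, d_raw0);
```

If the intercept is the cause, d_int and d_raw0 should be at round-off level while d_raw is clearly larger. The backslash operator is used instead of inv only as a numerically safer way to evaluate the same formula.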

For more information on this subject, see:

https://in.mathworks.com/help/stats/hat-matrix-and-leverage.html
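If only the leverages are needed across many models, the full n-by-n hat matrix does not have to be formed. The sketch below is one possible approach rather than what fitlm does internally: it takes the diagonal of H from an economy-size QR factorization of the design matrix. It assumes a full-rank design matrix with an intercept column. The data are synthetic, and the names X1, Q, and lev are illustrative.

```matlab
% Sketch: leverages without building the full hat matrix.
rng(1);                      % reproducible random numbers
n = 101;
X = rand(n, 5);              % synthetic predictors, same size as in the question
X1 = [ones(n, 1), X];        % design matrix including the intercept column
p = size(X1, 2);             % number of estimated coefficients

% Economy-size QR: Q is n-by-p with orthonormal columns, and H = Q*Q'
[Q, ~] = qr(X1, 0);
lev = sum(Q.^2, 2);          % diagonal of H, i.e. the leverage of each observation

% Cross-check against the explicit formula (fine for small n)
H = X1*((X1'*X1)\X1');
maxDiff = max(abs(lev - diag(H)));

% A common rule of thumb flags observations with leverage above 2*p/n
highLev = find(lev > 2*p/n);
```

For a model fitted with fitlm, the same values should be available directly as mdl.Diagnostics.Leverage, which avoids any manual computation.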

Regards,

Fred