How to calculate the hat matrix for a linear model?

I am interested in calculating the hat matrix (H) for a linear regression model, so that I can get the leverage values from the diagonal of H. My independent variables are contained in X, which is a 101x5 matrix where values range from 0 to 1. I tried calculating H two different ways, and got different answers. First, I just manually calculated H using the definition of the hat matrix:
X*(inv(transpose(X)*X))*transpose(X)
Next, I obtained the hat matrix by creating the linear regression model using fitlm for my X (101x5) and Y (101x1) data:
mdl = fitlm(X,Y)
After fitting, I looked at mdl.Diagnostics.HatMatrix and found that the generated hat matrix values were different compared to when I calculated them manually using the formula above. Does fitlm perform some special scaling of the X matrix during the fitting process that causes the discrepancy? I would like to know why there is a difference, and which hat matrix is actually the one I want. I will be writing a script to calculate leverages for many different models and would like to know which hat matrix calculation method to use.

Answers (1)

fred ssemwogerere on 7 Jan 2020
Hello, the hat matrix computed directly from the predictors and the one reported by the fitted model are expected to differ here, because fitlm adds an intercept term by default, so its design matrix is [ones(101,1) X] rather than X alone. The hat matrix computed from X alone (Hx) therefore describes a model without an intercept, not the model that fitlm actually fitted. As such, I think it is preferable to use the hat matrix derived from the model (mdl.Diagnostics.HatMatrix) for subsequent computations.
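As a rough check of one likely source of the discrepancy, the sketch below builds the hat matrix both from X alone and from X with a leading column of ones (the intercept column that fitlm includes by default), then compares the leverages with fitlm's when that function is available. The data are synthetic stand-ins for the X and Y in the question, and the variable names (Xd, H_int, lev_qr, and so on) are illustrative assumptions only.

```matlab
% Synthetic stand-in data (assumption): 101x5 predictors in [0,1], as in the question.
rng(0);
n = 101;
X = rand(n, 5);
Y = X * [1; 2; 3; 4; 5] + 0.1 * randn(n, 1);

% Hat matrix from X alone: corresponds to a model WITHOUT an intercept.
H_noint = X * ((X' * X) \ X');

% Hat matrix from the design matrix with a leading column of ones,
% which is what fitlm uses unless 'Intercept' is set to false.
Xd = [ones(n, 1), X];
H_int = Xd * ((Xd' * Xd) \ Xd');

% Leverages are the diagonal entries of the hat matrix.
% Their sum equals the number of fitted coefficients (trace of a projection).
lev_noint = diag(H_noint);   % 5 coefficients
lev_int   = diag(H_int);     % 6 coefficients (5 slopes + intercept)

% Alternative that avoids forming X'*X: economy-size QR,
% where the leverage of row i is the squared norm of row i of Q.
[Q, ~] = qr(Xd, 0);
lev_qr = sum(Q.^2, 2);

% Compare with fitlm only if it is on the path
% (it requires the Statistics and Machine Learning Toolbox).
if exist('fitlm', 'file')
    mdl = fitlm(X, Y);
    max_diff = max(abs(lev_int - mdl.Diagnostics.Leverage));
    fprintf('max |manual - fitlm| leverage difference: %g\n', max_diff);
end
```

If the intercept is the only difference, the printed value should be at rounding-error level. For a script that loops over many models, reading mdl.Diagnostics.Leverage directly avoids having to rebuild the design matrix by hand, which matters once a model includes interaction or categorical terms.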
For any more information about this subject you could also refer to:
Regards,
Fred
