Code covered by the BSD License  


Rating: 4.0 (1 rating) | Downloads (last 30 days): 6 | File Size: 3.84 KB | File ID: #8716

Cookdist

by Antonio Trujillo-Ortiz

 

12 Oct 2005 (Updated 14 Nov 2005)

Cook's distance influence index.


File Information
Description

Cook's distance (D_i) is an influence measure based on the difference between the regression parameter estimates b and what they become when the i-th data point is removed, b_(-i). It measures how much the entire regression function changes when the i-th observation is deleted. D_i is compared against the F_p,n-p distribution: if D_i falls at or above the 50th percentile of that distribution, the i-th point is likely influential and should be investigated further.

In other words, the usual criterion is that a point is influential if D_i exceeds the median of the F_p,n-p distribution, where p is the number of regression coefficients (including the intercept) and n is the number of observations.
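
For reference, the description above corresponds to the standard textbook form of Cook's distance, written out below. The extra symbols are assumptions not named in the original text: X is taken here as the n-by-p design matrix including the intercept column, b_(-i) is the estimate with the i-th observation removed, e_i is the i-th residual, h_ii is the i-th leverage (diagonal element of the hat matrix), and s^2 is the residual mean square.

```latex
D_i = \frac{(b_{(-i)} - b)^{\top} X^{\top} X \,(b_{(-i)} - b)}{p\, s^{2}}
    = \frac{e_i^{2}}{p\, s^{2}} \cdot \frac{h_{ii}}{(1 - h_{ii})^{2}},
\qquad
s^{2} = \frac{1}{n - p} \sum_{j=1}^{n} e_j^{2},
\qquad
h_{ii} = \left[ X (X^{\top} X)^{-1} X^{\top} \right]_{ii}

\text{flag observation } i \text{ if } D_i > F_{0.5}(p,\, n - p)
```

The second form shows that no model needs to be refitted n times: residuals and leverages from the single full fit are enough.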

Inputs:
     D - data matrix (=[X Y]); the last column must be the dependent variable Y. The X (independent variable) columns can describe a simple [X], multiple [X1,X2,X3,...,Xp] or polynomial [X,X^2,X^3,...,X^p] regression model.

Outputs:
     A complete summary (table and/or plot) of the Cook's influence index. For the graph, the cross-hair can be positioned with the mouse at the selected location.

NOTE.- One should be careful. This procedure is not a conclusive test for detecting outliers in regression models; it only flags unusual observations by their very high leverage and high influence values. Any observation flagged this way should then be checked against the appropriate model assumptions.
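
The computation described above can be sketched in a few lines of MATLAB. This is a minimal illustration under stated assumptions, not the code shipped in this submission: the data values are made up for the example, the variable names are hypothetical, and the cutoff step is only run if finv (from the Statistics Toolbox) is available.

```matlab
% Illustrative data only (made up): D = [X Y], last column is the response.
D = [ 1  3.1;  2  4.9;  3  7.2;  4  8.8;  5 11.1; ...
      6 13.0;  7 14.9;  8 17.2;  9 19.0; 10 35.0];

X = D(:, 1:end-1);              % predictor column(s)
y = D(:, end);                  % dependent variable
n = size(D, 1);                 % number of observations
A = [ones(n, 1) X];             % design matrix with intercept column
p = size(A, 2);                 % number of coefficients, intercept included

b  = A \ y;                     % least-squares estimates from the full fit
e  = y - A*b;                   % residuals
s2 = (e' * e) / (n - p);        % residual mean square

[Q, R] = qr(A, 0);              % economy-size QR; only Q is needed
h = sum(Q.^2, 2);               % leverages (diagonal of the hat matrix)

% Cook's distance for every observation, without refitting n times
Di = (e.^2 ./ (p * s2)) .* (h ./ (1 - h).^2);

% Median-of-F criterion; skipped when the Statistics Toolbox is missing
if exist('finv', 'file')
    cutoff = finv(0.5, p, n - p);      % 50th percentile of F(p, n-p)
    influential = find(Di > cutoff);   % indices of flagged observations
end
```

A call such as stem(Di) gives a quick index plot of the distances, in the spirit of the reviewer's suggestion further down the page.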

MATLAB release: MATLAB 6.5 (R13)
Comments and Ratings (1)
14 Oct 2005 urs (us) schwarz

nice addition to brett shoelson's great deleteoutliers snippet
although it is not (that) important, the programming is a bit awkward, eg,
- filling d in a loop? d=(1:n).';
- any(...)? just use the logical output; it is faster
- the plotting bit? could be done by stem
- the gtext bit is a bit annoying; it doesn't really give additional functionality. the value could (for instance) be part of the title
- it would be nice if the snippet returned some values, eg, the indices and the f-val
just a few thoughts
us

Updates
14 Oct 2005

An appropriate format for citing this file was added.

17 Oct 2005

File improved to address suggestions by Urs Schwarz.

18 Oct 2005

Text was improved.

14 Nov 2005

Text was improved.

Tag Activity for this File
Tag | Applied By | Date/Time
statistics | Antonio Trujillo-Ortiz | 22 Oct 2008 08:02:49
probability | Antonio Trujillo-Ortiz | 22 Oct 2008 08:02:49
cook | Antonio Trujillo-Ortiz | 22 Oct 2008 08:02:49
regression | Antonio Trujillo-Ortiz | 22 Oct 2008 08:02:49
influential | Antonio Trujillo-Ortiz | 22 Oct 2008 08:02:49
residual | Antonio Trujillo-Ortiz | 22 Oct 2008 08:02:49
distance | Antonio Trujillo-Ortiz | 22 Oct 2008 08:02:49
