4.8 | 9 ratings | 133 downloads (last 30 days) | File Size: 2.54 KB | File ID: #3961

deleteoutliers

by Brett Shoelson

 

15 Sep 2003 (Updated 08 Oct 2003)

Code covered by BSD License  

For input vector A, returns a vector B with outliers removed.

File Information
Description

% [B, IDX, OUTLIERS] = DELETEOUTLIERS(A, ALPHA, REP)
%
% For input vector A, returns a vector B with outliers (at the significance
% level alpha) removed. Also, optional output argument idx returns the
% indices in A of outlier values. Optional output argument outliers returns
% the outlying values in A.
%
% ALPHA is the significance level for determination of outliers. If not
% provided, alpha defaults to 0.05.
%
% REP is an optional argument that forces the replacement of removed
% elements with NaNs to preserve the length of A. (Thanks for the
% suggestion, Urs.)
%
% This is an iterative implementation of the Grubbs Test that tests one
% value at a time. In any given iteration, the tested value is either the
% highest value, or the lowest, and is the value that is furthest
% from the sample mean. Infinite elements are discarded if rep is 0, or
% replaced with NaNs if rep is 1 (thanks again, Urs).
%
% Appropriate application of the test requires that data can be reasonably
% approximated by a normal distribution. For reference, see:
% 1) "Procedures for Detecting Outlying Observations in Samples," by F.E.
% Grubbs; Technometrics, 11-1:1--21; Feb., 1969, and
% 2) _Outliers in Statistical Data_, by V. Barnett and
% T. Lewis; Wiley Series in Probability and Mathematical Statistics;
% John Wiley & Sons; Chichester, 1994.
% A good online discussion of the test is also given in NIST's Engineering
% Statistics Handbook:
% http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm
%
% ex:
% [B,idx,outliers] = deleteoutliers([1.1 1.3 0.9 1.2 -6.4 1.2 0.94 4.2 1.3 1.0 6.8 1.3 1.2], 0.05)
% returns:
% B = 1.1000 1.3000 0.9000 1.2000 1.2000 0.9400 1.3000 1.0000 1.3000 1.2000
% idx = 5 8 11
% outliers = -6.4000 4.2000 6.8000
%
% ex:
% B = deleteoutliers([1.1 1.3 0.9 1.2 -6.4 1.2 0.94 4.2 1.3 1.0 6.8 1.3 1.2 Inf 1.2 -Inf 1.1], 0.05, 1)
% returns:
% B = 1.1000 1.3000 0.9000 1.2000 NaN 1.2000 0.9400 NaN 1.3000 1.0000 NaN 1.3000 1.2000 NaN 1.2000 NaN 1.1000
%
% Written by Brett Shoelson, Ph.D.
% shoelson@helix.nih.gov
% 9/10/03
% Modified 9/23/03 to address suggestions by Urs Schwarz.
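
The iterative procedure described in the help text can be sketched roughly as follows. This is not the author's source: the function name grubbs_sketch is hypothetical, the critical value uses the standard two-sided Grubbs formula from the NIST handbook cited above, and dropping NaN and Inf values up front is a simplifying assumption. It relies on tinv from the Statistics Toolbox and omits the REP option.

```matlab
function [b, idx, outliers] = grubbs_sketch(a, alpha)
% GRUBBS_SKETCH  Illustrative iterative two-sided Grubbs test.
% Hypothetical helper, not the File Exchange source.
% Requires TINV (Statistics Toolbox).
if nargin < 2 || isempty(alpha)
    alpha = 0.05;                 % default significance level, as documented
end
a    = a(:).';                    % treat the input as a row vector
keep = isfinite(a);               % assumption: NaN/Inf are dropped up front
idx  = [];
while nnz(keep) > 2               % need n - 2 >= 1 degrees of freedom
    pos  = find(keep);
    vals = a(pos);
    n    = numel(vals);
    % Candidate: the single value furthest from the sample mean.
    [dev, k] = max(abs(vals - mean(vals)));
    s = std(vals);
    if s == 0
        break                     % all remaining values are identical
    end
    g = dev / s;                  % Grubbs statistic G = max|x - mean| / s
    % Two-sided critical value at level alpha for sample size n.
    t     = tinv(alpha / (2*n), n - 2);
    gcrit = (n - 1) / sqrt(n) * sqrt(t^2 / (n - 2 + t^2));
    if g <= gcrit
        break                     % most extreme value is not an outlier
    end
    idx(end+1)   = pos(k);        %#ok<AGROW> record index into A
    keep(pos(k)) = false;         % remove it and test the next candidate
end
idx      = sort(idx);
outliers = a(idx);
b        = a(keep);
end
```

Called on the first example vector above with alpha = 0.05, this sketch should flag the same three values (indices 5, 8 and 11), although it may differ from deleteoutliers in how it treats NaN and Inf inputs.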

Required Products: Statistics Toolbox
MATLAB release: MATLAB 6.5 (R13)
Comments and Ratings (9)
22 Sep 2003 Urs Schwarz (us)

very nice, brett. a few remarks re OUTPUT: there should be an option to replace outliers with nans (to keep i/o vecs the same length); re INPUT: the option <ul> shows up in the help but doesn't seem to have a meaning (yet?); re PROCESSING: 1) nans are cut away (why? we don't know what a nan is in any context), 2) +-infs, on the other hand, are not (?).

25 Nov 2003 Effendi Widjaja  
25 Oct 2004 Torsten Staab

Nice job!

07 Dec 2004 Vadim Moldavsky

Great

16 Sep 2005 James J. Cai  
09 Oct 2005 s b

very useful!!!!

16 May 2007 dali kaafar  
19 Jan 2009 Hanna Modin

Thank you for a nice implementation of Grubbs test! If I might suggest an improvement that would be to make the test work with other than vectors, e.g. to remove outliers from each row in a matrix separately

01 Oct 2009 Marcin

Very good. I compared your results with the one from:
http://www.graphpad.com/quickcalcs/Grubbs1.cfm
on my data and got the same results. Good work!

Updates
24 Sep 2003

Addresses comments of Urs Schwarz... now provides optional input argument for maintaining vector length; also now discards Infs.

08 Oct 2003

Modified to avoid errors caused by duplicate "maxvals." (Thanks to Valeri Makarov for modification suggestion.)

Tag Activity for this File
Tag Applied By Date/Time
statistics Brett Shoelson 22 Oct 2008 07:07:37
probability Brett Shoelson 22 Oct 2008 07:07:37
outlier Brett Shoelson 22 Oct 2008 07:07:37
grubbs Brett Shoelson 22 Oct 2008 07:07:37
test Brett Shoelson 22 Oct 2008 07:07:37
data Brett Shoelson 22 Oct 2008 07:07:37
conditioning Brett Shoelson 22 Oct 2008 07:07:37
 
