Code covered by the BSD License  

Highlights from
moutlier1

from moutlier1 by Antonio Trujillo-Ortiz
Detection of Outlier in Multivariate Samples Test.

moutlier1(X,alpha)
function moutlier1(X,alpha)
%MOUTLIER1 Detection of Outlier in Multivariate Samples Test.
% This test is based on the Wilks'method (1963) designed for detection of a single
% outlier from a normal multivariate sample and approaching the maximun squared 
% Mahalanobis distance to a F distribution function by the Yang and Lee (1987) 
% formulation. A significative squared Mahalanobis distance means an outlier.
% To test the outlier, this function calls to the ACR m-function.
%
% Syntax: function moutlier1(X,alpha) 
%      
% Inputs:
%         X - multivariate data matrix. 
%     alpha - significance level (default = 0.05). 
% Output:
%           - Table of outliers detected in a multivariate sample.
%
% Example: From the example of Rencher (2002, p. 79). Consisting of measurements 
% of the ramus bone length at four different ages (8, 8.5, 9 & 9.5 yr) on each of
% 20 boys. We are interesting to detect any multivariate outliers. We use a 
% significance of 0.1.
%
%                ---------------------------------------------------
%                                             Age
%                         ------------------------------------------
%                Subject     8          8.5          9          9.5   
%                ---------------------------------------------------
%                    1     47.8        48.8        49.0        49.7
%                    2     46.4        47.3        47.7        48.4
%                    3     46.3        46.8        47.8        48.5
%                    4     45.1        45.3        46.1        47.2
%                    5     47.6        48.5        48.9        49.3
%                    6     52.5        53.2        53.3        53.7
%                    7     51.2        53.0        54.3        54.4
%                    8     49.8        50.0        50.3        52.7
%                    9     48.1        50.8        52.3        54.4
%                   10     45.0        47.0        47.3        48.3
%                   11     51.2        51.4        51.6        51.9
%                   12     48.5        49.2        53.0        55.5
%                   13     52.1        52.8        53.7        55.0
%                   14     48.2        48.9        49.3        49.8
%                   15     49.6        50.4        51.2        51.8
%                   16     50.7        51.7        52.7        53.3
%                   17     47.2        47.7        48.4        49.5
%                   18     53.3        54.6        55.1        55.3
%                   19     46.2        47.5        48.1        48.4
%                   20     46.3        47.6        51.3        51.8
%                ---------------------------------------------------                                        
%
% Total data matrix must be:
%     X = [47.8 48.8 49 49.7;46.4 47.3 47.7 48.4;46.3 46.8 47.8 48.5;45.1 45.3 46.1 47.2;47.6 48.5 48.9 49.3;
%     52.5 53.2 53.3 53.7;51.2 53 54.3 54.4;49.8 50 50.3 52.7;48.1 50.8 52.3 54.4;45 47 47.3 48.3;
%     51.2 51.4 51.6 51.9;48.5 49.2 53 55.5;52.1 52.8 53.7 55;48.2 48.9 49.3 49.8;49.6 50.4 51.2 51.8;
%     50.7 51.7 52.7 53.3;47.2 47.7 48.4 49.5;53.3 54.6 55.1 55.3;46.2 47.5 48.1 48.4;46.3 47.6 51.3 51.8];
%
% Calling on Matlab the function: 
%                moutlier1(X,0.10)
%
% Answer is:
%
% Table of observation(s) resulting as multivariate outlier(s).
% ----------------------------------------------
%                             D2
% Observation              observed
% ----------------------------------------------
%       9                  11.0301
% ----------------------------------------------
% With a given significance level of: 0.10
% Critical value for the maximum squared Mahalanobis distance: 10.9645
% D2 = squared Mahalanobis distance.
%
% Created by A. Trujillo-Ortiz, R. Hernandez-Walls, A. Castro-Perez and K. Barba-Rojo
%            Facultad de Ciencias Marinas
%            Universidad Autonoma de Baja California
%            Apdo. Postal 453
%            Ensenada, Baja California
%            Mexico.
%            atrujo@uabc.mx
%
% Copyright. September 13, 2006.
%
%  To cite this file, this would be an appropriate format:
%  Trujillo-Ortiz, A., R. Hernandez-Walls, A. Castro-Perez and K. Barba-Rojo. (2006).
%    MOUTLIER1:Detection of Outlier in Multivariate Samples Test. A MATLAB file. [WWW document].
%    URL http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=12252
%
% References:
% Rencher, A. C. (2002), Methods of Multivariate Analysis. 2nd. ed.
%           New-Jersey:John Wiley & Sons. Chapter 13 (pp. 408-450).
% Wilks, S. S. (1963), Multivariate Statistical Outliers. Sankhya, 
%           Series A, 25: 407-426.
% Yang, S. S. and Lee, Y. (1987), Identification of a Multivariate
%           Outlier. Presented at the Annual  Meeting of the American
%           Statistical Association, San Francisco, August 1987.
%

if nargin < 2, 
   alpha = 0.05;  %(default)
end 

if nargin < 1, 
   error('Requires at least one input arguments.');
end

mX = mean(X); %Means vector from data matrix X.
[n,p] = size(X);
difT = [];

for j = 1:p;
   eval(['difT=[difT,(X(:,j)-mean(X(:,j)))];']); %squared Mahalanobis distances.
end

S = cov(X);
D2T = difT*inv(S)*difT'; 
[D2,cc] = sort(diag(D2T));  %Ascending squared Mahalanobis distances.

D2C = ACR(p,n,alpha);

idx = find(D2 >= D2C);
o = cc(idx);
io = D2(idx);

if isempty(o);
    disp(' ')
    fprintf('With a given significance level of: %.2f\n', alpha);
    disp('Non observation(s) resulting as multivariate outlier(s).');
else
    disp(' ')
    disp('Table of observation(s) resulting as multivariate outlier(s).')
    fprintf('----------------------------------------------\n');  
    disp('                            D2');
    disp('Observation              observed');
    fprintf('----------------------------------------------\n');  
    fprintf(' %6.0f               %10.4f\n',[o,io].');
    fprintf('----------------------------------------------\n');  
    fprintf('With a given significance level of: %.2f\n', alpha);
    fprintf('Critical value for the maximum squared Mahalanobis distance: %.4f\n', D2C);
    disp('D2 = squared Mahalanobis distance.');
end
 
return,

Contact us at files@mathworks.com