%Function TLim
%
%TLim calculates the confidence limit for Hotelling's T2 values of a
%principal components decomposition. Samples with a T2 value above this
%confidence limit are considered to be multivariate outliers.
%
%Syntax:
% limit = TLim (nrsamp, lv, confidence)
%
%Input parameters:
% nrsamp: the number of samples in the dataset
% lv: the number of latent variables used in the principal components
%  analysis (must be smaller than nrsamp)
% confidence: the confidence level (must be strictly between 0 and 1),
%  default: 0.95
%
%Output parameters:
% limit: the confidence limit for Hotelling's T2 values (NaN if lv is
%  equal to or higher than nrsamp)
%
%Example:
% limit = TLim (100, 6, 0.95); %confidence limit for a dataset of 100 samples and 6 PCs
%
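%A further sketch of how the limit might be used to flag outliers. It
%assumes a hypothetical vector tsq that holds one Hotelling's T2 value per
%sample; tsq is not computed by TLim and must come from your own PCA code:
% limit = TLim (numel (tsq), 6); %confidence defaults to 0.95
% outliers = find (tsq > limit); %indices of suspected multivariate outliers
%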
%Literature:
% 1) Mason RL, Chou YM and Young JC (2001). Applying Hotelling's T2
% statistic to batch processes. Journal of Quality Technology, 33(4),
% 466-479.
%
%This function uses the finv function from the Statistics Toolbox to
%calculate the value of the inverse cumulative distribution function of
%the F distribution. If finv is not available, it falls back to the ftest
%function from the PLS toolbox.
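%
%Written out, the limit computed by the code in this file is
% limit = lv * (nrsamp - 1) / (nrsamp - lv) * F
%where F is the value of the inverse cumulative F distribution at the
%given confidence level, with lv and nrsamp - lv degrees of freedom.
%For instance, nrsamp = 100 and lv = 6 give a scaling factor of
%6 * 99 / 94 (about 6.32) in front of F.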
%
%The Biodata toolbox for MATLAB: a spectral database system for storing and
%processing spectra
%(C) 2008, Kris De Gussem, Raman Spectroscopy Research Group, Department
%of analytical chemistry, Ghent University
%(C) 2009 Kris De Gussem
%
%This file is part of Biodata.
%
%Biodata is free software: you can redistribute it and/or modify
%it under the terms of the GNU General Public License as published by
%the Free Software Foundation, either version 3 of the License, or
%(at your option) any later version.
%
%Biodata is distributed in the hope that it will be useful,
%but WITHOUT ANY WARRANTY; without even the implied warranty of
%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
%GNU General Public License for more details.
%
%You should have received a copy of the GNU General Public License
%along with Biodata. If not, see <http://www.gnu.org/licenses/>.
%
%Copyright (c) 2008-2009, Kris De Gussem
%All rights reserved.
%
%Redistribution and use in source and binary forms, with or without
%modification, are permitted provided that the following conditions are
%met:
%
% * Redistributions of source code must retain the above copyright
% notice, this list of conditions and the following disclaimer.
% * Redistributions in binary form must reproduce the above copyright
% notice, this list of conditions and the following disclaimer in
% the documentation and/or other materials provided with the distribution
% * Neither the name of Raman Spectroscopy Research Group, Department of
% analytical chemistry, Ghent University nor the names
% of its contributors may be used to endorse or promote products derived
% from this software without specific prior written permission.
%
%THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
%AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
%IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
%ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
%LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
%CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
%SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
%INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
%CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
%ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
%POSSIBILITY OF SUCH DAMAGE.
function limit = TLim (nrsamp, lv, confidence)
switch nargin
    case 2
        confidence = 0.95; %default confidence level
    case 3
        %all arguments supplied, nothing to do
    otherwise
        error ('Biodata:msg', 'Wrong number of input arguments. See help TLim for more information.');
end

if ~((0 < confidence) && (confidence < 1))
    error ('Biodata:msg', 'Confidence level must be between 0 and 1.');
end

if lv >= nrsamp
    warning ('Biodata:msg', 'Confidence limit cannot be calculated if the number of latent variables is equal to or higher than the number of samples.');
    limit = NaN; %not a number
else
    try
        %use the statistics toolbox function
        fvalue = finv (confidence, lv, nrsamp-lv);
    catch
        %if we do not have the statistics toolbox, then try the
        %corresponding function in the PLS toolbox
        fvalue = ftest (1 - confidence, lv, nrsamp-lv);
    end
    %scale the F value to obtain the confidence limit
    limit = (lv * (nrsamp-1)) / (nrsamp-lv) * fvalue;
end