from boothomvart by Antonio Trujillo-Ortiz
Bootstrap Homogeneity of Variances Test T Analytical Approach.

boothomvart.m
function boothomvart(x,s,alpha)
%BOOTHOMVART Bootstrap Homogeneity of Variances Test T Analytical Approach.
%   The bootstrap is a way of estimating the variability of a statistic
%   from a single data set by resampling it independently and with equal
%   probabilities (Monte Carlo resampling). It allows the estimation of
%   measures where the underlying distribution is unknown or where sample
%   sizes are small. Its results are consistent with the statistical
%   properties of the corresponding analytical methods (Efron and
%   Tibshirani, 1993).
%
%   The name 'bootstrap' originates from the expression 'pulling yourself 
%   up by your own bootstraps' and refers to the basic idea of the 
%   bootstrap, sampling with replacement from the data. In this way a
%   large number of 'bootstrap samples' is generated, each of the same size
%   as the original data set. From each bootstrap sample the statistical 
%   parameter of interest is calculated (Wehrens and Van der Linden, 1997).
%  
%   Here we use the non-parametric bootstrap, which is simpler than the
%   parametric one: it does not use the structure of a model to construct
%   artificial data. Instead, the data are resampled directly, with
%   replacement.
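%
%   A minimal sketch of one such resample, for illustration only (the
%   names y, idx, yb and vb are hypothetical and are not used by this
%   function):
%
%      y   = [36;39;43;38;37];                 % one sample (Family 1 below)
%      idx = ceil(rand(numel(y),1)*numel(y));  % indices drawn with replacement
%      yb  = y(idx);                           % bootstrap sample, same size as y
%      vb  = var(yb);                          % statistic recomputed on the resample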
%
%   The homogeneity of variances test is a useful tool in many scientific 
%   applications. Boos and Brownie (2004) and Conover et al. (1981) give a
%   broad review. 
%
%   Cahoy (2010) proposed a variance-based statistic that leads to a
%   bootstrap test for heterogeneity of variances, for any distribution,
%   as a slight modification of the test of Alam and Cahoy (1999). This
%   procedure, which uses a generalized box-type acceptance region, is
%   shown to be more sensitive to slight deviations from the null
%   specifications. Cahoy (2010) remarks that the properties of the test
%   may change when more than four populations are involved and these
%   populations are not from a location-scale family and may have
%   different kurtosis, so experimenters should exercise caution when
%   using this method in practice. Within the boundaries of his study, he
%   generally recommends the test T under most conditions.
%
%   Following Cahoy (2010), this m-file implements an analytical procedure
%   based on the bootstrap method as an alternative to the homogeneity of
%   variances test.
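%
%   As a rough summary of what the code below computes (the notation is
%   ours and only mirrors the variable names in the code; it is not
%   necessarily the notation of the cited paper), for k samples with
%   sizes n(i) and variances v(i):
%
%      nu(i)  = log( v(i) / (v(1)*...*v(k))^(1/k) )
%      vln(i) = ( g2 - (n(i)-3)/n(i) ) / (n(i)-1)
%      lam(i) = sqrt( (1-2/k)*vln(i) + (1/k^2)*sum(vln) )
%      T(i)   = nu(i) / lam(i)
%
%   where g2 is the kurtosis of the pooled within-sample residuals. The
%   null hypothesis is rejected when any |T(i)| of the original data is at
%   least equal to the bootstrap critical value.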
%
%   BOOTHOMVART treats NaN values as missing values, and removes them.
%
%   Syntax: boothomvart(x,s,alpha)
%
%     Inputs:
%          x - n-by-2 data matrix (Col 1 = data; Col 2 = sample code, 1..k)
%          s - number of bootstrap simulations (resamplings)
%      alpha - significance level (default = 0.05)
%
%     Outputs:
%          - Summary statistics from the samples
%          - Decision on the null-hypothesis tested
%
%     --We would appreciate any suggestions to improve this m-code in order
%      to reduce the elapsed time. The example below, with 3000 resamplings,
%      took nearly 38 seconds.--
%
%   Example: we take the numerical example given by Dr. Burt B. Gerstman,
%   Department of Health Science, San Jose State University, on his web site
%   [http://www.sjsu.edu/faculty/gerstman/StatPrimer/anova-b.pdf], from a
%   study on skin pigmentation in four families from the same 'racial
%   group'. The dependent variable is a measure of skin pigmentation.
%   The data are:
%
%                  ---------------------------------------
%                                  Family
%                  ---------------------------------------
%                     1          2          3          4
%                  ---------------------------------------
%                    36         46         40         45
%                    39         47         50         53
%                    43         47         44         56
%                    38         47         48         52
%                    37         43         50         56
%                  ---------------------------------------
%
%   We wish to test whether the four families have the same variance,
%   using 3000 resamplings and a significance level of 0.05.
%   
%   Input data:
%
%   X = [36 1;39 1;43 1;38 1;37 1;46 2;47 2;47 2;47 2;43 2;40 3;50 3;
%   44 3;48 3;50 3;45 4;53 4;56 4;52 4;56 4];
%
%   Calling the function in MATLAB:
%                boothomvart(X,3000,0.05)
%
%   Answer is:
%
%   Summary statistics from the samples.
%   --------------------------------------------------
%    Sample       Size        Mean           Variance
%   --------------------------------------------------
%       1           5        38.6000           7.3000
%       2           5        46.0000           3.0000
%       3           5        46.4000          18.8000
%       4           5        52.4000          20.3000
%   --------------------------------------------------
% 
%   Critical value = 2.8233; 
%   No. of test statistic values at least equal to critical value = 0
%  
%   After 3000 resamplings and with a significance of 0.05
%   We accept the null hypothesis that the variances are homogeneous.
%
%   Created by A. Trujillo-Ortiz and R. Hernandez-Walls
%             Facultad de Ciencias Marinas
%             Universidad Autonoma de Baja California
%             Apdo. Postal 453
%             Ensenada, Baja California
%             Mexico.
%             atrujo@uabc.edu.mx
%
%   Copyright (C) August 15, 2011. 
%
%  ---We thank Dr. Dexter O. Cahoy (Program of Mathematics and Statistics,
%  College of Engineering and Science, Louisiana Tech University, Ruston, LA)
%  for providing us with a hard copy of the paper, which made this work
%  possible.---
%
%   To cite this file, this would be an appropriate format:
%   Trujillo-Ortiz, A. and R. Hernandez-Walls. (2011). boothomvart: 
%      Bootstrap Homogeneity of Variance Test T Analytical Approach.
%      [WWW document]. URL http://www.mathworks.com/matlabcentral/fileexchange/
%      32646-boothomvart
%
%   References:
%   Alam, K. and Cahoy, D. O. (1999), A test for equality of variances.
%              Journal of Mathematical Sciences, Philippines, 2(1):1-19.
%   Boos, D. and Brownie, C. (2004), Comparing variances and other measures
%              of dispersion. Stat. Sci., 19(4):571-578.
%   Cahoy, D. O. (2010), A bootstrap test for equality of variances. Comp.
%              Stat. and Data Analysis, 54(10):2306-2316.
%   Conover, W. J., Johnson, M. E. and Johnson, M. M. (1981), A comparative
%              study of tests for homogeneity of variances, with
%              applications to the outer continental shelf bidding data.
%              Technometrics, 23:351-361.
%   Efron, B. and Tibshirani, R. J. (1993), An Introduction to the Bootstrap.
%              Chapman and Hall: New York.
%   Wehrens, R. and Van der Linden, W. E. (1997), Bootstrapping Principal
%              Component Regression Models. Journal of Chemometrics,
%              11:157-171.
%

if nargin < 2
    error('boothomvart:TooFewInputs', ...
          'BOOTHOMVART requires at least two input arguments.');
end

if nargin < 3 || isempty(alpha)
    alpha = 0.05; %default
elseif numel(alpha) ~= 1 || alpha <= 0 || alpha >= 1
    error('boothomvart:BadAlpha','ALPHA must be a scalar between 0 and 1.');
end

X = x;

c = size(X,2);
if c ~= 2
    error('stats:boothomvart:BadData','X must have two columns.');
end

%Remove NaN values, if any
X = X(~any(isnan(X),2),:);

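k = max(X(:,2));   % number of samples (codes are assumed to be 1..k)

% Per-sample statistics of the original data
indice = X(:,2);
for i = 1:k
    Xe = indice == i;
    d(i).X = X(Xe,1);
    d(i).m = mean(d(i).X);       % sample mean
    d(i).vo = var(d(i).X);       % sample variance
    d(i).e = d(i).X - d(i).m;    % residuals about the sample mean
    d(i).v = var(d(i).e);        % variance of the residuals
    d(i).n = length(d(i).e);     % sample size
end
m=cat(1,d.m);vo=cat(1,d.vo);e=cat(1,d.e);v=cat(1,d.v);n=cat(1,d.n);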

disp(' ')
disp('Summary statistics from the samples.')
disp('--------------------------------------------------')
disp(' Sample       Size        Mean           Variance ')
disp('--------------------------------------------------')
for i = 1:k
   fprintf('   %d           %i        %7.4f          %7.4f\n',i,n(i),m(i),vo(i))
end
disp('--------------------------------------------------')
disp(' ')

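% Kurtosis of the pooled residuals (KURTOSIS requires the Statistics Toolbox)
k2 = kurtosis(e);

% Variance term for the log-variance of each sample
for i = 1:k
    d(i).vlnc = ((1/(d(i).n - 1)))*(k2-(d(i).n - 3)/d(i).n);
end
varlns2c=cat(1,d.vlnc);

% Test statistic of the original data: log-ratio of each variance to the
% geometric mean of all variances, divided by the scale term lamc
for i = 1:k
    d(i).nuc = (log(d(i).v/prod(v)^(1/(k))));
    d(i).lamc = sqrt((1-2/k)*d(i).vlnc + (1/(k^2)*sum(varlns2c)));
    d(i).tc = d(i).nuc/d(i).lamc;
end
nuc=cat(1,d.nuc);lamc=cat(1,d.lamc);tc=cat(1,d.tc);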

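ws = warning;                              % remember the caller's warning state
restoreWarn = onCleanup(@() warning(ws));  % restore it when the function exits
warning off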
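% Draw all s bootstrap resamples at once: for each sample, an n-by-s matrix
% of indices drawn with replacement (one column per resample). A single
% pass is enough; repeating it s times only discards the earlier draws.
for j = 1:k
    Xe = indice == j;
    d(j).X = X(Xe,1);
    d(j).n = length(d(j).X);
    d(j).id = ceil(rand(d(j).n,s)*d(j).n);
    d(j).bd = d(j).X(d(j).id);
end
bd=cat(1,d.bd);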

xx = sort(X(:,2));   % sample codes in the same grouped order as the rows of bd

% Test statistic for each bootstrap resample
T = zeros(k,s);   % one column of k statistics per resample
for i = 1:s
    X = [bd(:,i) xx];
    indice = X(:,2);
    for j = 1:k
        Xe = indice == j;
        d(j).X = X(Xe,1);
        d(j).e = d(j).X - mean(d(j).X);
        d(j).v = var(d(j).e);
        d(j).n = length(d(j).e);
    end
    e=cat(1,d.e);v=cat(1,d.v);n=cat(1,d.n);

    k2 = kurtosis(e);

    for j = 1:k
        d(j).vln = (1/(d(j).n - 1))*(k2-(d(j).n - 3)/d(j).n);
    end
    varlns2=cat(1,d.vln);

    for j = 1:k
        d(j).nu = log(d(j).v/prod(v)^(1/(k)));
        d(j).lam = sqrt((1-2/k)*d(j).vln + (1/(k^2)*sum(varlns2)));
        d(j).t = d(j).nu/d(j).lam;
    end
    T(:,i) = cat(1,d.t);
end

% Discard bootstrap replicates with any NaN or Inf statistic (e.g. from a
% resample in which one sample has zero variance)
T = reshape(T,k,s);
T(:,any(~isfinite(T),1)) = [];

T = T - repmat(mean(T,2),1,size(T,2));

nb = size(T,2);   % number of bootstrap replicates kept

if k == 2
    S = sort(abs(T(1,:)),'descend');
    cv = S(ceil(nb*alpha)+1);
else
    % Box-type region [-c, c]: scan the largest |T| values as candidates
    % for c and stop at the first one for which, in some sample, exactly
    % floor(nb*(1-alpha)) replicates fall inside the interval.
    S = sort(abs(T(:)),'descend');
    r = round(k*.3*nb);
    target = floor(nb*(1-alpha));
    absT = abs(T);
    v = r;   % fall back to the last candidate if no count matches
    for i = 1:r
        if any(sum(absT <= S(i),2) == target)
            v = i;
            break
        end
    end
    cv = S(v);
end

ct = sum(abs(tc) >= cv);   % number of samples whose |T| reaches the critical value

fprintf('Critical value = %3.4f\n',cv); 
fprintf('No. of test statistic values at least equal to critical value = %g\n',ct);
disp('  ')
fprintf('After %g resamplings and with a significance of %3.2f\n',s,alpha);
if ct > 0
    fprintf('We reject the null hypothesis that the variances are homogeneous.\n');
else
    fprintf('We accept the null hypothesis that the variances are homogeneous.\n');
end
warning on

return,
