Code covered by the BSD License  

Highlights from
Statistical Learning Toolbox

from Statistical Learning Toolbox by Dahua Lin
Functions for statistical learning, pattern recognition and computer vision, covering many topics.

Description of slcovlarge
Home > sltoolbox > stat > slcovlarge.m

slcovlarge

PURPOSE ^

SLCOVLARGE Computes large covariance matrix using memory-efficient way

SYNOPSIS ^

function C = slcovlarge(X, w, vmean, cachesize)

DESCRIPTION ^

SLCOVLARGE Computes large covariance matrix using memory-efficient way

 $ Syntax $
   - C = slcovlarge(X, w, vmean, cachesize)

 $ Arguments $
   - X:            the sample matrix
   - w:            the weights of samples (default = [])
                   if it is specifed, it should be a 1 x n vector
   - vmean:        the pre-computed mean (default = [])
   - cachesize:    the size of working cache (in the unit of Mbytes)

 $ Description $
   - C = slcovlarge(X, w, vmean, cachesize) computes the large covariance 
     matrix within a memory-limited context. Compared to slcov, which is
     faster at the expense of using more memory, it is slower, however
     the memory used is under control. Thus it can effectively prevent
     from the situation of out of memory. 

 $ Remarks $
   - The memory increased in the function will not exceed 
     the size of the covariance matrix plus the cachesize.

   - For vmean
       - if it is empty, then a mean vector will be calculated
       - if it is zero, then the data is assumed to be centralized
       - if it is specified, then its size should be d x 1.

   - The memory estimated to be used for each section is
     if has no weight
           2 x sd x sn + 2 x (sd + sn). 
     else
           3 x sd x sn + 2 x (sd + sn).
     end
     Here sd and sn are the sub-dimension
     and sub-number of that section.

 $ History $
   - Created by Dahua Lin, on Aug 17, 2006

CROSS-REFERENCE INFORMATION ^

This function calls:
  • slmean SLMEAN Compute the mean vector of samples
  • raise_lackinput RAISE_LACKINPUT Raises an error indicating lack of input argument
  • slignorevars SLIGNOREVARS Ignores the input variables
  • slpartition SLPARTITION Partition a range into blocks in a specified manner
This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function C = slcovlarge(X, w, vmean, cachesize)
0002 %SLCOVLARGE Computes large covariance matrix using memory-efficient way
0003 %
0004 % $ Syntax $
0005 %   - C = slcovlarge(X, w, vmean, cachesize)
0006 %
0007 % $ Arguments $
0008 %   - X:            the sample matrix
0009 %   - w:            the weights of samples (default = [])
0010 %                   if it is specifed, it should be a 1 x n vector
0011 %   - vmean:        the pre-computed mean (default = [])
0012 %   - cachesize:    the size of working cache (in the unit of Mbytes)
0013 %
0014 % $ Description $
0015 %   - C = slcovlarge(X, w, vmean, cachesize) computes the large covariance
0016 %     matrix within a memory-limited context. Compared to slcov, which is
0017 %     faster at the expense of using more memory, it is slower, however
0018 %     the memory used is under control. Thus it can effectively prevent
0019 %     from the situation of out of memory.
0020 %
0021 % $ Remarks $
0022 %   - The memory increased in the function will not exceed
0023 %     the size of the covariance matrix plus the cachesize.
0024 %
0025 %   - For vmean
0026 %       - if it is empty, then a mean vector will be calculated
0027 %       - if it is zero, then the data is assumed to be centralized
0028 %       - if it is specified, then its size should be d x 1.
0029 %
0030 %   - The memory estimated to be used for each section is
0031 %     if has no weight
0032 %           2 x sd x sn + 2 x (sd + sn).
0033 %     else
0034 %           3 x sd x sn + 2 x (sd + sn).
0035 %     end
0036 %     Here sd and sn are the sub-dimension
0037 %     and sub-number of that section.
0038 %
0039 % $ History $
0040 %   - Created by Dahua Lin, on Aug 17, 2006
0041 %
0042 
0043 %% parse and verify input arguments
0044 
0045 if nargin < 4
0046     raise_lackinput('slcovlarge', 4);
0047 end
0048 
0049 if ndims(X) ~= 2 || ~isnumeric(X)
0050     error('sltoolbox:invalidarg', ...
0051         'X should be a 2D numeric matrix');
0052 end
0053 
0054 [d, n] = size(X);
0055 
0056 if ~isempty(w)
0057     if ~isequal(size(w), [1, n])
0058         error('sltoolbox:sizmismatch', ...
0059             'The w should be an 1 x n vector');
0060     end
0061 end
0062 
0063 % convert cache size
0064 cacheelems = floor(cachesize * 1e6 / 8);
0065 
0066 if isempty(vmean)
0067     if cacheelems < d
0068         error('sltoolbox:notenoughmem', ...
0069             'The cache size is not enough to hold even a mean vector');
0070     end    
0071     cacheelems = cacheelems - d;
0072     vmean = slmean(X, w);
0073 else
0074     if ~isequal(vmean, 0)
0075         if ~isequal(size(vmean), [d, 1])
0076             error('sltoolbox:sizmismatch', ...
0077                 'The vmean should be a d x 1 vector');
0078         end
0079     end
0080 end
0081 
0082 
0083 %% decide partition struct
0084 
0085 if cacheelems > 2*d*n + 2*(d+n)   % can compute without partitioning
0086     ps = [];            
0087 else     
0088     if isempty(w)
0089         cblks = 2;
0090     else
0091         cblks = 3;
0092     end
0093             
0094     sd = floor( (cacheelems - 2 * n) / (cblks * n + 2) );
0095     divide_row = false;
0096     if sd <= 0
0097         sd = 1;
0098         if cacheelems < cblks + 4
0099             error('sltoolbox:notenoughmem', ...
0100                 'The cache is not large enough');
0101         end
0102         sn = floor((cacheelems - 2) / (cblks + 2));
0103         divide_row = true;
0104     end
0105             
0106     ps = slpartition(d, 'maxblksize', sd);
0107     if divide_row
0108         rps = slpartition(d, 'maxblksize', sn);
0109     end        
0110 end
0111 
0112 
0113 %% compute
0114 
0115 if isempty(ps)
0116     
0117     if ~isempty(w);
0118         X = weight_x(X, w);
0119     end
0120     X = shift_x(X, vmean);
0121     Xt = X';
0122     C = X * Xt;
0123     
0124 else
0125     
0126     if ~divide_row      % each row is taken as integral
0127         
0128         C = zeros(d, d);        
0129         nsecs = length(ps.sinds);
0130         
0131         for i = 1 : nsecs
0132             for j = 1 : nsecs
0133                 si = ps.sinds(i);
0134                 ei = ps.einds(i);
0135                 sj = ps.sinds(j);
0136                 ej = ps.einds(j);
0137                                    
0138                 if isempty(w)
0139                     curXj = X(sj:ej, :);  
0140                     curXj = shift_x(curXj, vmean, sj, ej);
0141                     curXjt = curXj';
0142                     clear curXj;
0143                     
0144                     curXi = X(si:ei, :);    
0145                     curXi = shift_x(curXi, vmean, si, ei);
0146                 else
0147                     curXj = X(sj:ej, :);
0148                     curXj = shift_x(curXj, vmean, sj, ej);               
0149                     curXj = weight_x(curXj, w);
0150                     clear curwj;
0151                     curXjt = curXj';
0152                     clear curXj;
0153                     
0154                     curXi = X(si:ei, :);
0155                     curXi = shift_x(curXi, vmean, si, ei);
0156                     curXi = weight_x(curXi, w);
0157                     clear curwi;
0158                 end
0159                 
0160                 C(si:ei, sj:ej) = curXi * curXjt;
0161 
0162             end
0163         end
0164                                 
0165     else                % even each row need to be divided
0166         
0167         slignorevars(rps);
0168         
0169         error('sltoolbox:rterror', ...
0170             'In current implementation, row-division is not implemeted yet');        
0171     end    
0172     
0173 end
0174               
0175 
0176 %% scale down
0177 
0178 if isempty(w)
0179     C = C / n;
0180 else
0181     tw = sum(w);
0182     C = C / tw;
0183 end
0184 
0185 
0186 %% Core computing function
0187 
0188 function sx = weight_x(sx, w)
0189 
0190 sn = size(sx, 2);
0191 for i = 1 : sn
0192     sx(:, i) = sx(:, i) * sqrt(w(i));
0193 end
0194 
0195 function sy = shift_x(sx, vmean, sidx, eidx)
0196 
0197 [sd, sn] = size(sx);
0198 if ~isequal(vmean, 0)
0199     
0200     if nargin == 2
0201         curmean = vmean;
0202     else
0203         curmean = vmean(sidx:eidx);
0204     end    
0205     
0206     sy = zeros(sd, sn);
0207     for i = 1 : sn
0208         sy(:,i) = sx(:,i) - curmean;
0209     end
0210 else
0211     sy = sx;
0212 end
0213 
0214 
0215 
0216 
0217         
0218 
0219

Generated on Wed 20-Sep-2006 12:43:11 by m2html © 2003

Contact us at files@mathworks.com