Code covered by the BSD License  

Highlights from
Statistical Learning Toolbox

from Statistical Learning Toolbox by Dahua Lin
Functions for statistical learning, pattern recognition and computer vision, covering many topics.

Description of slvechist
Home > sltoolbox > discrete > slvechist.m

slvechist

PURPOSE ^

SLVECHIST Makes the histogram on prototype vectors by voting

SYNOPSIS ^

function H = slvechist(X0, X, varargin)

DESCRIPTION ^

SLVECHIST Makes the histogram on prototype vectors by voting

 $ Syntax $
   - H = slvechist(X0, X, ...)

 $ Arguments $
   - X0:       The sample matrix of the prototypes (be voted)
   - X:        The samples to vote (voters)
   - H:        The resultant histogram 

 $ Description $
   - H = slvechist(X0, X, ...) makes the histogram on vectors by counting
     how many of X belong to each of X0, or evaluating the confidences
     of the ownership. If there are m samples in X0, then H would be a
     m x 1 column vector. The process consists of two stages: first, it
     evaluates which sample belongs to which prototype and also the 
     confidences if necessary, this stage is called voting, then the 
     histogram would be built based on the voting results.
     You can specify the following properties to control the process:
       - 'countrule':      The rule of counting the votes (default = 'nm')
                           Please refer to slcountvote for a list of
                           available counting rules.
       - 'clsmethod':      The method of classifying the vectors to 
                           the prototypes. (default = 'pwcomp')
                           - 'pwcomp': computing the metrics between 
                             all samples and all prototypes pairwisely.
                             then select the most close ones
                           - 'kdtree': using KD-tree to select the most
                             close ones.
                           This property only takes effect when countrule
                           is 'nm' or 'nmx'.
       - 'nnparams'        The parameters to find the neighboring 
                           prototypes in the form {method, ...}. Please
                           refer to slfindnn for details.
                           This property only takes effect when countrule
                           is 'mmc', 'mms' or 'mmns'.
                           (default = [], means to use all prototypes as
                            contributive neighbors)
       - 'metric'          The type of distances to compare vectors.
                           It can be a string representing the name of
                           the metric type, or a cell array of
                           parameters given as {method, ...} for 
                           parameterized metric computation. Please refer
                           to slmetric_pw for a list of available methods
                           and the specification of the parameters.
                           You can also define your own distance by using 
                           a distance computing function handle here, it
                           is invoked using the syntax:
                               D = f(X1, X2) 
                           to compute pairwise distances. 
                           (default = 'eucdist')
       - 'cfunc'           The functor to evaluate likelihood (confidence)
                           values based on metric values.
                           This property only takes effect when countrule
                           uses confidences values. 
                               C = f(V)
                           when countrule is 'nmx', the input V is a 1 x n                          
                           row vector, otherwise the input V is a m x n 
                           matrix (or sparse matrix). The output should
                           be of the same size as V, and translate the
                           metric values to the confidence values in the
                           corresponding positions. 
                           Please note that the input to f contains all
                           metric values, actually you can use their 
                           integral attributes and relationship in the
                           computation of confidence values.
                           (default = [], for the countrule using 
                           confidences, it must be specified. If you 
                           would like to just use the metric values as
                           confidences just set cfunc to 'um')
       - 'evalfunctor'     The user-supplied functor to do voting. If
                           it is specified the function just invokes the
                           functor to do voting and ignore other options.
                           It would be invoked as
                               V = f(X0, X, ...)
                           (default = [])
       - 'weights':        The weights of samples. They will be multiplied
                           to the contributions of the samples. 
                           (default = [], if specified, it is 1 x n row)
       - 'normalized':     Whether to normalize the histogram so that the
                           sum of the votings to all bins are normalized
                           to 1. (default = false)
 
 $ Remarks $
   - The metric specification will not take effect for KD-tree (ANN) based
     methods, they can only use Euclidean distances.

   - When counting rule uses one-best-prototype strategy, such as 'nm'
     and 'nmx', the function uses clsmethod to classify samples to 
     the best prototypes, otherwise, it uses multi-best-prototype strategy
     the function uses nnparams to construct the neighborhood graph.
     When nnparams is [], all prototypes will be considered as
     neighborhood of all samples, then all pairwise relationship will be
     utilized.

 $ History $
   - Created by Dahua Lin on Sep 18th, 2006

CROSS-REFERENCE INFORMATION ^

This function calls:
  • annsearch ANNSEARCH Approximate Nearest Neighbor Search
  • slmetric_pw SLMETRIC_PW Compute the metric between column vectors pairwisely
  • slvote SLVOTE Builds histogram by voting (or fuzzy voting)
  • slnngraph SLNNGRAPH Constructs a nearest neighborhood based graph
  • raise_lackinput RAISE_LACKINPUT Raises an error indicating lack of input argument
  • slparseprops SLPARSEPROPS Parses input parameters
This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function H = slvechist(X0, X, varargin)
0002 %SLVECHIST Makes the histogram on prototype vectors by voting
0003 %
0004 % $ Syntax $
0005 %   - H = slvechist(X0, X, ...)
0006 %
0007 % $ Arguments $
0008 %   - X0:       The sample matrix of the prototypes (be voted)
0009 %   - X:        The samples to vote (voters)
0010 %   - H:        The resultant histogram
0011 %
0012 % $ Description $
0013 %   - H = slvechist(X0, X, ...) makes the histogram on vectors by counting
0014 %     how many of X belong to each of X0, or evaluating the confidences
0015 %     of the ownership. If there are m samples in X0, then H would be a
0016 %     m x 1 column vector. The process consists of two stages: first, it
0017 %     evaluates which sample belongs to which prototype and also the
0018 %     confidences if necessary, this stage is called voting, then the
0019 %     histogram would be built based on the voting results.
0020 %     You can specify the following properties to control the process:
0021 %       - 'countrule':      The rule of counting the votes (default = 'nm')
0022 %                           Please refer to slcountvote for a list of
0023 %                           available counting rules.
0024 %       - 'clsmethod':      The method of classifying the vectors to
0025 %                           the prototypes. (default = 'pwcomp')
0026 %                           - 'pwcomp': computing the metrics between
0027 %                             all samples and all prototypes pairwisely.
0028 %                             then select the most close ones
0029 %                           - 'kdtree': using KD-tree to select the most
0030 %                             close ones.
0031 %                           This property only takes effect when countrule
0032 %                           is 'nm' or 'nmx'.
0033 %       - 'nnparams'        The parameters to find the neighboring
0034 %                           prototypes in the form {method, ...}. Please
0035 %                           refer to slfindnn for details.
0036 %                           This property only takes effect when countrule
0037 %                           is 'mmc', 'mms' or 'mmns'.
0038 %                           (default = [], means to use all prototypes as
0039 %                            contributive neighbors)
0040 %       - 'metric'          The type of distances to compare vectors.
0041 %                           It can be a string representing the name of
0042 %                           the metric type, or a cell array of
0043 %                           parameters given as {method, ...} for
0044 %                           parameterized metric computation. Please refer
0045 %                           to slmetric_pw for a list of available methods
0046 %                           and the specification of the parameters.
0047 %                           You can also define your own distance by using
0048 %                           a distance computing function handle here, it
0049 %                           is invoked using the syntax:
0050 %                               D = f(X1, X2)
0051 %                           to compute pairwise distances.
0052 %                           (default = 'eucdist')
0053 %       - 'cfunc'           The functor to evaluate likelihood (confidence)
0054 %                           values based on metric values.
0055 %                           This property only takes effect when countrule
0056 %                           uses confidences values.
0057 %                               C = f(V)
0058 %                           when countrule is 'nmx', the input V is a 1 x n
0059 %                           row vector, otherwise the input V is a m x n
0060 %                           matrix (or sparse matrix). The output should
0061 %                           be of the same size as V, and translate the
0062 %                           metric values to the confidence values in the
0063 %                           corresponding positions.
0064 %                           Please note that the input to f contains all
0065 %                           metric values, actually you can use their
0066 %                           integral attributes and relationship in the
0067 %                           computation of confidence values.
0068 %                           (default = [], for the countrule using
0069 %                           confidences, it must be specified. If you
0070 %                           would like to just use the metric values as
0071 %                           confidences just set cfunc to 'um')
0072 %       - 'evalfunctor'     The user-supplied functor to do voting. If
0073 %                           it is specified the function just invokes the
0074 %                           functor to do voting and ignore other options.
0075 %                           It would be invoked as
0076 %                               V = f(X0, X, ...)
0077 %                           (default = [])
0078 %       - 'weights':        The weights of samples. They will be multiplied
0079 %                           to the contributions of the samples.
0080 %                           (default = [], if specified, it is 1 x n row)
0081 %       - 'normalized':     Whether to normalize the histogram so that the
0082 %                           sum of the votings to all bins are normalized
0083 %                           to 1. (default = false)
0084 %
0085 % $ Remarks $
0086 %   - The metric specification will not take effect for KD-tree (ANN) based
0087 %     methods, they can only use Euclidean distances.
0088 %
0089 %   - When counting rule uses one-best-prototype strategy, such as 'nm'
0090 %     and 'nmx', the function uses clsmethod to classify samples to
0091 %     the best prototypes, otherwise, it uses multi-best-prototype strategy
0092 %     the function uses nnparams to construct the neighborhood graph.
0093 %     When nnparams is [], all prototypes will be considered as
0094 %     neighborhood of all samples, then all pairwise relationship will be
0095 %     utilized.
0096 %
0097 % $ History $
0098 %   - Created by Dahua Lin on Sep 18th, 2006
0099 %
0100 
0101 %% parse and verify input arguments
0102 
0103 if nargin < 2
0104     raise_lackinput('slvechist', 2);
0105 end
0106 
0107 if ~isnumeric(X0) || ndims(X0) ~= 2
0108     error('sltoolbox:invalidarg', 'X0 should be a 2D numeric matrix');
0109 end
0110 if ~isnumeric(X) || ndims(X) ~= 2
0111     error('sltoolbox:invalidarg', 'X should be a 2D numeric matrix');
0112 end
0113 
0114 [d, m] = size(X0);
0115 [d2, n] = size(X);
0116 if d ~= d2
0117     error('sltoolbox:sizmismatch', 'The dimensions of X and X0 are mismatched');
0118 end
0119 
0120     
0121 opts.countrule = 'nm';
0122 opts.clsmethod = 'pwcomp';
0123 opts.nnparams = [];
0124 opts.metric = 'eucdist';
0125 opts.cfunc = [];
0126 opts.evalfunctor = [];
0127 opts.weights = [];
0128 opts.normalized = false;
0129 opts = slparseprops(opts, varargin{:});
0130 
0131 
0132 %% main skeleton
0133 
0134 % decide scheme
0135 % Vform:
0136 %   - 0:    1 x n or 2 x n matrix
0137 %   - 1:    m x n sparse graph
0138 %   - 2:    m x n full
0139 %
0140 
0141 if isempty(opts.evalfunctor)
0142 
0143     switch opts.countrule
0144         case {'nm', 'nmx'}
0145             Vform = 0;
0146             usecvalue = strcmp(opts.countrule, 'nmx');
0147             fhmetric = get_metricfunc(opts.metric);
0148             fvote = @(x, y) evaldist_one(x, y, usecvalue, opts.clsmethod, fhmetric);
0149 
0150         case {'mmc', 'mms', 'mmns'}
0151             usecvalue = ~strcmp(opts.countrule, 'mmc');
0152             fhmetric = get_metricfunc(opts.metric);
0153             if isempty(opts.nnparams)
0154                 Vform = 2;
0155                 fvote = @(x, y) evaldist_all(x, y, usecvalue, fhmetric);
0156             else
0157                 Vform = 1;
0158                 fvote = @(x, y) evaldist_multi(x, y, usecvalue, opts.nnparams, fhmetric);
0159             end
0160 
0161         otherwise
0162             error('sltoolbox:invalidarg', ...
0163                 'Unknown counting rule: %s', opts.countrule);
0164     end    
0165 
0166     if usecvalue
0167         if isempty(opts.cfunc)
0168             error('sltoolbox:invalidarg', ...
0169                 'In current rule, the confidence function is required.');
0170         end
0171         if ischar(opts.cfunc) && strcmp(opts.cfunc, 'um')
0172             cvf = [];
0173         elseif isa(opts.cfunc, 'function_handle')
0174             cvf = opts.cfunc;
0175         else
0176             error('sltoolbox:invalidarg', 'Illegal form of cfunc');
0177         end
0178     else
0179         cvf = [];
0180     end
0181 
0182     evalfunctor = {@do_vote, fvote, cvf, Vform};
0183     
0184 else
0185     evalfunctor = opts.evalfunctor;
0186 end
0187         
0188 % Do voting
0189 H = slvote(X0, m, X, n, evalfunctor, opts.countrule, ...
0190     'weights', opts.weights, ...
0191     'normalized', opts.normalized);
0192 
0193 
0194 %% Core functions
0195 
0196 function V = do_vote(X0, X, fvote, cvf, Vform)
0197 
0198 V = fvote(X0, X);
0199 if ~isempty(cvf)
0200     if Vform == 0
0201         V(2,:) = cvf(V(2,:));
0202     else
0203         V = cvf(V);
0204     end
0205 end
0206 
0207 
0208 function V = evaldist_one(X0, X, usecvalue, clsmethod, fhmetric)
0209 
0210 switch clsmethod
0211     case 'pwcomp'
0212         D = fhmetric(X0, X);
0213         [vals, si] = min(D, [], 1);
0214     case 'kdtree'
0215         [si, vals] = annsearch(X0, X, 1);
0216     otherwise
0217         error('sltoolbox:invalidarg', ...
0218             'Invalid clsmethod: %s', clsmethod);
0219 end
0220 
0221 if ~usecvalue
0222     V = si;    
0223 else
0224     V = [si; vals];
0225 end
0226 
0227 
0228 function V = evaldist_multi(X0, X, usecvalue, nnparams, fhmetric)
0229        
0230 switch nnparams{1}
0231     case {'knn', 'eps'}
0232         use_nnparams = [nnparams, {'metric', fhmetric}];
0233     case 'ann'
0234         use_nnparams = nnparams;
0235     otherwise
0236         error('sltoolbox:invalidarg', ...
0237             'Invalid neighborhood finding method: %s', nnparams{1});
0238 end
0239 
0240 if usecvalue
0241     valtype = 'numeric';
0242 else
0243     valtype = 'logical';
0244 end
0245 
0246 V = slnngraph(X0, X, use_nnparams, 'sparse', true, 'valtype', valtype);
0247 
0248 
0249 function V = evaldist_all(X0, X, usecvalue, fhmetric)
0250 
0251 V = fhmetric(X0, X);
0252 
0253 if ~usecvalue
0254     V = (V ~= 0);
0255 end
0256 
0257 
0258 
0259 %% Auxiliary functions
0260 
0261 function fh = get_metricfunc(m)
0262 
0263 if ischar(m)
0264     fh = @(X, Y) slmetric_pw(X, Y, m);
0265 elseif iscell(m)
0266     fh = @(X, Y) slmetric_pw(X, Y, m{:});
0267 elseif isa(m, 'function_handle')
0268     fh = m;
0269 else
0270     error('sltoolbox:invalidarg', 'The metric is specified incorrectly');
0271 end
0272 
0273 
0274 
0275 
0276

Generated on Wed 20-Sep-2006 12:43:11 by m2html © 2003

Contact us at files@mathworks.com