Information loss of the Mahalanobis distance in high dimensions: Matlab implementation
The Mahalanobis distance between a D-dimensional pattern measurement vector and the center of the class it belongs to follows a chi^2 distribution with D degrees of freedom when the class parameters are estimated from an infinite training set. With a finite training set, however, the distribution becomes either Fisher (F) or Beta, depending on whether cross-validation or resubstitution, respectively, is used for parameter estimation. The total variation between chi^2 and Fisher, as well as between chi^2 and Beta, measures the information loss in high dimensions. This information loss is then exploited to set a lower limit on the correct classification rate (CCR) achieved by the Bayes classifier used in feature subset selection.
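The finite-sample effect described above can be illustrated with a small Monte Carlo sketch. It is not part of the package: it assumes a standard normal class and the unbiased sample covariance, uses base MATLAB only, and works with the squared distance r*inv(S)*r', which is the quantity the chi^2 result applies to. All variable names are illustrative.

```matlab
% Monte Carlo sketch: squared Mahalanobis distance with estimated parameters.
% Assumptions (not from the package): Gaussian class, unbiased covariance.
rng(0);                          % fix the seed for repeatability
D = 5;  N = 20;  T = 2000;       % dimensionality, training size, trials
dResub = zeros(T, 1);
dCross = zeros(T, 1);
for t = 1:T
    X    = randn(N, D);          % training set drawn from the class
    xNew = randn(1, D);          % independent pattern from the same class
    m    = mean(X, 1);           % estimated class center
    S    = cov(X);               % estimated covariance (divides by N-1)
    r    = X(1, :) - m;          % training pattern: resubstitution
    c    = xNew - m;             % held-out pattern: cross-validation
    dResub(t) = (r / S) * r';    % r * inv(S) * r'
    dCross(t) = (c / S) * c';
end
fprintf('chi^2 mean (infinite training set): %.2f\n', D);
fprintf('resubstitution mean:                %.2f\n', mean(dResub));
fprintf('held-out mean:                      %.2f\n', mean(dCross));
```

With these settings the resubstitution distances should average below D and the held-out distances above it, which is the gap that grows as D approaches N.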
Installation:
-------------
Place the five functions in the current folder or in a folder on the MATLAB path.
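A minimal sketch of adding the functions to the path, assuming they were unpacked into a folder named LowCCRLimit under the current folder (the location is hypothetical; adjust it to your setup):

```matlab
% Hypothetical location of the downloaded functions; adjust to your setup.
pkgDir = fullfile(pwd, 'LowCCRLimit');
addpath(pkgDir);
% Prints the full path of the main function if MATLAB can find it.
which LowCCRLimitInfLoss
```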
Usage:
------
LowCCRLimit = LowCCRLimitInfLoss(D, CCR, NDc, CClasses, ErrorEstMethod)
% D:              Dimensionality of the measurement vector (2,3,4,5,...)
% CCR:            The correct classification rate, in [1/CClasses, 1] (e.g. 0.8)
% NDc:            The number of training samples per class (must exceed D+1)
% CClasses:       The number of classes in your problem (2,3,4,...)
% ErrorEstMethod: 'Resub' for resubstitution
%                 'Cross' for cross-validation
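The constraints listed above can be checked before calling the function. The following is only a sketch with illustrative values; the checks mirror the ranges in the comments and are not taken from the package source.

```matlab
% Illustrative inputs; the checks follow the documented parameter ranges.
D = 5;  CCR = 0.75;  NDc = 100;  CClasses = 5;  ErrorEstMethod = 'Cross';

assert(D >= 2 && D == round(D), 'D must be an integer >= 2.');
assert(CClasses >= 2 && CClasses == round(CClasses), ...
    'CClasses must be an integer >= 2.');
assert(CCR >= 1/CClasses && CCR <= 1, 'CCR must lie in [1/CClasses, 1].');
assert(NDc > D + 1, 'NDc must exceed D+1.');
assert(any(strcmp(ErrorEstMethod, {'Resub', 'Cross'})), ...
    'ErrorEstMethod must be ''Resub'' or ''Cross''.');
```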
Example:
--------
LowCCRLimitInfLoss(5, 0.75, 100, 5, 'Cross')
ans = 0.7288
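To see how the limit changes with dimensionality, the call above can be repeated for several values of D. This sketch assumes the package is on the path and that the function returns a scalar; the chosen values of D are illustrative and all satisfy NDc > D+1.

```matlab
% Sweep the dimensionality while keeping the other inputs of the example.
CCR = 0.75;  NDc = 100;  CClasses = 5;
for D = [2 5 10 20]
    lim = LowCCRLimitInfLoss(D, CCR, NDc, CClasses, 'Cross');
    fprintf('D = %2d: lower CCR limit = %.4f\n', D, lim);
end
```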
References:
-----------
[1] Dimitrios Ververidis and Constantine Kotropoulos, "Information loss of the Mahalanobis distance in high dimensions: Application to feature selection,"
IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 12, pp. 2275-2281, 2009.
[2] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, "On the Lambert W function," Advances in Computational Mathematics, vol. 5, pp. 329-359, 1996.
Special thanks to Dr. Pascal Getreuer for implementing the lambertw2 function based on the Lambert W publication [2].
Cite As
Dimitrios Ververidis (2024). Information loss of the Mahalanobis distance in high dimensions: Matlab implementation (https://www.mathworks.com/matlabcentral/fileexchange/30522-information-loss-of-the-mahalanobis-distance-in-high-dimensions-matlab-implementation), MATLAB Central File Exchange. Retrieved .
Platform Compatibility
Windows, macOS, Linux
Categories
- AI, Data Science, and Statistics > Statistics and Machine Learning Toolbox > Dimensionality Reduction and Feature Extraction