Evaluate quality of a classification feature based on distance matrix
Hello,
I am currently trying to extract features from measurement data that has been preprocessed into normalized 1D histograms. I don't know in advance where the relevant features are (they are expected to lie within certain ranges of histogram indices), so I am "scanning" my data by applying custom distance metrics to data sets with known labels. Applying a metric yields a square matrix of pairwise distances, as shown below. The upper-left and lower-right quadrants always contain the pairwise distances within the two respective clusters; the other two quadrants contain the distances between the clusters. The indices belonging to each cluster are always known.
To quickly evaluate candidate features (there is a lot of data to scan), I came up with a measure that is meant to reflect the discriminative power of the observed feature:
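To make the quadrant layout concrete, here is a minimal sketch in plain Python. The function name `split_distances`, the toy matrix, and the index lists are all hypothetical and not taken from the actual data; the sketch also assumes the distance matrix is symmetric with a zero diagonal.

```python
def split_distances(D, idx_a, idx_b):
    """Collect within-cluster and between-cluster distances from a square matrix D.

    Assumes D is symmetric, so each within-cluster pair is taken only once.
    """
    within_a = [D[i][j] for k, i in enumerate(idx_a) for j in idx_a[k + 1:]]
    within_b = [D[i][j] for k, i in enumerate(idx_b) for j in idx_b[k + 1:]]
    between = [D[i][j] for i in idx_a for j in idx_b]
    return within_a, within_b, between


# Toy 4x4 matrix: samples 0-1 form cluster A, samples 2-3 form cluster B.
D = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
within_a, within_b, between = split_distances(D, [0, 1], [2, 3])
```

Any single-value score would then be some function of these three lists of distances.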
- score=

Although this score works quite well in most cases and is easy and fast to compute, I wanted to check whether there is a better, more expressive measure that is computationally efficient, results in a single value, and is less sensitive to outliers in the data.
Unfortunately, the approaches I have found use raw data (instead of distances, which are definitely the input here) and/or require an interpretation of the result.
I hope y'all get what I am looking for and hope that my approach is not fundamentally stupid. If so, let me know:)
Thanks in advance!
5 Comments
Image Analyst
on 27 Nov 2019
Edited: Image Analyst
on 27 Nov 2019
What are the features? The histogram vectors themselves? What is D? Does it somehow depend on the histogram values directly, like the square root of the sum of the squared bin differences? What are the clusters? How are you getting clusters from a ton of image histograms?
Have you considered doing Principal Components Analysis and looking at which data fit the model and which have high Q residual scores or high Hotelling's T-squared scores?
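For reference, the "square root of the sum of the squared bin differences" mentioned above is the ordinary Euclidean distance between two histograms. A minimal sketch follows; the function name and the optional `start`/`stop` bin window are hypothetical and only illustrate restricting the comparison to a range of histogram indices. It is not the custom metric used in the question, which has not been shown.

```python
import math


def euclidean_hist_distance(h1, h2, start=0, stop=None):
    """Euclidean distance between two histograms over bins [start, stop)."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    stop = len(h1) if stop is None else stop
    # Sum the squared bin differences inside the window, then take the root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1[start:stop], h2[start:stop])))


# Two toy normalized histograms with three bins each.
h1 = [0.5, 0.3, 0.2]
h2 = [0.2, 0.3, 0.5]
d_full = euclidean_hist_distance(h1, h2)          # all bins
d_window = euclidean_hist_distance(h1, h2, 1, 2)  # middle bin only
```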
Can you attach your data?
LMarcel
on 27 Nov 2019
Image Analyst
on 27 Nov 2019
Edited: Image Analyst
on 27 Nov 2019
Show how you compute D from two histogram vectors.
And please insert screenshots as PNG files, so we can see them right here, instead of .fig files.
LMarcel
on 28 Nov 2019
Answers (0)