One-dimensional Bhattacharyya distance between two independent data groups to measure class separability
bhattacharyyaDistance is a function used in code generated by Diagnostic Feature Designer.
Z = bhattacharyyaDistance(X,I) calculates the one-dimensional Bhattacharyya distances
between two independent subsets of X that are grouped according to the logical labels in
I. The Bhattacharyya distance provides a metric for ranking features
according to their ability to separate two classes of data, such as data from healthy and
faulty machines. The distance calculation assumes that the data in X
follows a Gaussian distribution.
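For orientation, the standard closed form of the Bhattacharyya distance between two univariate Gaussian classes is shown below. The symbols are introduced here for illustration only: \mu_0, \sigma_0^2 and \mu_1, \sigma_1^2 denote the mean and variance of one feature within the groups labeled 0 and 1. Whether bhattacharyyaDistance uses exactly this form and variance normalization internally is an assumption.

```latex
D_B = \frac{1}{4}\,\frac{(\mu_0 - \mu_1)^2}{\sigma_0^2 + \sigma_1^2}
    + \frac{1}{2}\,\ln\!\left(\frac{\sigma_0^2 + \sigma_1^2}{2\,\sigma_0\,\sigma_1}\right)
```

The first term grows as the class means move apart; the second term grows as the class variances differ. A larger distance therefore indicates a feature that separates the two classes more clearly.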
Code that is generated by Diagnostic Feature Designer uses
bhattacharyyaDistance when ranking features with this method.
X — Data samples to group
vector | matrix
Data set containing data samples that can be logically classified into two groups, specified as a vector when you have a single set of samples, such as values for one feature, and a matrix when you have multiple sets of samples.
If X contains a single set of n features, such as a set of multiple features extracted from a single data source,
X is a 1-by-n vector.
If X contains m sets of n features,
X is an m-by-n matrix. Each row in
X represents one data source and must correspond to a single logical class.
X must contain at least two rows that correspond to
the logical class label 0 in I and at least two rows that
correspond to the label 1 to calculate valid Bhattacharyya
distances.
For example, suppose that you have a set of five features for each of 20 gearboxes
and you are computing the Bhattacharyya distances to assess these features.
X is a 20-by-5 matrix. Each row represents a gearbox that is
either healthy or faulty, as indicated by the associated logical class label of
0 or 1. At least two gearboxes must be healthy
and at least two gearboxes must be faulty. The Bhattacharyya distance indicates how well
each feature separates the data for the healthy gearboxes from the data for the faulty
gearboxes.
I — Logical classification labels
Logical classification labels that assign the rows in X to one
of two logical classes, specified as a vector of length m, where
m is the number of rows in X.
For example, suppose once more that
X is a 20-by-5 matrix
corresponding to 20 gearboxes. The first 9 gearboxes are healthy. The remaining 11
gearboxes are faulty. Define the healthy state as 0 and the faulty
state as 1. I has a length of 20. The first
9 labels in I are equal to 0 and the remaining
11 labels are equal to 1.
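The gearbox example can be sketched in MATLAB as follows. The feature values are random placeholders rather than real measurements, and the use of a logical column vector for I is an assumption about the accepted label format. The call is guarded so that the script still runs when bhattacharyyaDistance is not on the path.

```matlab
% Hypothetical data: 20 gearboxes (rows), 5 features (columns).
rng(0);                          % fix the seed so the example is repeatable
X = randn(20,5);                 % placeholder feature values, not real data
X(10:20,:) = X(10:20,:) + 1;     % shift the faulty rows so the classes differ

% Labels: first 9 gearboxes healthy (0), remaining 11 faulty (1).
I = [false(9,1); true(11,1)];

% Call the function only if it is available on the MATLAB path.
if exist('bhattacharyyaDistance','file')
    Z = bhattacharyyaDistance(X,I);   % one distance per feature (column)
end
```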
Z — Bhattacharyya distances
scalar | vector
Bhattacharyya distances between labeled groups, returned as a scalar or a vector of length n.
If X is a vector, then
Z is a scalar.
If X is a matrix, then
bhattacharyyaDistance calculates the distance separately for each feature.
Z is then a vector of length n, where n is the number of columns in
X.
bhattacharyyaDistance treats NaN entries in
X as missing values and ignores them.
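To illustrate the per-column behavior and the handling of NaN entries, here is a minimal sketch that evaluates the textbook Gaussian form of the distance by hand. The data values and the name Zsketch are hypothetical, and the formula and variance normalization are assumptions; this is not necessarily the computation that bhattacharyyaDistance performs internally.

```matlab
% Hypothetical data: 6 data sources (rows), 2 features (columns), one NaN.
X = [1.0 10; 1.2 NaN; 0.8 12; 3.0 11; 3.3 13; 2.9 9];
I = logical([0; 0; 0; 1; 1; 1]);

% Per-column mean and variance for each class, skipping NaN entries.
mu0 = mean(X(~I,:), 1, 'omitnan');
mu1 = mean(X(I,:),  1, 'omitnan');
v0  = var(X(~I,:), 0, 1, 'omitnan');
v1  = var(X(I,:),  0, 1, 'omitnan');

% Univariate Gaussian Bhattacharyya distance, evaluated for each column:
% (1/4)*(mu0-mu1)^2/(v0+v1) + (1/2)*log((v0+v1)/(2*sqrt(v0*v1)))
Zsketch = 0.25*(mu0 - mu1).^2 ./ (v0 + v1) ...
        + 0.5*log((v0 + v1) ./ (2*sqrt(v0.*v1)));
```

Zsketch is a 1-by-2 row vector with one value per feature. In this made-up data the first feature has well-separated class means, so its value is larger than that of the second feature, whose class means coincide.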
 Theodoridis, Sergios, and Konstantinos Koutroumbas. Pattern Recognition, 177–179. 2nd ed. Amsterdam; Boston: Academic Press, 2003.