Cluster analysis involves applying one or more clustering algorithms with the goal of finding hidden patterns or groupings in a dataset. Clustering algorithms form groupings or clusters in such a way that data within a cluster have a higher measure of similarity than data in any other cluster. The measure of similarity on which the clusters are modeled can be defined by Euclidean distance, probabilistic distance, or another metric.
Cluster analysis is an unsupervised learning method and an important task in exploratory data analysis. Popular clustering algorithms include:
The distinguishing feature of each of these algorithms is the metric to measure similarity.
Cluster analysis is used in bioinformatics for sequence analysis and genetic clustering; in data mining for sequence and pattern mining; in medical imaging for image segmentation; and in computer vision for object recognition.