Solving Clustering Problems
Clustering operations partition the examples in a dataset into k groups (clusters) according to a given similarity measure: two examples in the same cluster should be more similar to each other than two examples assigned to different clusters.
This type of problem is called an unsupervised learning problem. Its output is a collection of clusters, each characterized by an index, a central vector (centroid), and a dispersion value that measures the normalized average distance of the cluster members from the centroid.
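To make the index, centroid, and dispersion output concrete, here is a minimal pure-Python sketch of plain k-means (Lloyd's algorithm). It is not Rulex's implementation: the function names `kmeans` and `describe_clusters`, the choice of the first k points as initial centroids, and the use of Euclidean distance are all assumptions made for illustration. The dispersion here is the plain average distance from the centroid; the normalization Rulex applies is not specified in this page, so it is left out.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                mean = tuple(sum(x) / len(members) for x in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, assignment


def describe_clusters(points, centroids, assignment):
    """Summarize each cluster by index, centroid, and average distance."""
    clusters = []
    for index, centroid in enumerate(centroids):
        members = [p for p, a in zip(points, assignment) if a == index]
        # Unnormalized dispersion: mean Euclidean distance to the centroid.
        dispersion = (
            sum(math.dist(p, centroid) for p in members) / len(members)
            if members else 0.0
        )
        clusters.append(
            {"index": index, "centroid": centroid, "dispersion": dispersion}
        )
    return clusters


# Two visually separated groups of 2-D points.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignment = kmeans(points, k=2)
clusters = describe_clusters(points, centroids, assignment)
for cluster in clusters:
    print(cluster)
```

With this toy dataset, the points near (1, 1) end up in one cluster and the points near (9, 9) in the other. Real k-means implementations usually use a randomized or smarter initialization, so results on other data may depend on the seeding strategy.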
Rulex Clustering Tasks
Rulex provides the following tasks to solve clustering problems:
Task | Description | Corresponding page |
---|---|---|
Label Clustering (K-means) | Clusters data using the k-means approach, after having aggregated and filtered data according to a subset of label variables. | |
Projection Clustering (K-means) | Clusters data using the k-means approach, after having aggregated and filtered data according to a subset of label variables. Projection clustering ensures that the projection of the set of derived clusters onto the domain of each label variable itself forms a clustering on that domain. | |
Standard Clustering (K-means) | Clusters data using the k-means approach. | |