Solving Clustering Problems

Clustering operations partition the examples in a dataset into k groups (clusters) according to a given measure of similarity: two examples belonging to the same cluster should be more similar to each other than two examples assigned to different clusters.

This type of problem is called an unsupervised learning problem. Its output is a collection of clusters, each characterized by an index, a central vector (centroid), and a dispersion value measuring the normalized average distance of cluster members from the centroid.
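As an illustration of this kind of output, the following is a minimal Python sketch using NumPy. It is not Rulex code. It runs a plain k-means (Lloyd's algorithm) with Euclidean distance on synthetic data and reports an index, a centroid, and a dispersion value for each cluster. The function names `kmeans` and `summarize` are hypothetical, and the normalization of the dispersion (dividing by the average distance of all examples from the overall mean) is an assumption made for this example, since the exact normalization is not specified here.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means with Euclidean distance; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct examples chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each example to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment, consistent with the returned centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids


def summarize(points, labels, centroids):
    """Describe each cluster by index, centroid, size, and dispersion."""
    # Assumed normalization: average distance of all examples from the overall mean.
    overall = np.linalg.norm(points - points.mean(axis=0), axis=1).mean()
    summary = []
    for j, centroid in enumerate(centroids):
        members = points[labels == j]
        mean_dist = np.linalg.norm(members - centroid, axis=1).mean() if len(members) else 0.0
        summary.append({
            "index": j,
            "centroid": centroid,
            "size": len(members),
            "dispersion": mean_dist / overall,
        })
    return summary


# Synthetic data: two well-separated groups of 50 two-dimensional examples.
data_rng = np.random.default_rng(42)
data = np.vstack([
    data_rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
    data_rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2)),
])

labels, centroids = kmeans(data, k=2)
clusters = summarize(data, labels, centroids)
for cluster in clusters:
    print(cluster["index"], cluster["centroid"].round(2), cluster["size"], round(cluster["dispersion"], 3))
```

With this normalization, a dispersion well below 1 indicates that cluster members lie much closer to their centroid than the examples lie, on average, to the center of the whole dataset.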


Rulex Clustering Tasks

Rulex provides the following tasks to solve clustering problems:

| Task | Description | Corresponding page |
|---|---|---|
| Label Clustering (K-means) | Clusters data using the k-means approach, after having aggregated and filtered data according to a subset of label variables. | Using Label Clustering to Cluster Data |
| Projection Clustering (K-means) | Clusters data using the k-means approach, after having aggregated and filtered data according to a subset of label variables. Projection clustering ensures that the projection of the derived clusters onto the domain of each label variable is itself a clustering of that domain. | Using Projection Clustering to Cluster Data |
| Standard Clustering (K-means) | Clusters data using the k-means approach. | Using Standard Clustering to Cluster Data |