mlr3cluster

Cluster analysis for mlr3

mlr3cluster is an extension package for cluster analysis within the mlr3 ecosystem. It is the successor to the clustering capabilities of mlr2.

Installation

Install the latest release from CRAN:

install.packages("mlr3cluster")

Install the development version from GitHub:

devtools::install_github("mlr-org/mlr3cluster")

Feature Overview

The current version of mlr3cluster contains the cluster learners and cluster measures listed in the sections below.

The package is also integrated with mlr3viz, which lets you create visualizations of clustering results with a single line of code.
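
As a rough illustration of the mlr3viz integration, the sketch below trains a k-means learner and plots the resulting cluster assignments. It assumes that mlr3viz is installed and provides an `autoplot()` method for clustering predictions that accepts the prediction and the task; the default plot type may need additional plotting packages (for example GGally), which is an assumption about your setup rather than something stated in this README.

```r
library(mlr3)
library(mlr3cluster)
library(mlr3viz)

# Train a k-means learner on the built-in USArrests task
task = tsk("usarrests")
learner = lrn("clust.kmeans")
learner$train(task)
prediction = learner$predict(task)

# The single plotting call: features plotted against each other,
# colored by predicted cluster (assumes the default plot type is available)
autoplot(prediction, task)
```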

Cluster Analysis

Cluster Learners

| ID                 | Learner                                   | Package     |
| ------------------ | ----------------------------------------- | ----------- |
| clust.agnes        | Agglomerative Hierarchical Clustering     | cluster     |
| clust.ap           | Affinity Propagation Clustering           | apcluster   |
| clust.cmeans       | Fuzzy C-Means Clustering                  | e1071       |
| clust.cobweb       | Cobweb Clustering Algorithm               | RWeka       |
| clust.dbscan       | Density-based Clustering                  | dbscan      |
| clust.diana        | Divisive Hierarchical Clustering          | cluster     |
| clust.em           | Expectation-Maximization Clustering       | RWeka       |
| clust.fanny        | Fuzzy Clustering                          | cluster     |
| clust.featureless  | Simple Featureless Clustering             | mlr3cluster |
| clust.ff           | FarthestFirst Clustering Algorithm        | RWeka       |
| clust.hclust       | Agglomerative Hierarchical Clustering     | stats       |
| clust.kkmeans      | Kernel K-Means Clustering                 | kernlab     |
| clust.kmeans       | K-Means Clustering                        | stats       |
| clust.MBatchKMeans | Mini Batch K-Means Clustering             | ClusterR    |
| clust.meanshift    | Mean Shift Clustering                     | LPCM        |
| clust.pam          | Clustering Around Medoids                 | cluster     |
| clust.SimpleKMeans | K-Means Clustering (WEKA)                 | RWeka       |
| clust.xmeans       | K-Means with Automatic Determination of k | RWeka       |
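
The IDs in the table are the keys used to look up learners in the mlr3 learner dictionary. The following sketch lists the registered clustering learners and constructs one with a hyperparameter set at construction time. The choice of `centers = 3` is purely illustrative; note that learners backed by other packages (for example RWeka) only work once those packages are installed.

```r
library(mlr3)
library(mlr3cluster)

# List the IDs of all learners whose key starts with "clust."
ids = mlr_learners$keys("^clust\\.")
print(ids)

# Construct a learner by ID; centers = 3 is an illustrative value
learner = lrn("clust.kmeans", centers = 3)

# Inspect the hyperparameter values that are currently set
print(learner$param_set$values)
```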

Cluster Measures

| ID               | Measure                              | Package     |
| ---------------- | ------------------------------------ | ----------- |
| clust.db         | Davies-Bouldin Cluster Separation    | clusterCrit |
| clust.dunn       | Dunn Index                           | clusterCrit |
| clust.ch         | Calinski-Harabasz Pseudo F-Statistic | clusterCrit |
| clust.silhouette | Rousseeuw’s Silhouette Quality Index | clusterCrit |
| clust.wss        | Within Sum of Squares                | clusterCrit |
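
A measure from the table can be used to score a clustering prediction. The sketch below is a minimal example, assuming the package that backs the measure is installed. Because these are internal quality indices computed from the data, the task is passed to `$score()` alongside the measure. The value `centers = 3` is again an arbitrary illustrative choice.

```r
library(mlr3)
library(mlr3cluster)

task = tsk("usarrests")
learner = lrn("clust.kmeans", centers = 3)
learner$train(task)
prediction = learner$predict(task)

# Look up the measure by its ID from the table
measure = msr("clust.wss")

# Clustering measures are computed from the features, so the task is required
score = prediction$score(measure, task = task)
print(score)
```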

Example

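library(mlr3)
library(mlr3cluster)

# Retrieve the built-in USArrests task and a k-means learner from the dictionaries
task = mlr_tasks$get("usarrests")
learner = mlr_learners$get("clust.kmeans")

# Fit the clustering, then assign each observation of the task to a cluster
learner$train(task)
preds = learner$predict(task = task)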
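
To show what the example produces, the following sketch repeats the same steps and then inspects the result. It assumes the prediction exposes the cluster assignments in a `partition` field and that, for `clust.kmeans`, the fitted model is the object returned by `stats::kmeans()`.

```r
library(mlr3)
library(mlr3cluster)

task = mlr_tasks$get("usarrests")
learner = mlr_learners$get("clust.kmeans")
learner$train(task)
preds = learner$predict(task = task)

# Number of observations assigned to each cluster
print(table(preds$partition))

# Row IDs together with their cluster assignment, as a table
print(data.table::as.data.table(preds))

# Cluster centers of the underlying stats::kmeans fit
print(learner$model$centers)
```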

More Resources

Check out the blog post for a more detailed introduction to the package. The mlr3book also has a section on clustering.

Future Plans

If you have any questions, feedback, or ideas, feel free to open an issue in the mlr-org/mlr3cluster repository on GitHub.