Maintainer: Friedrich Leisch, Bettina Gruen
Contact: Bettina.Gruen at R-project.org
Version: 2022-08-06
URL: https://CRAN.R-project.org/view=Cluster
Source: https://github.com/cran-task-views/Cluster/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: Friedrich Leisch, Bettina Gruen (2022). CRAN Task View: Cluster Analysis & Finite Mixture Models. Version 2022-08-06. URL https://CRAN.R-project.org/view=Cluster.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("Cluster", coreOnly = TRUE) installs all the core packages, while ctv::update.views("Cluster") installs all packages that are not yet installed or not up-to-date. See the CRAN Task View Initiative for more details.
This CRAN Task View contains a list of packages that can be used for finding groups in data and modeling unobserved cross-sectional heterogeneity. Many packages provide functionality for more than one of the topics listed below; the section headings are meant as quick starting points rather than as a definitive categorization. Except for packages stats and cluster (which essentially ship with base R and hence are part of every R installation), each package is listed only once.
Most of the packages listed in this view, but not all, are distributed under the GPL. Please have a look at the DESCRIPTION file of each package to check under which license it is distributed.
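As a minimal sketch of that license check from within R, the License field of an installed package's DESCRIPTION file can be read with packageDescription() from utils. The two package names below are chosen only because the text above says they come with a standard R installation; any installed package name should work the same way.

```r
# Read the License field from the DESCRIPTION file of installed packages.
# "stats" and "cluster" are illustrative choices, not a recommendation.
pkgs <- c("stats", "cluster")

licenses <- vapply(
  pkgs,
  # as.character() keeps the result a string even if a field is missing (NA)
  function(p) as.character(utils::packageDescription(p, fields = "License")),
  character(1)
)

print(licenses)
```

The result is a named character vector, one entry per package, so it can be filtered or tabulated when many packages from the view are installed.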
hclust() from package stats and agnes() from cluster are the primary functions for agglomerative hierarchical clustering; function diana() can be used for divisive hierarchical clustering. Faster alternatives to hclust() are provided by the packages fastcluster and flashClust.
dendrogram() from package stats and associated methods can be used for improved visualization of cluster dendrograms.
[...] plot() function, one can produce dendrograms that are prototype-labeled and are therefore easier to interpret.
kmeans() from package stats provides several algorithms for computing partitions with respect to Euclidean distance.
pam() from package cluster implements partitioning around medoids and can work with arbitrary distances. Function clara() is a wrapper to pam() for larger data sets. Silhouette plots and spanning ellipses can be used for visualization.
[...] kkmeans and spectral clustering by specc.
[...] hddc to fit a Gaussian mixture model to high-dimensional data where it is assumed that the data lives in a lower dimension than the original space.
[...] cluster.stats() in package fpc.
[...] bootFlexclust() using bootstrap methods.
[...] dissplot() for visualizing dissimilarity matrices using seriation and matrix shading. This also allows inspecting cluster quality by restricting objects belonging to the same cluster to be displayed in consecutive order.
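
To make the hierarchical and partitioning functions from stats and cluster more concrete, here is a minimal sketch that uses only those two packages. The built-in iris measurements, the choice of three clusters, average linkage, the number of clara() samples, and the seed are all illustrative assumptions and not recommendations from this task view.

```r
library(cluster)  # agnes(), diana(), pam(), clara(), silhouette()

x <- scale(iris[, 1:4])  # illustrative data: the four numeric iris columns
d <- dist(x)             # Euclidean distances between observations
k <- 3                   # assumed number of clusters

# Agglomerative hierarchical clustering, once from stats and once from cluster
hc <- hclust(d, method = "average")
ag <- agnes(d, method = "average")

# Divisive hierarchical clustering
di <- diana(d)

# Cut each tree into k groups; agnes/diana results are converted to hclust first
hc_groups <- cutree(hc, k = k)
ag_groups <- cutree(as.hclust(ag), k = k)
di_groups <- cutree(as.hclust(di), k = k)

# Dendrogram object, which has its own plot and manipulation methods
dend <- as.dendrogram(hc)

# Partitioning: k-means on the data, PAM on the distances, CLARA on the data
set.seed(1)
km <- kmeans(x, centers = k, nstart = 10)
pm <- pam(d, k = k)
cl <- clara(x, k = k, samples = 20)

# Silhouette information for the PAM partition
sil <- silhouette(pm)

print(table(hclust = hc_groups, kmeans = km$cluster))
print(summary(sil)$avg.width)

# Uncomment for the visualizations mentioned in the text:
# plot(dend)
# plot(sil)
```

Passing a dist object to pam() is what allows it to work with arbitrary distances, whereas kmeans() and clara() operate on the data matrix itself.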