C++ interface to fast hierarchical clustering algorithms
========================================================
This is a simplified C++ interface to fast implementations of hierarchical
clustering by Daniel Müllner. The original library with interfaces to R
and Python is described in:
Daniel Müllner: "fastcluster: Fast Hierarchical, Agglomerative Clustering
Routines for R and Python." Journal of Statistical Software 53 (2013),
no. 9, pp. 1–18, http://www.jstatsoft.org/v53/i09/
Usage of the library
--------------------
To use the library, the following source files are needed:

fastcluster_dm.cpp, fastcluster_R_dm.cpp
   original code by Daniel Müllner;
   these are included by fastcluster.cpp via #include, and therefore
   need not be compiled to object code

fastcluster.[h|cpp]
   simplified C++ interface;
   fastcluster.cpp is the only file that must be compiled
The library provides the clustering function *hclust_fast* for
creating the dendrogram information in an encoding as used by the
R function *hclust*. For a description of the parameters, see fastcluster.h.
Its parameter *method* can be one of
HCLUST_METHOD_SINGLE
   single link with the minimum spanning tree algorithm (Rohlf, 1973)
HCLUST_METHOD_COMPLETE
   complete link with the nearest-neighbor-chain algorithm (Murtagh, 1984)
HCLUST_METHOD_AVERAGE
   average link with the nearest-neighbor-chain algorithm (Murtagh, 1984)
HCLUST_METHOD_MEDIAN
   median link with the generic algorithm (Müllner, 2011)
For splitting the dendrogram into clusters, the two functions *cutree_k*
and *cutree_cdist* are provided.
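To illustrate how a dendrogram in this encoding can be split into a fixed number of clusters, here is a minimal sketch. It is not the library's implementation, and the name cut_into_k is hypothetical. It assumes the R *hclust* convention (a negative entry -j denotes the single observation j, a positive entry s denotes the cluster created in merge step s, both counted from one) and it assumes that the two columns of the merge matrix are stored one after the other, so that step i joins merge[i] and merge[(n-1)+i]. Check fastcluster.h for the actual layout before relying on this.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of the library: assign cluster labels
// 0..k-1 to n observations by replaying the first n-k merge steps.
// Assumed layout: step i joins merge[i] and merge[(n-1)+i].
std::vector<int> cut_into_k(int n, const std::vector<int>& merge, int k) {
  assert(n >= 2 && k >= 1 && k <= n);
  assert(static_cast<int>(merge.size()) == 2 * (n - 1));

  // parent[] is a union-find forest over the n observations
  std::vector<int> parent(n);
  for (int i = 0; i < n; i++) parent[i] = i;
  // rep[s] = one observation inside the cluster created in step s
  std::vector<int> rep(n - 1, -1);

  auto find = [&parent](int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  };
  // translate a merge entry into a 0-based observation index
  auto member = [&rep](int entry) {
    return entry < 0 ? -entry - 1 : rep[entry - 1];
  };

  // after n-k merges exactly k clusters remain
  for (int s = 0; s < n - k; s++) {
    int a = find(member(merge[s]));
    int b = find(member(merge[(n - 1) + s]));
    parent[b] = a;
    rep[s] = a;
  }

  // number the clusters in order of first appearance
  std::vector<int> labels(n, -1), id_of_root(n, -1);
  int next = 0;
  for (int i = 0; i < n; i++) {
    int r = find(i);
    if (id_of_root[r] < 0) id_of_root[r] = next++;
    labels[i] = id_of_root[r];
  }
  return labels;
}
```

Cutting at a distance threshold instead of a cluster count would follow the same pattern, replaying only those merge steps whose height lies below the threshold.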
Note that output parameters must be allocated beforehand, e.g.
    int* merge = new int[2*(npoints-1)];
For a complete usage example, see lines 135-142 of demo.cpp.
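As a rough sketch of the preparation step, the following shows one way to build the input distances and to size the output buffers. Only the size of *merge* is taken from the text above. The condensed distance layout (upper triangle without diagonal, row by row), the sizes of *height* and *labels*, the Euclidean distance, and the names Point, condensed_distances and Buffers are assumptions made for illustration. The authoritative description is in fastcluster.h.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };  // hypothetical input type

// Condensed distance matrix: upper triangle without the diagonal, stored
// row by row (d01 d02 ... d12 ...). Assumed to be the expected input layout.
std::vector<double> condensed_distances(const std::vector<Point>& pts) {
  const std::size_t n = pts.size();
  assert(n >= 2);
  std::vector<double> distmat;
  distmat.reserve(n * (n - 1) / 2);
  for (std::size_t i = 0; i < n; i++) {
    for (std::size_t j = i + 1; j < n; j++) {
      // Euclidean distance as a placeholder for any dissimilarity
      distmat.push_back(std::hypot(pts[i].x - pts[j].x, pts[i].y - pts[j].y));
    }
  }
  return distmat;
}

// Output buffers allocated beforehand. merge has 2*(n-1) entries as stated
// in the text; the sizes of height (n-1) and labels (n) are assumptions.
struct Buffers {
  std::vector<int> merge;
  std::vector<double> height;
  std::vector<int> labels;
  explicit Buffers(std::size_t n)
      : merge(2 * (n - 1)), height(n - 1), labels(n) {}
};

// With the library, the buffers would then be passed on roughly like this
// (assumed signature, see fastcluster.h):
//   hclust_fast(n, distmat.data(), HCLUST_METHOD_COMPLETE,
//               buf.merge.data(), buf.height.data());
```

Using std::vector instead of new[] avoids manual cleanup; the raw pointers needed by a C-style interface are available through data().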
Demonstration program
---------------------
A simple demo is implemented in demo.cpp, which can be compiled and run with
    make
    ./hclust-demo -m complete lines.csv
It creates two clusters of line segments such that the angles between
line segments of different clusters have a maximum (cosine) dissimilarity.
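To make the notion of a cosine dissimilarity between segments concrete, here is one possible definition as a sketch. It is an assumption for illustration and may differ from what demo.cpp actually computes: the type Segment and the function cosine_dissimilarity are hypothetical, and the absolute value is a choice made here so that the orientation of a segment does not matter.

```cpp
#include <cassert>
#include <cmath>

struct Segment { double x1, y1, x2, y2; };  // hypothetical type

// 1 - |cos(angle between a and b)|:
// 0 for parallel segments, 1 for perpendicular segments.
double cosine_dissimilarity(const Segment& a, const Segment& b) {
  const double ax = a.x2 - a.x1, ay = a.y2 - a.y1;
  const double bx = b.x2 - b.x1, by = b.y2 - b.y1;
  const double la = std::hypot(ax, ay), lb = std::hypot(bx, by);
  assert(la > 0.0 && lb > 0.0);  // degenerate segments have no direction
  return 1.0 - std::fabs(ax * bx + ay * by) / (la * lb);
}
```

Values computed this way for all pairs of segments could serve as the entries of the distance matrix that is passed to the clustering function.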
For visualizing the result, plotresult.r can be used as follows
(requires R <https://r-project.org> to be installed):
    ./hclust-demo -m complete lines.csv | Rscript plotresult.r
Authors & Copyright
-------------------
Daniel Müllner, 2011, <http://danifold.net>
Christoph Dalitz, 2018, <http://www.hsnr.de/ipattern/>
License
-------
This code is provided under a BSD-style license.
See the file LICENSE for details.