Agglomerative Hierarchical Clustering: An Introduction to Essentials. (1) Proximity Coefficients and Creation of a Vector-Distance Matrix and (2) Construction of the Hierarchical Tree and a Selection of Methods

Author: Refat

Keywords: proximity, metric space, vector space, (non-)Euclidean space, symmetric matrix, agglomeration, centroid, sum of squares, median.


This article belongs to a series on a particular type of cluster analysis, agglomerative hierarchical cluster analysis. The series has five main parts. The first part deals with proximity coefficients and the creation of a vector-distance matrix. The second part deals with the construction of the hierarchical tree and introduces a selection of clustering methods. The third deals with a variety of ways to transform data prior to agglomerative cluster analysis. The fourth deals with measures and methods of cluster validity. The fifth and final part deals with hypothesis generation. The present article covers the first and second parts only. It explains how agglomerative cluster analysis works by applying it to a data matrix step by step.
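
As a rough preview of the two steps covered here, the following minimal Python sketch builds a symmetric distance matrix from a small data matrix and then merges the closest clusters one pair at a time. The toy data, the function names, the choice of Euclidean distance, and the choice of single linkage as the merging rule are all assumptions made for illustration; they are not taken from the article, which introduces its own selection of coefficients and methods.

```python
import math

# Hypothetical toy data matrix: four objects (rows) measured on two variables (columns).
data = [[1.0, 1.0], [1.5, 1.0], [5.0, 5.0], [5.0, 6.0]]

def distance_matrix(rows):
    """Step 1: symmetric matrix of pairwise Euclidean distances between rows."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]

def single_linkage(rows):
    """Step 2: repeatedly merge the two closest clusters; return the merge history."""
    d = distance_matrix(rows)
    clusters = [[i] for i in range(len(rows))]  # every object starts as its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage (assumed here): the distance between two clusters
                # is the smallest distance between any pair of their members.
                dist = min(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merges.append((clusters[a], clusters[b], dist))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

merges = single_linkage(data)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

Each entry in the merge history records which two clusters were joined and at what distance, which is the information needed to draw the hierarchical tree. Other methods differ only in how the distance between two clusters is defined.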