R Data Analysis: Using Hierarchical Clustering to Identify "Symptom Clusters", a Hands-On Example
10.1016/j.hrtlng.2021.07.001.
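How hierarchical clustering works
Divisive: A divisive method begins with all patterns in a single cluster and performs splitting until a stopping criterion is met.
Agglomerative: An agglomerative approach begins with each observation in a distinct (singleton) cluster, and successively merges clusters together until a stopping criterion is satisfied.
The agglomerative procedure has three steps:
1. Compute the distance between every pair of clusters.
2. Merge the two clusters that are closest to each other.
3. Repeat steps 1 and 2 until all clusters have been merged into a single cluster.
Common ways of measuring the distance between two clusters (linkage methods): centroid linkage, single linkage, complete linkage, average linkage, and Ward's method.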
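The three steps above can be sketched as a naive loop. This is only an illustration, not how `hclust()` is implemented: the function name `naive_agglomerative` is hypothetical, single linkage (the minimum pairwise distance) is assumed as the cluster distance, and the first eight rows of `mtcars` are used purely as toy data.

```r
# Naive agglomerative clustering with single linkage (illustration only).
# Returns the distance at which each merge happens.
naive_agglomerative <- function(x) {
  d <- as.matrix(dist(x))                # pairwise Euclidean distances
  clusters <- as.list(seq_len(nrow(d)))  # start: each observation is its own cluster
  heights <- numeric(0)
  while (length(clusters) > 1) {
    best <- c(NA_integer_, NA_integer_)
    best_dist <- Inf
    # Step 1: distance between every pair of clusters
    for (i in seq_len(length(clusters) - 1)) {
      for (j in (i + 1):length(clusters)) {
        dij <- min(d[clusters[[i]], clusters[[j]]])  # single linkage
        if (dij < best_dist) {
          best_dist <- dij
          best <- c(i, j)
        }
      }
    }
    # Step 2: merge the closest pair (best[1] < best[2] always holds)
    clusters[[best[1]]] <- c(clusters[[best[1]]], clusters[[best[2]]])
    clusters[[best[2]]] <- NULL
    heights <- c(heights, best_dist)
    # Step 3: loop until a single cluster remains
  }
  heights
}

x <- scale(mtcars[1:8, c("mpg", "hp", "wt")])
heights <- naive_agglomerative(x)
```

The merge distances in `heights` should agree with `hclust(dist(x), method = "single")$height`, which is a convenient sanity check for the sketch.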
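In base R, these linkage methods correspond to the `method` argument of `hclust()`. The sketch below fits all five on the scaled `mtcars` data and compares the cluster sizes obtained when each tree is cut into three groups. Two choices here are assumptions rather than something stated in the article: `"ward.D2"` is taken as the Ward variant, and centroid linkage is given squared distances, as the `hclust()` documentation recommends.

```r
d <- dist(scale(mtcars))  # Euclidean distances on standardized variables

fits <- list(
  centroid = hclust(d^2, method = "centroid"),  # centroid linkage expects squared distances
  single   = hclust(d, method = "single"),
  complete = hclust(d, method = "complete"),
  average  = hclust(d, method = "average"),
  ward     = hclust(d, method = "ward.D2")      # Ward's criterion on unsquared distances
)

# Cluster sizes when each tree is cut into 3 groups (one column per method)
sizes <- sapply(fits, function(f) as.vector(table(cutree(f, k = 3))))
print(sizes)
```

Single linkage tends to chain observations into one large cluster, whereas Ward's method tends to give more balanced groups, so comparing the columns of `sizes` is a quick way to see how much the choice of linkage matters for a given data set.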
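Performing hierarchical clustering in R
# Built-in example: cluster the cars in mtcars (Euclidean distance, complete linkage by default)
hc <- hclust(dist(mtcars))
plot(hc)  # draw the dendrogram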
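library(dendextend)  # provides color_branches()
# Transpose so that the columns of the original data become rows; hclust() clusters rows
data2 <- t(data2)
mycluster <- hclust(dist(data2))
plot(mycluster)
# Convert to a dendrogram and color the branches obtained by cutting the tree at height 6
hc_dend_obj <- as.dendrogram(mycluster)
hc_col_dend <- color_branches(hc_dend_obj, h = 6)
# hang is not an argument of plot() for dendrograms; as.dendrogram() already defaults to hang = -1
plot(hc_col_dend)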
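Cutting a tree at a height `h` and asking for `k` clusters are two ways of expressing the same cut. The sketch below illustrates this with `cutree()` on the scaled `mtcars` data. The height of 6 simply mirrors the value used above and is not meaningful for this particular data set.

```r
hc_cars <- hclust(dist(scale(mtcars)))

# Cut at a fixed height: every merge above h = 6 is undone
by_height <- cutree(hc_cars, h = 6)
n_clusters <- length(unique(by_height))

# Asking for that many clusters directly gives the same partition
by_k <- cutree(hc_cars, k = n_clusters)
same_partition <- identical(by_height, by_k)

print(n_clusters)
print(table(by_height))
```

Choosing `k` is usually easier to justify in a report, while choosing `h` is handy when reading a threshold straight off the dendrogram.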
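library(dplyr)  # provides mutate()
# Standardize each column of data2 before computing distances
hc <- hclust(dist(scale(data2)))
# Cut the tree into 2 clusters: one label per row of data2
cut_avg <- cutree(hc, k = 2)
# Attach the labels; this requires data1 to have as many rows as cut_avg has labels
data_cl <- mutate(data1, cluster = cut_avg)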
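Because the article's own data is not shown, the following is a minimal sketch on simulated data of how symptoms (variables, not patients) can be grouped into symptom clusters. Everything in it is hypothetical: the four symptom names, the two latent factors, and the sample size are invented for illustration. Unlike the code above, it also standardizes each symptom before transposing.

```r
set.seed(1)
n <- 100  # hypothetical number of patients

# Two made-up latent factors, each driving two symptoms
physical <- rnorm(n)
mood <- rnorm(n)
symptoms <- data.frame(
  fatigue    = physical + rnorm(n, sd = 0.3),
  weakness   = physical + rnorm(n, sd = 0.3),
  anxiety    = mood + rnorm(n, sd = 0.3),
  depression = mood + rnorm(n, sd = 0.3)
)

# Standardize each symptom, then transpose so that symptoms are the rows being clustered
hc_sym <- hclust(dist(t(scale(symptoms))))
groups <- cutree(hc_sym, k = 2)

# List the symptoms that fall into each cluster
print(split(names(groups), groups))
```

Here `groups` holds one label per symptom, so it describes the symptoms themselves. To label patients instead, cluster the untransposed data so that the labels line up with the rows of the patient table.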
Summary