Article

A new procedure to optimize the selection of groups in a classification tree: Applications for ecological data

Journal

ECOLOGICAL MODELLING
Volume 220, Issue 4, Pages 451-461

Publisher

ELSEVIER
DOI: 10.1016/j.ecolmodel.2008.11.006

Keywords

Classification; Clustering; Dendrogram; Stopping rule; Outliers

Funding

  1. French Ministère de l'Éducation Nationale, de l'Enseignement Supérieur et de la Recherche
  2. EC [GOCE-036949]
  3. NERC [SAH01001] Funding Source: UKRI
  4. Natural Environment Research Council [SAH01001] Funding Source: researchfish

Agglomerative cluster analyses encompass many techniques, which have been widely used in various fields of science. In biology, and specifically ecology, datasets are generally highly variable and may contain outliers, which make it harder to identify the number of clusters. Here we present a new criterion to determine statistically the optimal level of partition in a classification tree. The robustness of the criterion is tested against perturbed data (outliers) using an observation or variable with randomly generated values. The technique, called Random Simulation Test (RST), is tested on (1) the well-known Iris dataset [Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Ann. Eugenic. 7, 179-188], (2) simulated data with predetermined numbers of clusters following Milligan and Cooper [Milligan, G.W., Cooper, M.C., 1985. An examination of procedures for determining the number of clusters in a data set. Psychometrika 50, 159-179], and finally (3) is applied to real copepod community data previously analyzed in Beaugrand et al. [Beaugrand, G., Ibanez, F., Lindley, J.A., Reid, P.C., 2002. Diversity of calanoid copepods in the North Atlantic and adjacent seas: species associations and biogeography. Mar. Ecol. Prog. Ser. 232, 179-195]. The technique is compared to several standard techniques. RST generally performed better than existing algorithms on simulated data and proved to be especially efficient with highly variable datasets. (C) 2008 Elsevier B.V. All rights reserved.
