Journal
AMERICAN JOURNAL OF BIOLOGICAL ANTHROPOLOGY
Volume 179, Issue 4, Pages 687-692
Publisher
WILEY
DOI: 10.1002/ajpa.24631
Keywords
data consistency; Howells' craniometric data; simotic chord; simotic subtense; SIS; WNB
Funding
- College of Public Health, the University of South Florida
Howells' craniometric data set, the largest publicly available craniometric data set on the internet, has been widely used in craniometric methods development. This study finds inconsistencies between its main and test data sets: simotic chord (WNB) and simotic subtense (SIS) are abnormally large in the test data set, most likely because of missing decimal points.
Howells' craniometric data set is the largest publicly available craniometric data set on the internet and has been widely used in craniometric methods development. The data set consists of a main data set of 2524 human crania from 28 populations and an additional test data set of 524 crania. Up to 82 measurements were recorded from each cranium. We examined the consistency between the main and test data sets to assess whether the two can be used in combination. We found that the two data sets can be clearly separated via Uniform Manifold Approximation and Projection, suggesting some inconsistency between them. To investigate the cause, we split the two data sets into six continental groups (African, Austro-Melanesian, East Asian, European, Native American, and Polynesian) and tested, within each group, whether the distributions differ between the two data sets. We found that simotic chord (WNB) and simotic subtense (SIS) are significantly and abnormally larger in the test data set than in the main data set. After removing these two measures, the two data sets are broadly comparable. We further show evidence that missing decimal points likely caused the abnormality.
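The missing-decimal-point explanation can be illustrated with a rough check: if a measurement lost its decimal point in one data set, its typical value there should be about ten times the value in the other. The sketch below is not the authors' procedure. The helper names, the factor of 10, the 25% tolerance, and all numeric values are assumptions made up for illustration; GOL is included only as a hypothetical unaffected measurement.

```python
from statistics import median


def scale_ratio(main_values, test_values):
    """Ratio of the test-set median to the main-set median for one measurement."""
    return median(test_values) / median(main_values)


def looks_like_missing_decimal(main_values, test_values, factor=10.0, tolerance=0.25):
    """Flag a measurement whose test-set median is roughly `factor` times the
    main-set median. Both `factor` and `tolerance` are assumed values."""
    ratio = scale_ratio(main_values, test_values)
    return abs(ratio - factor) / factor <= tolerance


# Made-up illustrative values, NOT taken from Howells' data.
main = {
    "WNB": [8.1, 9.4, 7.6, 10.2, 8.8],
    "SIS": [3.2, 4.1, 2.9, 3.7, 4.4],
    "GOL": [178, 182, 175, 190, 185],
}
test = {
    "WNB": [82, 95, 77, 101, 90],
    "SIS": [31, 42, 30, 36, 45],
    "GOL": [179, 181, 177, 188, 186],
}

# Collect the measurements whose scale differs by about a factor of ten.
flagged = [name for name in main if looks_like_missing_decimal(main[name], test[name])]
print(flagged)
```

With these made-up values only WNB and SIS are flagged, since their test-set medians sit near ten times the main-set medians while GOL is on the same scale in both. A median is used rather than a mean so that a few extreme crania do not drive the ratio.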