4.4 Article

Artificial intelligence powered statistical genetics in biobanks

Journal

JOURNAL OF HUMAN GENETICS
Volume 66, Issue 1, Pages 61-65

Publisher

SPRINGERNATURE
DOI: 10.1038/s10038-020-0822-y

Funding

  1. Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan

Biobanks offer the opportunity to study the relationships between genetic and environmental factors in common complex diseases, but they face two challenges: small sample sizes relative to high data dimensionality, and multi-layered, heterogeneous endophenotypes. Researchers are using statistical machine-learning and deep-learning technologies to tackle the complexities inherent in biobank data.
Large-scale, sometimes nationwide, prospective genomic cohorts that biobank rich biological specimens such as blood, urine, and tissues have been established in several countries and have released vast amounts of data. These genetic and epidemiological resources are expected to allow investigators to disentangle the genetic and environmental components underlying common complex diseases. There are, however, two major challenges for statistical genetics in pursuing this goal: small sample size relative to high dimensionality, and multilayered, heterogeneous endophenotypes. Rather counterintuitively, biobank data generally have a small sample size relative to their dimensionality, which comprises genomic variation, lifestyle questionnaire items, and sometimes their interactions. This is a widely acknowledged difficulty in data analysis, known as the p >> n problem in statistics or the curse of dimensionality in machine learning. On the other hand, there are a great many measurements of individual health status, that is, endophenotypes, such as health check-up data, images, and psychological test scores, in addition to metabolomics and proteomics data. These endophenotypes are rich but not easily tractable, because they further inflate dimensionality and are substantially correlated, sometimes with unclear causal relationships among them. We have tried to overcome these problems inherent in biobank data using statistical machine-learning and deep-learning technologies.
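
To make the p >> n difficulty concrete, the following is a minimal sketch on hypothetical toy data (not taken from the paper): three individuals measured on five features. Because the rank of the feature matrix cannot exceed the number of individuals, two different coefficient vectors can yield identical predictions, so an unregularized linear model is not identifiable. The matrix `X`, the helper names, and the coefficient vectors are all illustrative assumptions.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot at or below the current pivot row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def predict(X, beta):
    """Linear predictions X @ beta using plain Python lists."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# Hypothetical toy data: n = 3 individuals, p = 5 features
# (values chosen to resemble genotype dosages 0/1/2).
X = [
    [0, 1, 2, 1, 0],
    [1, 0, 1, 2, 1],
    [2, 1, 0, 0, 1],
]
n, p = len(X), len(X[0])
rank = matrix_rank(X)

# A direction in the null space of X (worked out by hand for this toy matrix):
# adding it to any coefficient vector leaves every prediction unchanged.
null_dir = [3, -2, 1, 0, -4]

beta_a = [1, 0, 0, 0, 0]
beta_b = [a + v for a, v in zip(beta_a, null_dir)]

print(f"n = {n}, p = {p}, rank(X) = {rank}")
print("same predictions:", predict(X, beta_a) == predict(X, beta_b))
```

In this toy case the data cannot distinguish `beta_a` from `beta_b`, which is why analyses in the p >> n regime typically rely on some form of regularization, dimensionality reduction, or prior structure.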
