Article

DISEASES 2.0: a weekly updated database of disease-gene associations from text mining and data integration

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/database/baac019

Funding

  1. US National Institutes of Health [U24 224 370]
  2. Novo Nordisk Foundation [NNF14CC0001]

The DISEASES database has been significantly updated to provide a more comprehensive overview of disease-gene associations, with a major increase in the number of associations from various sources. The increase in text-mined associations, primarily due to the inclusion of full-text articles in the corpus and improvements in entity recognition dictionaries, is particularly remarkable. Additionally, the integration of a new GWAS database has substantially contributed to the increase in GWAS-derived disease-gene associations.
The scientific knowledge about which genes are involved in which diseases grows rapidly, which makes it difficult to keep up with new publications and genetics datasets. The DISEASES database aims to provide a comprehensive overview by systematically integrating and assigning confidence scores to evidence for disease-gene associations from curated databases, genome-wide association studies (GWAS) and automatic text mining of the biomedical literature. Here, we present a major update to this resource, which greatly increases the number of associations from all these sources. This is especially true for the text-mined associations, which have increased at least 9-fold at all confidence cutoffs. We show that this dramatic increase is primarily due to adding full-text articles to the text corpus, secondarily due to improvements to both the disease and gene dictionaries used for named entity recognition, and only to a very small extent due to the growth in the number of PubMed abstracts. DISEASES now also makes use of a new GWAS database, Target Illumination by GWAS Analytics, which considerably increased the number of GWAS-derived disease-gene associations. DISEASES itself is also integrated into several other databases and resources, including GeneCards/MalaCards, Pharos/Target Central Resource Database and the Cytoscape stringApp. All data in DISEASES are updated on a weekly basis and are available via a web interface, from which they can also be downloaded under open licenses.
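To make the "fold increase at each confidence cutoff" comparison concrete, the following is a minimal sketch of how such a figure could be computed from two releases of an association list. It is not the authors' pipeline: the `Association` record layout, the function names and the toy data are all assumptions made for illustration, and the real DISEASES download files may be structured differently.

```python
from collections import namedtuple

# Hypothetical record layout; the real DISEASES download files may differ.
Association = namedtuple("Association", ["gene", "disease", "confidence"])


def count_at_cutoffs(associations, cutoffs):
    """Return {cutoff: number of associations with confidence >= cutoff}."""
    return {c: sum(1 for a in associations if a.confidence >= c) for c in cutoffs}


def fold_increase(old, new, cutoffs):
    """Return {cutoff: new_count / old_count}, skipping cutoffs with no old associations."""
    old_counts = count_at_cutoffs(old, cutoffs)
    new_counts = count_at_cutoffs(new, cutoffs)
    return {c: new_counts[c] / old_counts[c] for c in cutoffs if old_counts[c] > 0}


# Made-up toy data, not taken from DISEASES.
old_release = [
    Association("GENE_A", "disease_1", 3.5),
    Association("GENE_B", "disease_1", 1.2),
]
new_release = [
    Association("GENE_A", "disease_1", 4.0),
    Association("GENE_B", "disease_1", 2.5),
    Association("GENE_C", "disease_2", 3.1),
    Association("GENE_D", "disease_2", 1.0),
]

print(fold_increase(old_release, new_release, cutoffs=[1, 2, 3]))
```

Reporting the ratio separately per cutoff, rather than as a single overall number, is what allows a statement such as "at least 9-fold at all confidence cutoffs" to be checked: the claim holds only if the smallest ratio across cutoffs meets the threshold.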
