Article

Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2017.2720168

Keywords

Data science; knowledge discovery; domain knowledge; scientific theory; physical consistency; interpretability

Funding

  1. US National Science Foundation Expeditions in Computing Grant [1029711]
  2. Division of Information & Intelligent Systems
  3. Directorate for Computer & Information Science & Engineering [1029711, 1555949] (funding source: National Science Foundation)

Data science models, although successful in a number of commercial domains, have had limited applicability in scientific problems involving complex physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm that aims to leverage the wealth of scientific knowledge for improving the effectiveness of data science models in enabling scientific discovery. The overarching vision of TGDS is to introduce scientific consistency as an essential component for learning generalizable models. Further, by producing scientifically interpretable models, TGDS aims to advance our scientific understanding by discovering novel domain insights. Indeed, the paradigm of TGDS has started to gain prominence in a number of scientific disciplines such as turbulence modeling, material discovery, quantum chemistry, bio-medical science, bio-marker discovery, climate science, and hydrology. In this paper, we formally conceptualize the paradigm of TGDS and present a taxonomy of research themes in TGDS. We describe several approaches for integrating domain knowledge in different research themes using illustrative examples from different disciplines. We also highlight some of the promising avenues of novel research for realizing the full potential of theory-guided data science.
