4.8 Article

Importance of Structural Features and the Influence of Individual Structures of Graphene Oxide Using Shapley Value Analysis

Journal

CHEMISTRY OF MATERIALS
Volume 35, Issue 21, Pages 8840-8856

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.chemmater.3c00715

Keywords

machine learning; explainable AI; graphene; graphene oxide; energy; prediction

The application of machine learning to materials chemistry can accelerate the design process and guide future research. Shapley value analysis provides a comprehensive analysis of the underlying reasons behind structure/property relationships. In this study, ML models trained on graphene oxide nanomaterials data accurately predicted the formation energy and Fermi energy, and Shapley value analysis was used to understand the results.
The application of machine learning (ML) to materials chemistry can accelerate the design process and, when coupled with a detailed explanation, can guide future research. Shapley value analysis is a complementary approach capable of providing a comprehensive analysis of the underlying reasons behind a structure/property relationship. In this study, we used data sets of graphene oxide nanomaterials generated using electronic structure simulations to train ML models with outstanding accuracy, generalizability, and stability to predict the formation energy and the Fermi energy, and applied Shapley value analysis to understand the results. Feature importance profiles that rank the value of structural characteristics to each property confirmed that the underlying structure/property relationships are relatively simple and scientifically intuitive, even though the ML models need complex information to achieve high performance. We also report instance influence profiles that rank the value of each individual graphene oxide structure to the training process. Feature/instance interactions are also investigated to explain which structural characteristics make particular structures influential, revealing that the most influential structures typically have very high or very low concentrations of H or O. Since the range of concentrations is typically chosen by researchers based on domain knowledge at the outset, this highlights that extreme care should be taken when gathering training data, as these decisions will have a large impact on the final trained model. In general, the reproducible workflow demonstrated here can be applied to any similar materials data set to make reliable model-agnostic predictions of how the structural characteristics and individual structures contribute to the prediction of functional properties.
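To make the idea of a Shapley value concrete, the following is a minimal sketch of an exact Shapley computation for feature attribution. It is not the authors' workflow: the linear toy model, the feature names (`O_conc`, `H_conc`, `defect_frac`), the weights, and the baseline values are all hypothetical and chosen only for illustration. The sketch assumes that a feature absent from a coalition is replaced by its baseline value, which is one common convention among several.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, players):
    """Exact Shapley values of the set function `value` over `players`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical toy model: a property that is linear in three structural features.
WEIGHTS = {"O_conc": -2.0, "H_conc": -1.0, "defect_frac": 0.5}
BASELINE = {"O_conc": 0.2, "H_conc": 0.1, "defect_frac": 0.05}  # reference structure
INSTANCE = {"O_conc": 0.5, "H_conc": 0.0, "defect_frac": 0.05}  # structure to explain


def model(x):
    return sum(WEIGHTS[f] * x[f] for f in WEIGHTS)


def coalition_value(present):
    # Features in the coalition take the instance value; the rest stay at baseline.
    x = {f: (INSTANCE[f] if f in present else BASELINE[f]) for f in WEIGHTS}
    return model(x)


phi = shapley_values(coalition_value, list(WEIGHTS))
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

For a linear model each attribution reduces to the weight times the feature's deviation from the baseline, and the attributions sum to the difference between the prediction for the instance and the prediction for the baseline. The same `shapley_values` function could in principle rank training structures instead of features by treating each structure as a player and using a model score as the value function, although exact enumeration grows exponentially with the number of players and practical analyses rely on approximations.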
