Article

DigiMOF: A Database of Metal-Organic Framework Synthesis Information Generated via Text Mining

Journal

CHEMISTRY OF MATERIALS
Volume 35, Issue 11, Pages 4510-4524

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.chemmater.3c00788

To efficiently identify promising MOFs for specific applications, an open-source database called DigiMOF was generated by data-mining published MOF papers. The database contains information about the synthesis methods, solvents, linkers, metal precursors, and other properties of MOFs.
The vastness of materials space, particularly that which is concerned with metal-organic frameworks (MOFs), creates the critical problem of performing efficient identification of promising materials for specific applications. Although high-throughput computational approaches, including the use of machine learning, have been useful in rapid screening and rational design of MOFs, they tend to neglect descriptors related to their synthesis. One way to improve the efficiency of MOF discovery is to data-mine published MOF papers to extract the materials informatics knowledge contained within journal articles. Here, by adapting the chemistry-aware natural language processing tool, ChemDataExtractor (CDE), we generated an open-source database of MOFs focused on their synthetic properties: the DigiMOF database. Using the CDE web scraping package alongside the Cambridge Structural Database (CSD) MOF subset, we automatically downloaded 43,281 unique MOF journal articles, extracted 15,501 unique MOF materials, and text-mined over 52,680 associated properties including the synthesis method, solvent, organic linker, metal precursor, and topology. Additionally, we developed an alternative data extraction technique to obtain and transform the chemical names assigned to each CSD entry in order to determine linker types for each structure in the CSD MOF subset. This data enabled us to match MOFs to a list of known linkers provided by Tokyo Chemical Industry UK Ltd. (TCI) and analyze the cost of these important chemicals. This centralized, structured database reveals the MOF synthetic data embedded within thousands of MOF publications and contains further topology, metal type, accessible surface area, largest cavity diameter, pore limiting diameter, open metal sites, and density calculations for all 3D MOFs in the CSD MOF subset. The DigiMOF database and associated software are publicly available for other researchers to rapidly search for MOFs with specific properties, conduct further analysis of alternative MOF production pathways, and create additional parsers to search for additional desirable properties.
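As a rough illustration of the two uses the abstract describes (searching records by synthesis property, and matching a CSD-style chemical name against a list of known linkers), the following is a minimal sketch in plain Python. It does not use the DigiMOF software or ChemDataExtractor. The record fields, the function names `find_mofs` and `match_linker`, the name stems, and all sample values are hypothetical placeholders, not data or schema from the database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MofRecord:
    # Hypothetical fields for illustration; the real DigiMOF schema may differ.
    name: str
    synthesis_method: Optional[str] = None
    solvent: Optional[str] = None


def _matches(value: Optional[str], wanted: Optional[str]) -> bool:
    """A criterion of None is ignored; otherwise compare case-insensitively."""
    if wanted is None:
        return True
    return value is not None and value.lower() == wanted.lower()


def find_mofs(records: List[MofRecord],
              synthesis_method: Optional[str] = None,
              solvent: Optional[str] = None) -> List[MofRecord]:
    """Return the records that satisfy every criterion that was supplied."""
    return [
        r for r in records
        if _matches(r.synthesis_method, synthesis_method)
        and _matches(r.solvent, solvent)
    ]


def match_linker(chemical_name: str, known_linkers: List[str]) -> Optional[str]:
    """Return the first known linker stem found inside the chemical name.

    Simple substring matching is an assumption made for this sketch,
    not the name-transformation technique used in the paper.
    """
    lowered = chemical_name.lower()
    for linker in known_linkers:
        if linker.lower() in lowered:
            return linker
    return None


# Placeholder records, not entries from the database.
RECORDS = [
    MofRecord("MOF-A", "solvothermal", "DMF"),
    MofRecord("MOF-B", "mechanochemical", None),
    MofRecord("MOF-C", "Solvothermal", "water"),
]

if __name__ == "__main__":
    hits = find_mofs(RECORDS, synthesis_method="solvothermal", solvent="dmf")
    print([r.name for r in hits])
    print(match_linker("catena-(bis(mu-terephthalato)-dizinc)",
                       ["trimesat", "terephthalat"]))
```

Treating a missing criterion as "match anything" keeps the search usable when a text-mined record lacks a property, which is likely for some entries in any automatically extracted dataset.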
