Article

Open data and algorithms for open science in AI-driven molecular informatics

Related references

Note: only a subset of the references is listed here; see the original article for the complete list.
Article Biochemistry & Molecular Biology

A dynamic simulation study of FDA drug from zinc database against COVID-19 main protease receptor

Shalini Mathpal et al.

Summary: This research aims to identify new drug candidates that inhibit the main protease of COVID-19 using in silico techniques. Molecular docking and molecular dynamics simulation identified four potential drugs among 3180 FDA-approved drugs. These findings provide important leads for the development of novel drugs against COVID-19.

JOURNAL OF BIOMOLECULAR STRUCTURE & DYNAMICS (2022)

Article Biochemistry & Molecular Biology

Identification of Iron-Sulfur (Fe-S) Cluster and Zinc (Zn) Binding Sites Within Proteomes Predicted by DeepMind's AlphaFold2 Program Dramatically Expands the Metalloproteome

Zachary J. Wehrspan et al.

Summary: DeepMind's AlphaFold2 software can accurately predict ligand binding sites in protein structures, providing an important tool for the functional annotation of proteomes.

JOURNAL OF MOLECULAR BIOLOGY (2022)

Article Biochemistry & Molecular Biology

The AlphaFold Database of Protein Structures: A Biologist's Guide

Alessia David et al.

Summary: Three-dimensional models of the entire human proteome, predicted by AlphaFold, a deep learning algorithm developed by DeepMind, were recently released to the scientific community. In this discussion, we explore the advantages, limitations, and unresolved challenges of the AlphaFold models from the perspective of a biologist.

JOURNAL OF MOLECULAR BIOLOGY (2022)

Article Biochemistry & Molecular Biology

AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models

Mihaly Varadi et al.

Summary: AlphaFold DB is an openly accessible database with high-accuracy protein-structure predictions, powered by DeepMind's AlphaFold v2.0. It provides programmatic access to a vast number of predicted structures and is expanding to cover more sequences.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

NP-MRD: the Natural Products Magnetic Resonance Database

David S. Wishart et al.

Summary: NP-MRD is a comprehensive electronic resource for NMR data on natural products, metabolites, and other biologically derived chemicals. It is funded by NIH and has quickly become the world's largest repository for NMR data on natural products. The database contains both structural and NMR data for nearly 41,000 natural product compounds from over 7400 different living species, and is accessible at https://np-mrd.org.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

The Natural Products Atlas 2.0: a database of microbially-derived natural products

Jeffrey A. van Santen et al.

Summary: This paper reports the release of a new version of the Natural Products Atlas database, which includes a large number of new compounds and significant upgrades. In addition to adding detailed descriptions of microbial taxa and chemical ontology terms, manual curation and data integration were carried out to improve the user experience.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

HMDB 5.0: the Human Metabolome Database for 2022

David S. Wishart et al.

Summary: The Human Metabolome Database (HMDB) has been providing comprehensive information about human metabolites since 2007, and has undergone significant improvements and upgrades in its latest update, HMDB 5.0. These improvements include an increase in the number of metabolite entries, enhancements to metabolite descriptions, new visualization tools, and more accurately predicted spectral data sets. These upgrades are aimed at improving the usability and potential applications of the HMDB in various fields, including human metabolomics, exposomics, lipidomics, nutritional science, biochemistry, and clinical chemistry.

NUCLEIC ACIDS RESEARCH (2022)

Article Chemistry, Multidisciplinary

Natural product drug discovery in the artificial intelligence era

F. I. Saldivar-Gonzalez et al.

Summary: Natural products are considered privileged structures to interact with protein drug targets, sparking interest in developing NP-inspired medicines. The advancement of artificial intelligence has democratized the field of natural product drug discovery, with the introduction of natural language processing and machine learning algorithms enhancing molecular design and target selectivity.

CHEMICAL SCIENCE (2022)

Article Multidisciplinary Sciences

Biocatalysed synthesis planning using data-driven learning

Daniel Probst et al.

Summary: This study extends the data-driven forward reaction and retrosynthetic pathway prediction models based on the Molecular Transformer architecture to biocatalysis. The authors provide a publicly available enzymatic knowledge dataset and aim to facilitate the adoption of enzymatic catalysis in the design of greener chemistry processes.

NATURE COMMUNICATIONS (2022)

Article Chemistry, Multidisciplinary

Machine Learning for Chemical Reactivity: The Importance of Failed Experiments

Felix Strieth-Kalthoff et al.

Summary: The article examines the impact of biases in chemical reaction data on drawing general conclusions and highlights the importance of negative examples. The research showcases the potential of data expansion methods to address these limitations and demonstrates future prospects for improving data quality in the field of chemistry.

ANGEWANDTE CHEMIE-INTERNATIONAL EDITION (2022)

Article Chemistry, Multidisciplinary

The Long and Winding Road towards FAIR Data as an Integral Component of the Computational Modelling and Dissemination of Chemistry

Henry S. Rzepa

Summary: The author reflects on 50 years of activities as a computational chemist, highlighting the importance of scientific data. The essay is divided into two parts, with the first focusing on the evolution of data handling methods and the second discussing the explosive growth of the digital information era and the author's involvement.

ISRAEL JOURNAL OF CHEMISTRY (2022)

Article Chemistry, Medicinal

Structure-Aware Multimodal Deep Learning for Drug-Protein Interaction Prediction

Penglei Wang et al.

Summary: In this study, we propose a structure-aware multimodal deep DPI prediction model (STAMP-DPI), which accurately predicts drug-protein interactions by training on a carefully curated industry-scale benchmark dataset. The model combines the feature representations of molecules and proteins, effectively capturing the interaction features between them using graph neural networks and pretrained embeddings. Experimental results demonstrate that STAMP-DPI outperforms existing methods on multiple datasets and is interpretable.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2022)

Article Chemistry, Medicinal

PDFDataExtractor: A Tool for Reading Scientific Text and Interpreting Metadata from the Typeset Literature in the Portable Document Format

Miao Zhu et al.

Summary: This article introduces the PDFDataExtractor tool, which can be used as a plugin for ChemDataExtractor to extract information from PDF files. Compared to other PDF extraction tools, PDFDataExtractor performs better in the field of chemical literature. It is capable of extracting semantic information from the PDF files of scientific articles and reconstructing the logical structure of the articles.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2022)

Article Chemistry, Medicinal

Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor

Taketomo Isazawa et al.

Summary: This paper introduces a single model that performs at close to the state-of-the-art for both organic and inorganic Named Entity Recognition (NER) tasks in the chemical domain. The NER system utilizes the BERT architecture and is available as part of ChemDataExtractor 2.1, along with the datasets and scripts used to train the model.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2022)

Article Biochemistry & Molecular Biology

PubChem Protein, Gene, Pathway, and Taxonomy Data Collections: Bridging Biology and Chemistry through Target-Centric Views of PubChem Data

Sunghwan Kim et al.

Summary: PubChem is a public chemical database that serves as a vital resource for biomedical research communities. It provides information on chemicals related to biological targets, helping users analyze and interpret the biological activity data of molecules. The database contains data from hundreds of contributors and is organized into various collections based on different record types.

JOURNAL OF MOLECULAR BIOLOGY (2022)

Review Chemistry, Multidisciplinary

Machine intelligence for chemical reaction space

Philippe Schwaller et al.

Summary: New data-driven technologies have revolutionized chemical reaction tasks, including reaction prediction, optimization, and catalyst design. Accurate prediction of chemical reactivity has transformed the R&D processes and accelerated discovery in academia and the chemical and pharmaceutical industries.

WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE (2022)

Article Multidisciplinary Sciences

A database of refractive indices and dielectric constants auto-generated using ChemDataExtractor

Jiuyang Zhao et al.

Summary: The auto-generated optical property database has great potential for advancing optical research and data-driven discovery of optical materials. It provides a representative overview of linear optical properties from scientific papers in the past 30 years.

SCIENTIFIC DATA (2022)

News Item Multidisciplinary Sciences

'The entire protein universe': AI predicts shape of nearly every known protein

Ewen Callaway

NATURE (2022)

Article Multidisciplinary Sciences

Digitization and validation of a chemical synthesis literature database in the ChemPU

Simon Rohrbach et al.

Summary: Despite its potential, the automation of synthetic chemistry has made only slow progress in recent decades. In this study, we introduce an automated chemical reaction database with 100 representative molecules. Through robotic experimentation, over 50 reactions from the database were performed, achieving yields and purities comparable to those achieved by experts.

SCIENCE (2022)

Article Multidisciplinary Sciences

AI-based structure prediction empowers integrative structural analysis of human nuclear pores

Shyamal Mosalaganti et al.

Summary: This study combines artificial intelligence-based structure prediction with in situ structural biology to reveal the structure and function of the human nuclear pore complex (NPC) scaffold. The findings demonstrate the importance of linker nucleoporins in establishing the higher-order structure of the scaffold and suggest that the scaffold widens the central pore rather than stabilizing the fusion of the inner and outer nuclear membranes.

SCIENCE (2022)

Article Multidisciplinary Sciences

Perovskite- and Dye-Sensitized Solar-Cell Device Databases Auto-generated Using ChemDataExtractor

Edward J. Beard et al.

Summary: The number of scientific publications on third-generation photovoltaic devices is rapidly increasing in response to the urgent need for renewable energy technologies to address climate change. This study presents two databases generated by text-mining techniques, containing performance and material data for dye-sensitized solar cells (DSCs) and perovskite solar cells (PSCs). These databases, consisting of 660,881 data entries representing 57,678 photovoltaic devices, have undergone a comprehensive evaluation process to ensure data quality, with precision metrics ranging from 73.1% to 95.8%. The databases are available in MongoDB and JSON formats for querying in various programming languages, facilitating data-driven discovery of photovoltaic materials.

SCIENTIFIC DATA (2022)

News Item Multidisciplinary Sciences

US government reveals big changes to open-access policy

Jeff Tollefson et al.

Summary: The Biden administration has instructed all US agencies to require immediate access to federally funded research after it is published, starting in 2026.

NATURE (2022)

News Item Chemistry, Multidisciplinary

Machine learning: The chemistry of errors

Jacqueline M. Cole

Summary: The application of machine learning to predict reaction outcomes using big data has encountered significant challenges. Many chemical reaction data are not suitable for accurate predictions, making it necessary for synthetic chemists to change their reaction design and reporting practices.

NATURE CHEMISTRY (2022)

Article Computer Science, Artificial Intelligence

SELFIES and the future of molecular string representations

Mario Krenn et al.

Summary: Artificial intelligence and machine learning have gained popularity in the field of chemistry and materials science, requiring a fluent chemical language. The traditional molecular string representation, SMILES, has limitations, but the introduction of SELFIES in 2020 has solved these issues and enabled new applications in chemistry. Looking ahead, 16 future projects for robust molecular representations are proposed.

PATTERNS (2022)

Article Biochemical Research Methods

MolTrans: Molecular Interaction Transformer for drug-target interaction prediction

Kexin Huang et al.

Summary: The MolTrans model improves the accuracy and interpretability of drug-target interaction prediction through a knowledge-inspired sub-structural pattern mining algorithm and an augmented transformer encoder, which better extract and capture semantic relations among sub-structures extracted from massive unlabeled biomedical data.

BIOINFORMATICS (2021)

Editorial Material Medicine, Research & Experimental

State-of-the-art of artificial intelligence in medicinal chemistry

Jurgen Bajorath

FUTURE SCIENCE OA (2021)

Article

Nationale Forschungsdateninfrastruktur (NFDI)

Nathalie Hartl et al.

Informatik-Spektrum (2021)

Article Biochemistry & Molecular Biology

RCSB Protein Data Bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences

Stephen K. Burley et al.

Summary: RCSB PDB, as the US data center for the global PDB archive, provides free access to 3D macromolecular structure data for millions of users worldwide, including educators, students, and the general public, integrating over 40 external biodata resources. The redesigned website now features improved search functionality and easier access to PDB data, showcasing new structures relevant to the understanding and addressing of the COVID-19 pandemic.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemistry & Molecular Biology

UniProt: the universal protein knowledgebase in 2021

Alex Bateman et al.

Summary: The UniProt Knowledgebase aims to provide users with a comprehensive, high-quality set of protein sequences annotated with functional information. Updates over the past two years have increased the number of sequences to approximately 190 million, with new methods to assess proteome completeness and quality. UniProtKB has responded to the COVID-19 pandemic by expertly curating relevant entries and making them rapidly available through a dedicated portal.

NUCLEIC ACIDS RESEARCH (2021)

Review Chemistry, Multidisciplinary

The Open Reaction Database

Steven M. Kearnes et al.

Summary: The study introduces the Open Reaction Database (ORD) schema and infrastructure for structuring and sharing organic reaction data, providing a centralized data repository on GitHub. This consistent data representation and infrastructure aim to enhance the development of computer-aided synthesis planning, reaction prediction, and other predictive chemistry tasks.

JOURNAL OF THE AMERICAN CHEMICAL SOCIETY (2021)

Article Biochemistry & Molecular Biology

A community resource for paired genomic and metabolomic data mining

Michelle A. Schorn et al.

Summary: Genomics and metabolomics are commonly used to explore the diversity of specialized metabolites. The Paired Omics Data Platform aims to systematically document the connections between metabolome and (meta)genome data to aid in the identification of natural product biosynthetic origins and metabolite structures.

NATURE CHEMICAL BIOLOGY (2021)

Review Biotechnology & Applied Microbiology

Natural products in drug discovery: advances and opportunities

Atanas G. Atanasov et al.

Summary: Natural products and their analogues have historically played a significant role in pharmacotherapy; however, they also present challenges. Recent technological and scientific developments are addressing these challenges and revitalizing interest in natural products as drug leads, particularly for combating antimicrobial resistance.

NATURE REVIEWS DRUG DISCOVERY (2021)

Article Multidisciplinary Sciences

Pushing the frontiers of density functionals by solving the fractional electron problem

James Kirkpatrick et al.

Summary: Density functional theory has long been plagued by systematic errors in approximations, but a new neural network-based functional, DM21, shows promise in accurately describing complex systems and outperforming traditional functionals in benchmarks. By relying on data and constraints, DM21 represents a viable pathway toward the exact universal functional.

SCIENCE (2021)

Review Pharmacology & Pharmacy

Artificial intelligence in drug discovery: recent advances and future perspectives

Jose Jimenez-Luna et al.

Summary: This article reviews the current status of AI in chemoinformatics, discussing topics such as quantitative structure-activity/property relationship and structure-based modeling, de novo molecular design, and chemical synthesis prediction. The advantages and limitations of current deep learning applications are highlighted, offering a perspective on next-generation AI for drug discovery.

EXPERT OPINION ON DRUG DISCOVERY (2021)

Review Pharmacology & Pharmacy

Critical assessment of AI in drug discovery

W. Patrick Walters et al.

Summary: AI has become an integral part of everyday life, with applications in various fields including drug discovery. The use of AI in drug discovery encompasses property prediction, molecule generation, image analysis, and organic synthesis planning. While machine learning methods are commonly used for predicting biological activity, the development of new molecule generation methods has the potential to explore uncharted chemical space. The continued advancement of AI in drug discovery will rely on dedicated research and progress in AI technology.

EXPERT OPINION ON DRUG DISCOVERY (2021)

Article Computer Science, Hardware & Architecture

Marvell ThunderX3: Next-Generation Arm-Based Server Processor

Thomas Norrie et al.

Summary: Marvell ThunderX3 is a third-generation Arm-based server processor with unique support for four SMT threads. Initial benchmark results demonstrate significant performance gains over the prior generation, establishing industry-leading performance in single thread and socket levels.

IEEE MICRO (2021)

Article Chemistry, Medicinal

ChemDataExtractor 2.0: Autopopulated Ontologies for Materials Science

Juraj Mavracic et al.

Summary: The article introduces a framework for automatically populating ontologies, enabling direct extraction of a larger group of properties linked by a semantic network. Exploiting data-rich sources, a new model concept is presented for data extraction of chemical and physical properties. With automatically generated parsers for data extraction and forward-looking interdependency resolution, the power of the approach is illustrated through automatic extraction of a crystallographic hierarchy.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2021)

Article Biochemistry & Molecular Biology

Evolving scenario of big data and Artificial Intelligence (AI) in drug discovery

Manish Kumar Tripathi et al.

Summary: The accumulation of massive data in Cheminformatics databases has made big data and artificial intelligence indispensable in drug design. The development of newer algorithms and architectures has fulfilled the specific needs of various drug discovery processes, while deep learning neural networks have resulted in a paradigm shift in chemical information mining.

MOLECULAR DIVERSITY (2021)

Article Multidisciplinary Sciences

Highly accurate protein structure prediction with AlphaFold

John Jumper et al.

Summary: Proteins are essential for life, and accurate prediction of their structures is a crucial research problem. Current experimental methods are time-consuming, highlighting the need for accurate computational approaches to address the gap in structural coverage. Despite recent progress, existing methods fall short of atomic accuracy in protein structure prediction.

NATURE (2021)

Article Multidisciplinary Sciences

Highly accurate protein structure prediction for the human proteome

Kathryn Tunyasuvunakool et al.

Summary: Using the AlphaFold method, the structural coverage of the proteome has been significantly expanded, covering 98.5% of human proteins with 58% of residues having confident predictions and 36% having very high confidence. Introducing new metrics to interpret the dataset and identify disordered regions, this study aims to provide high-quality predictions for generating biological hypotheses.

NATURE (2021)

Editorial Material Chemistry, Multidisciplinary

Best practices in machine learning for chemistry

Nongnuch Artrith et al.

Summary: In chemistry research, statistical tools based on machine learning are being integrated to train reliable, repeatable, and reproducible models. Guidelines for machine learning reports are recommended to ensure the quality of the models.

NATURE CHEMISTRY (2021)

Article Biochemistry & Molecular Biology

Critical assessment of methods of protein structure prediction (CASP) - Round XIV

Andriy Kryshtafovych et al.

Summary: CASP is a community experiment aimed at advancing methods for computing three-dimensional protein structure, including rigorous blind testing and evaluation by independent assessors. In the recent CASP14 experiment, deep-learning methods from one research group consistently delivered computed structures rivaling the corresponding experimental ones in accuracy. These results represent a solution to the classical protein-folding problem, at least for single proteins.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2021)

Review Biochemistry & Molecular Biology

Benefiting from big data in natural products: importance of preserving foundational skills and prioritizing data quality

Nadja B. Cech et al.

Summary: The use of big data has transformed natural product sciences, allowing researchers to engage with taxonomic, genomic, proteomic, and metabolomic data to generate and test hypotheses. However, the goal of rapidly increasing the rate of new drug discovery has not yet been achieved. New technologies have provided unexpected opportunities for natural products chemists to ask and answer new questions.

NATURAL PRODUCT REPORTS (2021)

Review Biochemistry & Molecular Biology

Genome mining methods to discover bioactive natural products

Katherine D. Bauman et al.

Summary: With the availability of genetic information for hundreds of thousands of organisms in publicly accessible databases, scientists have unprecedented opportunities to explore the diversity and inner workings of life. By harnessing this information, researchers are able to specifically discover bioactive natural products and their gene clusters using orthogonal genome mining strategies.

NATURAL PRODUCT REPORTS (2021)

Article Chemistry, Multidisciplinary

Img2Mol - accurate SMILES recognition from molecular graphical depictions

Djork-Arne Clevert et al.

Summary: The paper introduces a model that combines deep convolutional neural network learning and a pre-trained decoder to accurately translate molecular images into SMILES representation. Evaluation shows that the model can correctly translate up to 88% of molecular images.

CHEMICAL SCIENCE (2021)

Review Biochemistry & Molecular Biology

Advancements in capturing and mining mass spectrometry data are transforming natural products research

Scott A. Jarmusch et al.

Summary: This article discusses the importance of mass spectrometry technology in natural products research and the new trend of open MS data and data mining tools in this field. Over the past 5 years, this shift has rapidly developed with huge potential for the future. The article proposes a new framework and challenges for utilizing repository data, highlighting the importance of data openness and data mining as the next important development stage in natural products research.

NATURAL PRODUCT REPORTS (2021)

Review Biochemistry & Molecular Biology

Metabolomics and genomics in natural products research: complementary tools for targeting new chemical entities

Lindsay K. Caesar et al.

Summary: Organisms in nature have evolved specialized enzymatic machinery to biosynthesize a variety of secondary metabolites, which has profound impacts on human health. Recent advancements in metabolomics and genomics have allowed for efficient exploration of new chemical spaces in natural product discovery. Integrated strategies now enable researchers to simultaneously identify expressed secondary metabolites and their biosynthetic machinery.

NATURAL PRODUCT REPORTS (2021)

Article Chemistry, Multidisciplinary

ChemPix: automated recognition of hand-drawn hydrocarbon structures using deep learning

Hayley Weir et al.

Summary: A chemical software tool was developed to recognize hand-drawn hydrocarbon structures, utilizing machine learning methods for training. By forming a neural network committee, the accuracy and confidence of molecule recognition were significantly improved.

CHEMICAL SCIENCE (2021)

Article Chemistry, Multidisciplinary

Beyond generative models: superfast traversal, optimization, novelty, exploration and discovery (STONED) algorithm for molecules using SELFIES

AkshatKumar Nigam et al.

Summary: The study introduces a new algorithm called STONED, which achieves performance comparable to deep generative models in the chemical space through interpolation and exploration without the need for large amounts of data and training time.

CHEMICAL SCIENCE (2021)

Article Computer Science, Artificial Intelligence

Mapping the space of chemical reactions using attention-based neural networks

Philippe Schwaller et al.

Summary: This study demonstrates how transformer-based models can infer reaction classes from non-annotated text-based representations of chemical reactions with an accuracy of 98.2%. The learned representations can be used as reaction fingerprints to capture fine-grained differences between reaction classes better than traditional fingerprints. Insights into chemical reaction space are illustrated through an interactive reaction atlas providing visual clustering and similarity searching.

NATURE MACHINE INTELLIGENCE (2021)

Article Biochemistry & Molecular Biology

PubChem in 2021: new data content and improved web interfaces

Sunghwan Kim et al.

Summary: PubChem, a popular chemical information resource, has made substantial improvements in the past two years by adding data from over 100 new sources, updating its homepage and record pages, introducing new services like the Periodic Table and Pathway pages, and creating a special data collection related to COVID-19 and SARS-CoV-2 in response to the pandemic.

NUCLEIC ACIDS RESEARCH (2021)

Article Computer Science, Information Systems

When SMILES Smiles, Practicality Judgment and Yield Prediction of Chemical Reaction via Deep Chemical Language Processing

Shu Jiang et al.

Summary: SMILES provides a text-based encoding method to describe chemical structures and reactions. A symbol-only model is proposed to predict organic synthesis reaction yields with high accuracy and low error. The study demonstrates the potential for automatic yield prediction in organic reactions and its applications in synthesis path prediction.

IEEE ACCESS (2021)

Article Chemistry, Medicinal

Learning Molecular Representations for Medicinal Chemistry

Kangway Chuang et al.

JOURNAL OF MEDICINAL CHEMISTRY (2020)

Article Multidisciplinary Sciences

A database of battery materials auto-generated using ChemDataExtractor

Shu Huang et al.

SCIENTIFIC DATA (2020)

Article Multidisciplinary Sciences

Automated extraction of chemical synthesis actions from experimental procedures

Alain C. Vaucher et al.

NATURE COMMUNICATIONS (2020)

Article Chemistry, Medicinal

ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning

Martijn Oldenhof et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2020)

Article Chemistry, Medicinal

ZINC20 - A Free Ultralarge-Scale Chemical Database for Ligand Discovery

John J. Irwin et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2020)

Article Chemistry, Inorganic & Nuclear

Research Data in Chemistry - Results of the first NFDI4Chem Community Survey

Sonja Herres-Pawlis et al.

ZEITSCHRIFT FUR ANORGANISCHE UND ALLGEMEINE CHEMIE (2020)

Article Chemistry, Multidisciplinary

Predicting retrosynthetic pathways using transformer-based models and a hyper-graph exploration strategy

Philippe Schwaller et al.

CHEMICAL SCIENCE (2020)

Article Multidisciplinary Sciences

The digitization of organic synthesis

Ian W. Davies

NATURE (2019)

Article Chemistry, Multidisciplinary

Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction

Philippe Schwaller et al.

ACS CENTRAL SCIENCE (2019)

Article Computer Science, Theory & Methods

Fast Deep Neural Network Training on Distributed Systems and Cloud TPUs

Yang You et al.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2019)

Article Biochemistry & Molecular Biology

Protein Data Bank: the single global archive for 3D macromolecular structure data

Stephen K. Burley et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Multidisciplinary Sciences

Comparative dataset of experimental and computational attributes of UV/vis absorption spectra

Edward J. Beard et al.

SCIENTIFIC DATA (2019)

Article Biochemistry & Molecular Biology

DrugBank 5.0: a major update to the DrugBank database for 2018

David S. Wishart et al.

NUCLEIC ACIDS RESEARCH (2018)

Article Chemistry, Multidisciplinary

Deoxyfluorination with Sulfonyl Fluorides: Navigating Reaction Space with Machine Learning

Matthew K. Nielsen et al.

JOURNAL OF THE AMERICAN CHEMICAL SOCIETY (2018)

Article Multidisciplinary Sciences

Planning chemical syntheses with deep neural networks and symbolic AI

Marwin H. S. Segler et al.

NATURE (2018)

Article Biochemistry & Molecular Biology

The ChEMBL database in 2017

Anna Gaulton et al.

NUCLEIC ACIDS RESEARCH (2017)

Article Multidisciplinary Sciences

Mastering the game of Go without human knowledge

David Silver et al.

NATURE (2017)

Article Chemistry, Multidisciplinary

Prediction of Organic Reaction Outcomes Using Machine Learning

Connor W. Coley et al.

ACS CENTRAL SCIENCE (2017)

Article Biochemistry & Molecular Biology

ChEBI in 2016: Improved services and an expanding collection of metabolites

Janna Hastings et al.

NUCLEIC ACIDS RESEARCH (2016)

Article Chemistry, Medicinal

ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature

Matthew C. Swain et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2016)

Article Chemistry, Multidisciplinary

The Cambridge Structural Database

Colin R. Groom et al.

ACTA CRYSTALLOGRAPHICA SECTION B-STRUCTURAL SCIENCE CRYSTAL ENGINEERING AND MATERIALS (2016)

News Item Multidisciplinary Sciences

Elsevier opens its papers to text-mining

Richard Van Noorden

NATURE (2014)

Article Computer Science, Hardware & Architecture

Cheminformatics

Joerg Kurt Wegner et al.

COMMUNICATIONS OF THE ACM (2012)

Article Chemistry, Medicinal

Blue Obelisk - Interoperability in chemical informatics

Rajarshi Guha et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2006)