Article

SureChEMBL: a large-scale, chemically annotated patent document database

Journal

NUCLEIC ACIDS RESEARCH
Volume 44, Issue D1, Pages D1220-D1228

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkv1253

Funding

  1. Wellcome Trust [WT086151/Z/08/Z, WT104104/Z/14/Z]
  2. European Molecular Biology Laboratory

SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature daily by an automated text- and image-mining pipeline. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus. Given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at https://www.surechembl.org/.
