
Anatomy of fingerprint search calculations on structurally diverse sets of active compounds


Similarity searching using molecular fingerprints is a widely used approach for the identification of novel hits. A fingerprint search involves many pairwise comparisons of bit string representations of known active molecules with those precomputed for database compounds. Bit string overlap, as evaluated by various similarity metrics, is used as a measure of molecular similarity. Results of a number of studies focusing on fingerprints suggest that it is difficult, if not impossible, to develop generally applicable search parameters and strategies, irrespective of the compound classes under investigation. Rather, nearly every individual search problem requires an adjustment of calculation conditions. Thus, there is a need for diagnostic tools to analyze fingerprint-based similarity searching. We report an analysis of fingerprint search calculations on different sets of structurally diverse active compounds. Calculations on five biological activity classes were carried out with two fingerprints in two compound source databases, and the results were analyzed in histograms. Tanimoto coefficient (Tc) value ranges where active compounds were detected were compared to the distribution of Tc values in the database. The analysis revealed that compound class-specific effects strongly influenced the outcome of these fingerprint calculations. Among the five diverse compound sets studied, very different search results were obtained. The analysis described here can be applied to determine Tc intervals where scaffold hopping occurs. It can also be used to benchmark fingerprint calculations or estimate their probability of success.
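
The comparison of Tc value ranges described above can be illustrated with a minimal sketch. It is not the authors' implementation: the fingerprints are toy sets of on-bit positions, and the names `tanimoto`, `tc_histogram`, `query`, `actives` and `database` are hypothetical. It assumes the standard definition Tc = c / (a + b - c), where a and b are the numbers of bits set in the two fingerprints and c is the number of bits set in both.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for two fingerprints given as sets of on-bit positions."""
    shared = len(fp_a & fp_b)                  # c: bits set in both fingerprints
    union = len(fp_a) + len(fp_b) - shared     # a + b - c
    return shared / union if union else 0.0    # two empty fingerprints: define Tc as 0


def tc_histogram(query, compounds, n_bins=10):
    """Count compounds per Tc interval of width 1/n_bins relative to one query."""
    counts = [0] * n_bins
    for fp in compounds:
        tc = tanimoto(query, fp)
        # Tc = 1.0 would index past the end, so it is placed in the last bin.
        counts[min(int(tc * n_bins), n_bins - 1)] += 1
    return counts


# Toy data (invented for illustration, not taken from the paper).
query = {1, 2, 3, 4, 5, 6}                     # a known active used as reference
actives = [{1, 2, 3, 4, 5, 7}, {1, 2, 3, 8, 9}]
database = [{10, 11}, {1, 12, 13, 14}, {1, 2, 15, 16, 17}, {20, 21, 22}]

active_hist = tc_histogram(query, actives)
database_hist = tc_histogram(query, database)

# Print the two distributions side by side, one Tc interval per row.
for i, (n_act, n_db) in enumerate(zip(active_hist, database_hist)):
    print(f"Tc {i / 10:.1f}-{(i + 1) / 10:.1f}: actives={n_act} database={n_db}")
```

Placing the two histograms side by side shows in which Tc intervals active compounds are found relative to the bulk of the database, which is the kind of diagnostic view the abstract describes. A real analysis would use actual fingerprints and far larger compound sets.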
