4.7 Article

Classification of Cytochrome P450 Activities Using Machine Learning Methods

Journal

MOLECULAR PHARMACEUTICS
Volume 6, Issue 6, Pages 1920-1926

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/mp900217x

Keywords

QSAR; cytochrome P-450; machine learning; drug safety; drug design; support vector machine; artificial neural network; decision trees; k nearest neighbors; random forest


The cytochrome P-450 (CYP) system plays an integral part in the metabolism of drugs and other xenobiotics. Knowledge of the structural features required for interaction with any of the different isoforms of the CYP system is therefore immensely valuable in early drug discovery. In this paper, we focus on three major isoforms (CYP 1A2, CYP 2D6, and CYP 3A4) and present a data set of 335 structurally diverse drug compounds classified for their interaction (as substrate, inhibitor, or any interaction) with these isoforms. We also present machine learning models built with a variety of commonly used methods: k-nearest neighbors, decision tree induction using the CHAID and CRT algorithms, random forests, artificial neural networks, and support vector machines using the radial basis function (RBF) and homogeneous polynomials as kernel functions. We discuss the physicochemical features relevant for each end point and compare them to those reported in similar studies. Many of these models perform exceptionally well, even with 10-fold cross-validation, yielding corrected classification rates of 81.7 to 91.9% for CYP 1A2, 89.2 to 92.9% for CYP 2D6, and 87.4 to 89.9% for CYP 3A4. Our models help in understanding the structural requirements for CYP interactions and can serve as sensitive tools in virtual screenings and lead optimization for toxicological profiles in drug discovery.
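As a rough illustration of the kind of evaluation the abstract mentions, the sketch below runs a k-nearest-neighbors classifier under 10-fold cross-validation and reports the fraction of compounds classified correctly. It is not the authors' pipeline. The two descriptors, the synthetic data generator `make_toy_data`, the choice of k = 3, and all function names are assumptions made for illustration; only the data set size (335) and the number of folds (10) come from the abstract.

```python
import random


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    # Squared Euclidean distance is enough for ranking neighbors.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def cross_validated_accuracy(X, y, n_folds=10, k=3, seed=0):
    """Fraction of samples classified correctly when each fold is held out once."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Every index lands in exactly one fold.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
    return correct / len(X)


def make_toy_data(n=335, seed=1):
    """Synthetic stand-in for a descriptor matrix; NOT the published data set."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = interacts with the isoform, 0 = does not
        center = 2.0 if label else 0.0  # hypothetical descriptors shift with class
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
accuracy = cross_validated_accuracy(X, y)
print(f"10-fold CV classification rate: {accuracy:.3f}")
```

Because every compound is held out exactly once, the reported rate is measured only on compounds the model did not see during fitting, which is why cross-validated rates are a more conservative figure than rates measured on the training data.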
