4.7 Article

QuantumTox: Utilizing quantum chemistry with ensemble learning for molecular toxicity prediction

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 157, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2023.106744

Keywords

Toxicity prediction; Quantum chemistry; Ensemble learning; Molecular representation; Tox21; Gradient Boosting Decision Tree; Bagging

Ask authors/readers for more resources

Molecular toxicity prediction is crucial for drug discovery and human health. Existing machine learning models for toxicity prediction do not fully utilize the 3D information of molecules, which can influence their toxicity. In this study, we propose QuantumTox, the first application of quantum chemistry in drug molecule toxicity prediction. Our model extracts quantum chemical information as 3D features and uses ensemble learning methods to improve accuracy and generalization. Experimental results demonstrate consistent outperformance compared to baseline models, even on small datasets.
Molecular toxicity prediction plays an important role in drug discovery, which is directly related to human health and drug fate. Accurately determining the toxicity of molecules can help weed out low-quality molecules in the early stage of drug discovery process and avoid depletion later in the drug development process. Nowadays, more and more researchers are starting to use machine learning methods to predict the toxicity of molecules, but these models do not fully exploit the 3D information of molecules. Quantum chemical information, which provides stereo structural information of molecules, can influence their toxicity. To this end, we propose QuantumTox, the first application of quantum chemistry in the field of drug molecule toxicity prediction compared to existing work. We extract the quantum chemical information of molecules as their 3D features. In the downstream prediction phase, we use Gradient Boosting Decision Tree and Bagging ensemble learning methods together to improve the accuracy and generalization of the model. A series of experiments on various tasks show that our model consistently outperforms the baseline model and that the model still performs well on small datasets of less than 300.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available