Evaluation of gene-drug common module identification methods using pharmacogenomics data

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 3

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbaa087

Keywords

gene-drug interactions; common modules; machine learning; non-negative matrix factorization; partial least squares; network analyses

Funding

  1. National Natural Science Foundation of China [61771007]
  2. Health & Medical Collaborative Innovation Project of Guangzhou City [201803010021]

Accurately identifying the interactions between genomic factors and cancer drug response is crucial in drug discovery, drug repositioning and cancer treatment. Studies have shown that interactions between genes and drugs are 'many-genes-to-many-drugs' interactions, which calls for improved strategies to identify common modules in pharmacogenomics data. This paper evaluates state-of-the-art common module identification techniques from a machine learning perspective and highlights the importance of understanding the complex biological regulatory mechanisms behind cancer drug interactions.
Accurately identifying the interactions between genomic factors and cancer drug response plays an important role in drug discovery, drug repositioning and cancer treatment. A number of studies have revealed that interactions between genes and drugs are 'many-genes-to-many-drugs' interactions, i.e. common modules, as opposed to 'one-gene-to-one-drug' interactions. Such modules can more fully explain the interactions between complex biological regulatory mechanisms and cancer drugs. However, strategies for effectively and robustly identifying the underlying common modules in pharmacogenomics data remain to be improved. In this paper, we provide a detailed evaluation of three categories of state-of-the-art common module identification techniques from a machine learning perspective: non-negative matrix factorization (NMF), partial least squares (PLS) and network analyses. We first evaluate the performance of six methods, namely SNMNMF, NetNMF, SNPLS, O2PLS, NSBM and HOGMMNC, using two series of simulated data sets with different noise levels and outlier ratios. We then conduct experiments on a real-world data set of 2091 genes and 101 drugs in 392 cancer cell lines and compare the results in terms of biological process term enrichment, gene-drug interactions and drug-drug interactions. Finally, we present interesting findings from our evaluation study and discuss the advantages and drawbacks of each method.

Supplementary information: Supplementary file is available at Briefings in Bioinformatics online.
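To make the idea of a common module concrete, the following is a minimal sketch of a plain joint NMF, assuming a gene matrix X1 and a drug matrix X2 that share the same cell lines as rows. It is not SNMNMF, NetNMF or any other method evaluated in the paper. It only illustrates the shared-basis factorization those NMF-based methods build on. The function names, the simulated dimensions, the noise model and the z-score threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n_samples=60, n_genes=100, n_drugs=30, noise=0.1, seed=0):
    """Toy data with two planted gene-drug modules (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    W = np.zeros((n_samples, 2))
    W[:30, 0] = 1.0          # cell lines 0-29 carry module 0
    W[30:, 1] = 1.0          # cell lines 30-59 carry module 1
    H1 = np.zeros((2, n_genes))
    H1[0, :20] = 1.0         # genes 0-19 belong to module 0
    H1[1, 20:40] = 1.0       # genes 20-39 belong to module 1
    H2 = np.zeros((2, n_drugs))
    H2[0, :5] = 1.0          # drugs 0-4 belong to module 0
    H2[1, 5:10] = 1.0        # drugs 5-9 belong to module 1
    # Absolute Gaussian noise keeps both matrices non-negative.
    X1 = W @ H1 + noise * np.abs(rng.standard_normal((n_samples, n_genes)))
    X2 = W @ H2 + noise * np.abs(rng.standard_normal((n_samples, n_drugs)))
    return X1, X2


def joint_nmf(X1, X2, k, n_iter=200, eps=1e-9, seed=0):
    """Factor X1 ~ W @ H1 and X2 ~ W @ H2 with one shared non-negative W.

    Uses multiplicative updates for the summed squared Frobenius error.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X1.shape[0], k))
    H1 = rng.random((k, X1.shape[1]))
    H2 = rng.random((k, X2.shape[1]))
    for _ in range(n_iter):
        W *= (X1 @ H1.T + X2 @ H2.T) / (W @ (H1 @ H1.T + H2 @ H2.T) + eps)
        H1 *= (W.T @ X1) / (W.T @ W @ H1 + eps)
        H2 *= (W.T @ X2) / (W.T @ W @ H2 + eps)
    return W, H1, H2


def reconstruction_error(X1, X2, W, H1, H2):
    """Summed squared Frobenius error over both data matrices."""
    return float(np.sum((X1 - W @ H1) ** 2) + np.sum((X2 - W @ H2) ** 2))


def select_members(H, z=1.5):
    """Per component, return column indices whose z-score exceeds z.

    The threshold is an assumed illustrative value, not taken from the paper.
    """
    scores = (H - H.mean(axis=1, keepdims=True)) / (H.std(axis=1, keepdims=True) + 1e-12)
    return [np.flatnonzero(row > z) for row in scores]


if __name__ == "__main__":
    X1, X2 = simulate()
    W, H1, H2 = joint_nmf(X1, X2, k=2)
    print("error:", reconstruction_error(X1, X2, W, H1, H2))
    for m, (genes, drugs) in enumerate(zip(select_members(H1), select_members(H2))):
        print(f"module {m}: {len(genes)} genes, {len(drugs)} drugs")
```

Each component of the shared factor W links a set of genes (large entries in a row of H1) to a set of drugs (large entries in the same row of H2), which is one way to read a 'many-genes-to-many-drugs' module. Raising the noise argument of simulate gives a rough feel for the noise-level experiments described above, although the paper's own simulation design may differ.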
