4.7 Review

Transcriptomic Harmonization as the Way for Suppressing Cross-Platform Bias and Batch Effect

Related references

Note: Only some of the references are listed.
Article Biochemical Research Methods

A novel meta-analysis based on data augmentation and elastic data shared lasso regularization for gene expression

Hai-Hui Huang et al.

Summary: In this study, a novel meta-analysis framework is proposed for improving the performance of gene expression analysis. The framework incorporates data augmentation and elastic data shared lasso methods. Experimental results show that the proposed method has high prediction and gene selection performance. Furthermore, the method is successfully applied to non-small cell lung cancer research.

BMC BIOINFORMATICS (2022)

Article Computer Science, Artificial Intelligence

The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation

Davide Chicco et al.

Summary: Regression analysis is a key component of supervised machine learning: it predicts a continuous target from other variables. Whereas binary classification targets take only two values, regression targets can take many. However, there is no consensus on a standard metric for evaluating regression results, and commonly used measures such as MSE, RMSE, MAE, and MAPE have interpretability limitations.

PEERJ COMPUTER SCIENCE (2021)

Article Biochemical Research Methods

CuBlock: a cross-platform normalization method for gene-expression microarrays

Valentin Junet et al.

Summary: The study focused on normalizing gene-expression microarray data across multiple platforms and time points. It developed the CuBlock algorithm, which demonstrated superior performance in separating samples from different biological groups. Other popular methods were found not to be applicable in this context.

BIOINFORMATICS (2021)

Article Biochemistry & Molecular Biology

Synthetic lethality-mediated precision oncology via the tumor transcriptome

Joo Sang Lee et al.

Summary: Precision oncology has advanced significantly by targeting actionable mutations in cancer driver genes and exploring the utility of tumor transcriptome to guide patient treatment. SELECT, a precision oncology framework harnessing genetic interactions, has shown predictive accuracy in 80% of tested clinical trials and is publicly available for academic use, providing a foundation for future prospective clinical studies.

Article Multidisciplinary Sciences

Identification of transcriptional subtypes in lung adenocarcinoma and squamous cell carcinoma through integrative analysis of microarray and RNA sequencing data

Francois Fauteux et al.

Summary: Classification of lung cancer subtypes based on different gene expression profiling technologies can inform personalized treatment approaches. By integrating microarray and RNA-seq data and utilizing specific preprocessing, cross-platform normalization, and unsupervised feature selection methods, robust gene expression subtypes can be identified. This study confirms the existence of three lung adenocarcinoma transcriptional subtypes, two squamous cell carcinoma subtypes, and shows that these tumor subtypes are associated with distinct patterns of genomic alterations in therapeutic target genes. Integration of quantitative proteomics data allows for the identification of tumor subtype biomarkers that effectively classify samples based on both gene and protein expression, providing a basis for further integrative data analysis across gene and protein expression profiling platforms.

SCIENTIFIC REPORTS (2021)

Review Biochemical Research Methods

Demystifying emerging bulk RNA-Seq applications: the application and utility of bioinformatic methodology

Amarinder Singh Thind et al.

Summary: Innovations in next-generation sequencing techniques and bioinformatics tools have revolutionized our understanding of RNA. Bulk RNA-Seq data is commonly used to study gene expression, isoform expression, alternative splicing, and more, with hidden biological information such as copy number alterations and presence of neoantigens also being extracted. Advanced bioinformatic algorithms have expanded the capacity to retrieve this hidden biological information, positioning bulk RNA-Seq as a powerful tool for providing biological insights.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Biochemical Research Methods

Cross-platform comparison of immune-related gene expression to assess intratumor immune responses following cancer immunotherapy

Li Zhang et al.

Summary: Neoadjuvant immunotherapy can induce immune responses within the tumor microenvironment, and gene expression can be used to assess these responses. This study aimed to evaluate the concordance of immune-related gene expression data across different platforms and panels. While a high level of consistency was observed, there were subsets of genes that showed differential expression across the panels. Despite these differences, the HTG PIP panel showed the best classification performance among the three panels.

JOURNAL OF IMMUNOLOGICAL METHODS (2021)

Article Multidisciplinary Sciences

A metabolomics pipeline for the mechanistic interrogation of the gut microbiome

Shuo Han et al.

Summary: Research has shown that gut microorganisms can influence host physiology. Mass spectrometry can accelerate the identification of microbial metabolites, enabling in-depth study of the relationship between microorganisms and hosts.

NATURE (2021)

Article Biochemistry & Molecular Biology

Rank-in: enabling integrative analysis across microarray and RNA-seq for cancer

Kailin Tang et al.

Summary: Although transcriptomics technologies have advanced rapidly in the past decades, integrating mixed data from microarray and RNA-seq remains challenging due to inherent variability differences. Rank-In is a novel method proposed to correct nonbiological effects and enable consolidated analysis of blended data. Validated on public cell and tissue samples, Rank-In demonstrated superior classification and prediction accuracy, showing potential for integrative study of cancer profiles.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemical Research Methods

MatchMixeR: a cross-platform normalization method for gene expression data integration

Serin Zhang et al.

BIOINFORMATICS (2020)

Article Biochemistry & Molecular Biology

Flexible Data Trimming Improves Performance of Global Machine Learning Methods in Omics-Based Personalized Oncology

Victor Tkachev et al.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2020)

Article Biochemistry & Molecular Biology

Disparity between Inter-Patient Molecular Heterogeneity and Repertoires of Target Drugs Used for Different Types of Cancer in Clinical Oncology

Marianna A. Zolotovskaia et al.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2020)

Article Genetics & Heredity

Cancer gene expression profiles associated with clinical outcomes to chemotherapy treatments

Nicolas Borisov et al.

BMC MEDICAL GENOMICS (2020)

Article Rheumatology

An integrative Bayesian network approach to highlight key drivers in systemic lupus erythematosus

Samaneh Maleknia et al.

ARTHRITIS RESEARCH & THERAPY (2020)

Article Biochemical Research Methods

Shambhala: a platform-agnostic data harmonizer for gene expression data

Nicolas Borisov et al.

BMC BIOINFORMATICS (2019)

Article Multidisciplinary Sciences

Atlas of RNA sequencing profiles for normal human tissues

Maria Suntsova et al.

SCIENTIFIC DATA (2019)

Article Biology

Clinical intelligence: New machine learning techniques for predicting clinical drug response

Turki Turki et al.

COMPUTERS IN BIOLOGY AND MEDICINE (2019)

Article Multidisciplinary Sciences

Digitizing omics profiles by divergence from a baseline

Wikum Dinalankara et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2018)

Article Biology

Clinker: visualizing fusion genes detected in RNA-seq data

Breon M. Schmidt et al.

GIGASCIENCE (2018)

Article Biochemical Research Methods

Node-based learning of differential networks from multi-platform gene expression data

Le Ou-Yang et al.

METHODS (2017)

Review Mathematical & Computational Biology

Ten quick tips for machine learning in computational biology

Davide Chicco

BIODATA MINING (2017)

Article Biotechnology & Applied Microbiology

RNA-seq Based Transcription Characterization of Fusion Breakpoints as a Potential Estimator for Its Oncogenic Potential

Jian-lei Gu et al.

BIOMED RESEARCH INTERNATIONAL (2017)

Article Biochemical Research Methods

DBNorm: normalizing high-density oligonucleotide microarray data based on distributions

Qinxue Meng et al.

BMC BIOINFORMATICS (2017)

Article Biochemical Research Methods

Gene expression inference with deep learning

Yifei Chen et al.

BIOINFORMATICS (2016)

Article Oncology

Novel fusion transcripts in bladder cancer identified by RNA-seq

T. Kekeeva et al.

CANCER LETTERS (2016)

Article Multidisciplinary Sciences

Cross-platform normalization of microarray and RNA-seq data for machine learning applications

Jeffrey A. Thompson et al.

PEERJ (2016)

Article Biochemical Research Methods

Seq-ing improved gene expression estimates from microarrays using machine learning

Paul K. Korir et al.

BMC BIOINFORMATICS (2015)

Article Multidisciplinary Sciences

The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans

Kristin G. Ardlie et al.

SCIENCE (2015)

Review

The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge

Katarzyna Tomczak et al.

WSPOLCZESNA ONKOLOGIA-CONTEMPORARY ONCOLOGY (2015)

Article Biochemical Research Methods

PLIDA: cross-platform gene expression normalization using perturbed topic models

Amit G. Deshwar et al.

BIOINFORMATICS (2014)

Article Biotechnology & Applied Microbiology

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

Michael I. Love et al.

GENOME BIOLOGY (2014)

Article Biochemical Research Methods

Batch effect removal methods for microarray gene expression data integration: a survey

Cosmin Lazar et al.

BRIEFINGS IN BIOINFORMATICS (2013)

Editorial Material Genetics & Heredity

The Genotype-Tissue Expression (GTEx) project

John Lonsdale et al.

NATURE GENETICS (2013)

Article Biochemistry & Molecular Biology

Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells

Wanjuan Yang et al.

NUCLEIC ACIDS RESEARCH (2013)

Article Multidisciplinary Sciences

Multiplatform single-sample estimates of transcriptional activation

Stephen R. Piccolo et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2013)

Article Biochemical Research Methods

fRMA ST: frozen robust multiarray analysis for Affymetrix Exon and Gene ST arrays

Matthew N. McCall et al.

BIOINFORMATICS (2012)

Article Biochemical Research Methods

R/DWD: distance-weighted discrimination for classification, visualization and batch adjustment

Hanwen Huang et al.

BIOINFORMATICS (2012)

Article Biotechnology & Applied Microbiology

A single-sample microarray normalization method to facilitate personalized-medicine workflows

Stephen R. Piccolo et al.

GENOMICS (2012)

Article Biochemistry & Molecular Biology

RNA Sequencing: Platform Selection, Experimental Design, and Data Interpretation

Yongjun Chu et al.

NUCLEIC ACID THERAPEUTICS (2012)

Article Biochemical Research Methods

Empirical comparison of cross-platform normalization methods for gene expression data

Jason Rudy et al.

BMC BIOINFORMATICS (2011)

Article Biochemical Research Methods

Assessing Affymetrix GeneChip microarray quality

Matthew N. McCall et al.

BMC BIOINFORMATICS (2011)

Article Biochemistry & Molecular Biology

The Gene Expression Barcode: leveraging public data repositories to begin cataloging the human and murine transcriptomes

Matthew N. McCall et al.

NUCLEIC ACIDS RESEARCH (2011)

Article Mathematical & Computational Biology

Frozen robust multiarray analysis (fRMA)

Matthew N. McCall et al.

BIOSTATISTICS (2010)

Article Computer Science, Interdisciplinary Applications

Regularization Paths for Generalized Linear Models via Coordinate Descent

Jerome Friedman et al.

JOURNAL OF STATISTICAL SOFTWARE (2010)

Article Chemistry, Medicinal

How to Deal with Batch Effect in Sequential Microarray Experiments?

Nino Demetrashvili et al.

MOLECULAR INFORMATICS (2010)

Article Biotechnology & Applied Microbiology

Differential expression analysis for sequence count data

Simon Anders et al.

GENOME BIOLOGY (2010)

Article Biochemical Research Methods

WebArrayDB: cross-platform microarray data analysis and public data repository

Xiao-Qin Xia et al.

BIOINFORMATICS (2009)

Article Biotechnology & Applied Microbiology

Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing

Shrinivasrao P. Mane et al.

BMC GENOMICS (2009)

Article Multidisciplinary Sciences

Transcriptome sequencing to detect gene fusions in cancer

Christopher A. Maher et al.

NATURE (2009)

Review Genetics & Heredity

RNA-Seq: a revolutionary tool for transcriptomics

Zhong Wang et al.

NATURE REVIEWS GENETICS (2009)

Article Biochemical Research Methods

GenMiner: mining non-redundant association rules from integrated gene expression data and annotations

Ricardo Martinez et al.

BIOINFORMATICS (2008)

Article Biochemical Research Methods

Merging two gene-expression studies via cross-platform normalization

Andrey A. Shabalin et al.

BIOINFORMATICS (2008)

Article Multidisciplinary Sciences

The transcriptional landscape of the yeast genome defined by RNA sequencing

Ugrappa Nagalakshmi et al.

SCIENCE (2008)

Article Biochemical Research Methods

GSEA-P: A desktop application for Gene Set Enrichment Analysis

Aravind Subramanian et al.

BIOINFORMATICS (2007)

Article Statistics & Probability

Distance-weighted discrimination

J. S. Marron et al.

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION (2007)

Letter Biochemical Research Methods

Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data

James J. Chen et al.

BMC BIOINFORMATICS (2007)

Letter Biotechnology & Applied Microbiology

MAQC papers over the cracks

Peng Liang

NATURE BIOTECHNOLOGY (2007)

Article Biochemistry & Molecular Biology

ArrayExpress - a public database of microarray experiments and gene expression profiles

H. Parkinson et al.

NUCLEIC ACIDS RESEARCH (2007)

Article Mathematical & Computational Biology

Adjusting batch effects in microarray expression data using empirical Bayes methods

W. Evan Johnson et al.

BIOSTATISTICS (2007)

Article Biotechnology & Applied Microbiology

The molecular portraits of breast tumors are conserved across microarray platforms

Zhiyuan Hu et al.

BMC GENOMICS (2006)

Article Statistics & Probability

A model-based background adjustment for oligonucleotide expression arrays

ZJ Wu et al.

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION (2004)

Article Biochemical Research Methods

Joint analysis of two microarray gene-expression data sets to select lung adenocarcinoma marker genes

HY Jiang et al.

BMC BIOINFORMATICS (2004)

Article Biochemical Research Methods

Adjustment of systematic microarray data biases

M Benito et al.

BIOINFORMATICS (2004)

Article Biology

ArrayExpress: a public database of gene expression data at EBI

P Rocca-Serra et al.

COMPTES RENDUS BIOLOGIES (2003)

Article Medicine, General & Internal

Gene expression predictors of breast cancer outcomes

E Huang et al.

LANCET (2003)

Article Mathematical & Computational Biology

Exploration, normalization, and summaries of high density oligonucleotide array probe level data

RA Irizarry et al.

BIOSTATISTICS (2003)

Article Multidisciplinary Sciences

Diagnosis of multiple cancer types by shrunken centroids of gene expression

R Tibshirani et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2002)

Article Multidisciplinary Sciences

Gene expression profiling predicts clinical outcome of breast cancer

LJ van't Veer et al.

NATURE (2002)

Article Biochemistry & Molecular Biology

Gene Expression Omnibus: NCBI gene expression and hybridization array data repository

R Edgar et al.

NUCLEIC ACIDS RESEARCH (2002)

Article Medicine, General & Internal

Gene expression profile analysis by DNA microarrays - Promise and pitfalls

HC King et al.

JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION (2001)

Article Computer Science, Artificial Intelligence

Bounds on error expectation for support vector machines

V Vapnik et al.

NEURAL COMPUTATION (2000)