Article

More Accurate Transcript Assembly via Parameter Advising

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 27, Issue 8, Pages 1181-1189

Publisher

MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2019.0286

Keywords

automated bioinformatics; genomics; parameter advising; transcript assembly

Funding

  1. Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative [GBMF4554]
  2. U.S. National Institutes of Health [R01HG007104, R01GM122935]
  3. Shurl and Kay Curci Foundation

Computational tools for genomic analysis are becoming more accurate but also more sophisticated and complex. As a result, these tools expose many tunable parameters, and the choice of values often has a large influence on the reported results. We quantify the impact of parameter choice on transcript assembly and take first steps toward a fully automated genomic analysis pipeline by developing a method that automatically chooses input-specific parameter values for reference-based transcript assembly with the Scallop tool. Choosing parameter values for each input increases the area under the receiver operating characteristic curve (AUC), measured by comparing assembled transcripts to a reference transcriptome, by an average of 28.9% over the default parameter choices on 1595 RNA-Seq samples from the Sequence Read Archive. The approach is general: applied to StringTie, it increases the AUC by an average of 13.1% on a set of 65 RNA-Seq experiments from ENCODE. Parameter advisors for both Scallop and StringTie are available on GitHub.
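The idea of choosing parameter values per input can be illustrated with a minimal sketch. It assumes that an advisor holds a small set of candidate parameter settings, scores each one on the given input, and keeps the best. The names `advise`, `percent_increase`, `toy_auc`, and the parameter key `min_coverage` are hypothetical and are not taken from the Scallop or StringTie advisors. The scoring function stands in for running an assembler and computing the AUC against a reference transcriptome.

```python
from typing import Callable, Dict, Iterable, Tuple

# A parameter setting is modeled as a mapping from (hypothetical) names to values.
Params = Dict[str, float]


def advise(candidates: Iterable[Params],
           score: Callable[[Params], float]) -> Tuple[Params, float]:
    """Return the candidate setting with the highest score, and that score.

    `score` is assumed to run the tool on one input with the given
    setting and return an accuracy estimate such as the AUC.
    """
    best_params = None
    best_score = float("-inf")
    for params in candidates:
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    if best_params is None:
        raise ValueError("candidate set is empty")
    return best_params, best_score


def percent_increase(default_score: float, advised_score: float) -> float:
    """Relative improvement of the advised score over the default, in percent."""
    return 100.0 * (advised_score - default_score) / default_score


if __name__ == "__main__":
    # Toy stand-in for "assemble, then compute AUC"; purely illustrative.
    def toy_auc(params: Params) -> float:
        return 0.6 - 0.05 * abs(params["min_coverage"] - 2.0)

    default = {"min_coverage": 1.0}
    candidate_set = [default, {"min_coverage": 2.0}, {"min_coverage": 4.0}]

    chosen, chosen_auc = advise(candidate_set, toy_auc)
    print(chosen, chosen_auc, percent_increase(toy_auc(default), chosen_auc))
```

Keeping the default setting in the candidate set means that, under this sketch, the advised score is never lower than the default score for the same input.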
