Tximeta: Reference sequence checksums for provenance identification in RNA-seq

PLOS COMPUTATIONAL BIOLOGY (2020)

Journal

PLOS COMPUTATIONAL BIOLOGY

Volume 16, Issue 2

Publisher

PUBLIC LIBRARY SCIENCE

DOI: 10.1371/journal.pcbi.1007664

Funding

NSF [PRFB 1711984]
[R01 HG009937]
[R01 MH118349]
[P01 CA142538]
[P30 ES010126]
[U41 HG004059]
[BIO-1564917]
[CCF-1750472]
[CNS1763680]

Abstract

Correct annotation metadata is critical for reproducible and accurate RNA-seq analysis. When files are shared publicly or among collaborators with incorrect or missing annotation metadata, it becomes difficult or impossible to reproduce bioinformatic analyses from raw data. It also makes it more difficult to locate the transcriptomic features, such as transcripts or genes, in their proper genomic context, which is necessary for overlapping expression data with other datasets. We provide a solution in the form of an R/Bioconductor package tximeta that performs numerous annotation and metadata gathering tasks automatically on behalf of users during the import of transcript quantification files. The correct reference transcriptome is identified via a hashed checksum stored in the quantification output, and key transcript databases are downloaded and cached locally. The computational paradigm of automatically adding annotation metadata based on reference sequence checksums can greatly facilitate genomic workflows, by helping to reduce overhead during bioinformatic analyses, preventing costly bioinformatic mistakes, and promoting computational reproducibility. The tximeta package is available at https://bioconductor.org/packages/tximeta.
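
The checksum idea in the abstract can be illustrated with a minimal sketch. This is not tximeta's implementation (tximeta is an R/Bioconductor package). It only shows the general technique of hashing reference transcript sequences and using the digest as a lookup key for annotation metadata. The choice of SHA-256, the function names, and the toy reference table are all assumptions made for illustration.

```python
import hashlib


def sequence_checksum(sequences):
    """Return a SHA-256 hex digest over transcript sequences, hashed in order.

    Only the sequence content is hashed (upper-cased); names and headers are
    ignored, so renaming a file or a transcript does not change the checksum.
    """
    digest = hashlib.sha256()
    for seq in sequences:
        digest.update(seq.upper().encode("ascii"))
    return digest.hexdigest()


# Hypothetical toy references standing in for real transcriptome releases.
TOY_REFERENCES = {
    "toy_release_A": ["ACGTACGT", "GGGTTTAA"],
    "toy_release_B": ["ACGTACGT", "GGGTTTAC"],
}

# Hypothetical lookup table: checksum -> provenance metadata.
KNOWN_CHECKSUMS = {
    sequence_checksum(seqs): {"source": name}
    for name, seqs in TOY_REFERENCES.items()
}


def identify_reference(checksum, known=KNOWN_CHECKSUMS):
    """Return the metadata for a known checksum, or None if it is unrecognized."""
    return known.get(checksum)


if __name__ == "__main__":
    observed = sequence_checksum(["acgtacgt", "gggtttaa"])
    print(identify_reference(observed))
```

Because the key is derived from the sequences themselves, a single changed base yields a different digest, so a quantification produced against a different reference is not silently matched to the wrong annotation.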
