Article

Systematic bias in high-throughput sequencing data and its correction by BEADS

Journal

NUCLEIC ACIDS RESEARCH
Volume 39, Issue 15

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkr425

Funding

  1. National Human Genome Research Institute [1-U01-HG004270-01]
  2. Wellcome Trust [054523]
  3. Cambridge Newton Trust
  4. Cancer Research UK

Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina's Genome Analyser platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate because of sample-to-sample variation in bias. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be performed routinely prior to data interpretation and downstream analyses.
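
The abstract names three sources of bias (GC content, mappability, and regional effects) but does not spell out the individual steps of the scheme. As a rough illustration only, the sketch below shows one way such a three-stage correction could look on pre-binned coverage. It is not the published BEADS implementation: the function names, the per-bin GC stratification, the `min_map` threshold, and the use of a GC- and mappability-corrected input track as the regional-bias estimate are all assumptions made for this example.

```python
import numpy as np


def gc_correct(cov, gc, n_strata=10):
    """Rescale coverage so that every GC stratum has the same mean coverage.

    Hypothetical helper: `cov` is per-bin coverage, `gc` the per-bin GC fraction.
    """
    cov = np.asarray(cov, dtype=float)
    gc = np.asarray(gc, dtype=float)
    edges = np.linspace(0.0, 1.0, n_strata + 1)
    # Use interior edges only, so stratum indices run from 0 to n_strata - 1.
    strata = np.digitize(gc, edges[1:-1])
    overall = cov.mean()
    out = cov.copy()
    for s in np.unique(strata):
        sel = strata == s
        stratum_mean = cov[sel].mean()
        if stratum_mean > 0:
            out[sel] *= overall / stratum_mean
    return out


def mappability_correct(cov, mappability, min_map=0.25):
    """Divide coverage by the mappable fraction; mask poorly mappable bins as NaN.

    The `min_map` cutoff is an assumed value chosen for illustration.
    """
    cov = np.asarray(cov, dtype=float)
    mappability = np.asarray(mappability, dtype=float)
    out = np.full(cov.shape, np.nan)
    ok = mappability >= min_map
    out[ok] = cov[ok] / mappability[ok]
    return out


def normalize(chip, input_cov, gc, mappability):
    """Apply the same GC and mappability corrections to the ChIP and input
    tracks, then divide by the corrected input (scaled to mean 1) as an
    estimate of the remaining regional bias."""
    chip_c = mappability_correct(gc_correct(chip, gc), mappability)
    input_c = mappability_correct(gc_correct(input_cov, gc), mappability)
    regional = input_c / np.nanmean(input_c)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = chip_c / regional
    ratio[~np.isfinite(ratio)] = np.nan
    return ratio
```

In this sketch the input track is corrected for GC content and mappability before it is used as a divisor, so that it contributes only the residual regional signal instead of its own sample-specific GC bias.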