4.7 Article

LOESS correction for length variation in gene set-based genomic sequence analysis

期刊

BIOINFORMATICS
卷 28, 期 11, 页码 1446-1454

出版社

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bts155

关键词

-

资金

  1. National Institutes of Health [NIH/NHGRI] [R01 HG005287]
  2. American Heart Association

向作者/读者索取更多资源

Motivation: Sequence analysis algorithms are often applied to sets of DNA, RNA or protein sequences to identify common or distinguishing features. Controlling for sequence length variation is critical to properly score sequence features and identify true biological signals rather than length-dependent artifacts. Results: Several cis-regulatory module discovery algorithms exhibit a substantial dependence between DNA sequence score and sequence length. Our newly developed LOESS method is flexible in capturing diverse score-length relationships and is more effective in correcting DNA sequence scores for length-dependent artifacts, compared with four other approaches. Application of this method to genes co-expressed during Drosophila melanogaster embryonic mesoderm development or neural development scored by the Lever motif analysis algorithm resulted in successful recovery of their biologically validated cis-regulatory codes. The LOESS length-correction method is broadly applicable, and may be useful not only for more accurate inference of cis-regulatory codes, but also for detection of other types of patterns in biological sequences.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据