Article

Superbubbles, Ultrabubbles, and Cacti

Journal

JOURNAL OF COMPUTATIONAL BIOLOGY
Volume 25, Issue 7, Pages 649-663

Publisher

MARY ANN LIEBERT, INC
DOI: 10.1089/cmb.2017.0251

Keywords

genome assembly; genome graphs; genomic variation; sequence analysis; variant discovery

Funding

  1. National Human Genome Research Institute of the National Institutes of Health [5U54HG007990]
  2. W.M. Keck Foundation
  3. Simons Foundation


A superbubble is a type of directed acyclic subgraph with single distinct source and sink vertices. In genome assembly and genetics, the possible paths through a superbubble can be considered to represent the set of possible sequences at a location in a genome. Bidirected and biedged graphs are generalizations of digraphs that are increasingly used to represent genome assembly and variation problems more fully. In this study, we define snarls and ultrabubbles, generalizations of superbubbles for bidirected and biedged graphs, and give an efficient algorithm for detecting these more general structures. Key to this algorithm is the cactus graph, which, we show, encodes the nested decomposition of a graph into snarls and ultrabubbles within its structure. We propose and demonstrate empirically that this decomposition on bidirected and biedged graphs solves a fundamental problem by defining genetic sites for any collection of genomic variations, including complex structural variations, without the need for any single reference genome coordinate system. Further, the nesting of the decomposition gives a natural way to describe and model variations contained within larger variations, a case not currently handled by existing formats [e.g., variant call format (VCF)].
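As a small illustration of the statement that paths through a superbubble represent the possible sequences at a site, the sketch below enumerates every source-to-sink path in a toy directed acyclic graph and spells out the sequence along each path. It is not the detection algorithm from the paper, and it handles only plain digraphs, not bidirected or biedged graphs. The function name `enumerate_alleles`, the adjacency-dictionary representation, and the toy vertex labels are all assumptions made for this example.

```python
from typing import Dict, List


def enumerate_alleles(
    edges: Dict[str, List[str]],
    labels: Dict[str, str],
    source: str,
    sink: str,
) -> List[str]:
    """Return the sequences spelled by all source-to-sink paths, sorted.

    Assumes the subgraph between source and sink is acyclic, as it is
    inside a superbubble; a cycle would make this loop forever.
    """
    alleles: List[str] = []
    # Each stack entry is (current vertex, sequence spelled so far).
    stack = [(source, labels[source])]
    while stack:
        vertex, seq = stack.pop()
        if vertex == sink:
            alleles.append(seq)
            continue
        for nxt in edges.get(vertex, []):
            stack.append((nxt, seq + labels[nxt]))
    return sorted(alleles)


# Hypothetical toy site: a single-base substitution (C or T)
# between two shared flanking vertices.
toy_edges = {"s": ["a", "b"], "a": ["t"], "b": ["t"]}
toy_labels = {"s": "GA", "a": "C", "b": "T", "t": "TG"}
alleles = enumerate_alleles(toy_edges, toy_labels, "s", "t")
print(alleles)  # ['GACTG', 'GATTG']
```

The number of paths can grow exponentially with the size of a bubble, so exhaustive enumeration like this is practical only for small sites.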
