Multiple alignment-free sequence comparison

BIOINFORMATICS (2013)

Journal

BIOINFORMATICS

Volume 29, Issue 21, Pages 2690-2698

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bioinformatics/btt462

Funding

Oxford Martin School
US NIH [R21HG006199]
NSF [DMS-1043075]
NSF Directorate for Geosciences, Division of Ocean Sciences (OCE) [1136818]
National Natural Science Foundation of China [31171262, 11021463]
National Key Basic Research Project of China [2009CB918503]
Engineering and Physical Sciences Research Council (EPSRC) [EP/K032402/1]

Abstract

Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, $C_l^*$ and $C_l^S$, extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, $\overline{C_2^*}$, $\overline{C_2^S}$ and $\overline{C_2^{geo}}$, averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences.

Results: Our investigation uses both simulated data as well as cis-regulatory module data where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data, all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics.
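
As a rough illustration of what "k-tuple word content" and an "average of pairwise comparison statistics" can mean in practice, the following minimal Python sketch counts overlapping k-tuples and averages the plain, uncentred D2 statistic (the inner product of two k-tuple count vectors) over all pairs of sequences. This is a deliberately simplified stand-in: the star-type and Shepp-type statistics named in the abstract are not implemented here, and the function names `ktuple_counts`, `d2` and `average_pairwise_d2` are hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def ktuple_counts(seq, k):
    """Count every overlapping k-tuple (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(counts_x, counts_y):
    """Plain D2: inner product of two k-tuple count vectors.

    Assumption: this is the simple uncentred statistic, used here only
    as a placeholder for the pairwise statistics of the paper.
    """
    return sum(n * counts_y[w] for w, n in counts_x.items())


def average_pairwise_d2(seqs, k):
    """Average D2 over all unordered pairs of sequences in seqs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    counts = [ktuple_counts(s, k) for s in seqs]
    pairs = list(combinations(counts, 2))
    return sum(d2(a, b) for a, b in pairs) / len(pairs)
```

For instance, `average_pairwise_d2(["ACGT", "ACGA", "TTTT"], 2)` averages three pairwise values, one per unordered pair of the three sequences.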
