Article

sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations

Journal

BMC BIOINFORMATICS
Volume 23, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s12859-022-05064-4

Keywords

Phylogenetics; Sequence analysis; Visualization

Funding

  1. National Key Research and Development Program [2021YFC230130]
  2. CAMS Innovation Fund for Medical Sciences [2021-I2M-1-061]
  3. National Natural Science Foundation of China [92169106]
  4. Special Research Fund for Central Universities, Peking Union Medical College [2021-PT180-001]
  5. China Postdoctoral Science Foundation [2019M660548, 2020T130007ZX]
  6. Suzhou Science and Technology Development Plan [szs2020311]
  7. Natural Science Foundation of Jiangsu Province [BK20220278]
  8. Youthful Teacher Project of Peking Union Medical College [3332019114]

sitePath is a computational tool based on R that automates the identification of polymorphism clades, inference of fixed and parallel mutations, and generation of high-quality figures.
Background

Identifying polymorphism clades on phylogenetic trees could help detect punctual mutations that are associated with viral functions. When a visualization tool colors the tree by residue, clades in which most sequences share the same polymorphism state are easy to spot by eye. However, viral sequences are accumulating too quickly for manual inspection, so a computational tool that automates this process is urgently needed.

Results

We developed an R package named sitePath that identifies polymorphism clades automatically using a branch-and-bound-like search method. Fixed and parallel mutations can then be inferred from the identified polymorphism clades. sitePath also integrates visualization tools to generate figures of the calculated results. In an example with an influenza A virus H3N2 dataset, the detected fixed mutations coincide with antigenic shift mutations. sitePath achieved high specificity and sensitivity in finding fixed mutations across a range of parameters and with different phylogenetic tree inference software.

Conclusions

The results suggest that sitePath can identify polymorphism clades per site. The clustering of sequences on a phylogenetic tree can be used to infer fixed and parallel mutations. sitePath can also generate high-quality figures of the calculated results.
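To make the idea of a per-site polymorphism clade concrete, the following is a minimal Python sketch. It is not sitePath's branch-and-bound-like search and does not use the sitePath API. It assumes a toy tree stored as nested tuples, a toy alignment, and hypothetical names (`TREE`, `ALIGNMENT`, `polymorphism_clades`, `min_purity`, `min_size`). It walks the tree from the root and reports the largest clades in which one residue dominates at the chosen site.

```python
from collections import Counter

# Toy tree as nested tuples; strings are tip names (hypothetical data).
TREE = ((("s1", "s2"), ("s3", "s4")), (("s5", "s6"), "s7"))

# Toy alignment: tip name -> sequence (hypothetical data).
ALIGNMENT = {
    "s1": "MKT", "s2": "MKT", "s3": "MKT", "s4": "MRT",
    "s5": "MRT", "s6": "MRT", "s7": "MRT",
}


def tips(node):
    """Return all tip names under a node."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in tips(child)]


def polymorphism_clades(node, site, min_purity=0.9, min_size=2):
    """Top-down search for maximal clades dominated by one residue at `site`.

    A clade qualifies when it has at least `min_size` tips and the most
    common residue covers at least `min_purity` of them. Both thresholds
    are illustrative assumptions, not sitePath parameters.
    """
    names = tips(node)
    counts = Counter(ALIGNMENT[name][site] for name in names)
    residue, count = counts.most_common(1)[0]
    if len(names) >= min_size and count / len(names) >= min_purity:
        return [(residue, sorted(names))]
    if isinstance(node, str):
        return []  # a single tip that is too small to count as a clade
    found = []
    for child in node:
        found.extend(polymorphism_clades(child, site, min_purity, min_size))
    return found


clades = polymorphism_clades(TREE, site=1)
for residue, members in clades:
    print(residue, members)
```

In this toy example, the clade containing s5, s6 and s7 carries R at the second alignment position, while s1 and s2 form a clade carrying K. A real analysis would also need branch lengths, tolerance for sequencing noise, and a principled choice of thresholds.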

