4.5 Article Proceedings Paper

Fast median computation for symmetric, orthogonal matrices under the rank distance

Journal

LINEAR ALGEBRA AND ITS APPLICATIONS
Volume 614, Issue -, Pages 394-414

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.laa.2020.10.030

Keywords

Orthogonal matrices; Hermitian, skew-Hermitian, and related matrices; Eigenvalues, singular values, and eigenvectors

Funding

  1. State of Sao Paulo Research Foundation(FAPESP) [2018/00031-7]
  2. National Science and Engineering Research Council (NSERC) of Canada [RGPIN/04622-2016]
  3. Sloan Foundation [FG-2016-6392]

Ask authors/readers for more resources

Biological genomes can be represented by matrices, and the rank distance between genomes is related to the minimum number of rearrangement mutations explaining their differences. The median problem in genome matrices is computationally complex, but fast algorithms exist for orthogonal matrices. These algorithms use rank-1 steps towards the median and can be found in polynomial time.
Biological genomes can be represented as square, symmetric, orthogonal, 0-1 matrices. It turns out that the rank distance applied to two genome matrices has a biological significance: it is related to the smallest number of basic rearrangement mutations, such as reversals, translocations, transpositions (taken with weight 2), etc. that explain the differences between the two genomes. Therefore, closer genomes will produce smaller rank distances. An important tool in this context is the median problem: given three genomes A, B, and C, find a fourth genome M that minimizes d(A, M) + d(B, M) + d(C, M). For genome matrices, the computational complexity of this problem is currently unknown. However, for orthogonal matrices, there are fast algorithms that solve it exactly. One such algorithm uses a walk towards the median paradigm. Starting from any of the input matrices, say, B, the algorithm produces rank-1 steps, which are rank-1 matrices that, added to B, decrease its rank distance to both A and C simultaneously. It can be shown that such steps always exist for orthogonal matrices, and can be found in polynomial time. The algorithm stops when no more improvement can be made, which is equivalent to saying that B lies between A and C in terms of the rank distance (the triangle inequality becomes an equality). There is an O(n(omega+1)) algorithm implementing this idea, where w is the matrix multiplication exponent. Here we propose a novel scheme that works for symmetric orthogonal matrices, and produces a median, also guaranteed to be symmetric, in O(n(omega)) time. There is another O(n(omega)) time algorithm that produces the so-called M-I median, which agrees with the majority in the subspaces where A = B, B = C, or C = A, and is equal to the identity in the orthogonal complement of the sum of these subspaces. However, this algorithm produces a different median, and has only been proved to be correct for genomic matrices. The algorithm we present here is more general. (C) 2020 Elsevier Inc. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available