Article

Fast computation of mutual information in the frequency domain with applications to global multimodal image alignment

Journal

PATTERN RECOGNITION LETTERS
Volume 159, Pages 196-203

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2022.05.022

Keywords

Mutual information; Image alignment; Global optimization; Multimodal; Entropy

Funding

  1. Wallenberg AI, Autonomous Systems and Software Program (WASP) AI-Math initiative
  2. VINNOVA [2017-02447]
  3. Swedish Research Council [2017-04385]

Abstract

Multimodal image alignment is the process of finding spatial correspondences between images formed by different imaging techniques or under different conditions, to facilitate heterogeneous data fusion and correlative analysis. The information-theoretic concept of mutual information (MI) is widely used as a similarity measure to guide multimodal alignment processes, where most works have focused on local maximization of MI, which typically works well only for small displacements. This points to a need for global maximization of MI, which has previously been computationally infeasible due to the high run-time complexity of existing algorithms. We propose an efficient algorithm for computing MI for all discrete displacements (formalized as the cross-mutual information function (CMIF)), which is based on cross-correlation computed in the frequency domain. We show that the algorithm is equivalent to a direct method while superior in terms of run-time. Furthermore, we propose a method for multimodal image alignment for transformation models with few degrees of freedom (e.g., rigid) based on the proposed CMIF-algorithm. We evaluate the efficacy of the proposed method on three distinct benchmark datasets, containing remote sensing images, cytological images, and histological images, and we observe excellent success-rates (in recovering known rigid transformations), overall outperforming alternative methods, including local optimization of MI, as well as several recent deep learning-based approaches. We also evaluate the run-times of a GPU implementation of the proposed algorithm and observe speed-ups from 100 to more than 10,000 times for realistic image sizes compared to a GPU implementation of a direct method. Code is shared as open-source at github.com/MIDA-group/globalign. (C) 2022 The Author(s). Published by Elsevier B.V.
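To illustrate the core idea described in the abstract, the following is a minimal, hypothetical Python/NumPy sketch of computing mutual information for all discrete displacements via frequency-domain cross-correlation of per-level indicator images. It is not the authors' implementation (see github.com/MIDA-group/globalign for that): it assumes equal-sized images, circular (wrap-around) shifts, and a fixed quantization into a few intensity levels, and it omits the partial-overlap handling, rotation sampling, and GPU execution of the published method. The function name cmif_sketch and the parameter levels are illustrative choices.

import numpy as np

def cmif_sketch(A, B, levels=16):
    # Quantize both images into a small number of intensity levels.
    qa = np.minimum((levels * (A - A.min()) / (A.max() - A.min() + 1e-12)).astype(int), levels - 1)
    qb = np.minimum((levels * (B - B.min()) / (B.max() - B.min() + 1e-12)).astype(int), levels - 1)

    n = A.size
    # Marginal distributions; constant across displacements only because of the
    # circular-shift simplification (the paper handles displacement-dependent overlap).
    pa = np.array([(qa == a).mean() for a in range(levels)])
    pb = np.array([(qb == b).mean() for b in range(levels)])

    # Pre-compute FFTs of the per-level indicator images of B.
    FB = [np.fft.fft2((qb == b).astype(float)) for b in range(levels)]

    mi = np.zeros(A.shape)
    for a in range(levels):
        Fa = np.fft.fft2((qa == a).astype(float))
        for b in range(levels):
            # Circular cross-correlation of indicator images: for every displacement d,
            # the number of pixels x with A[x] == a and B[x + d] == b.
            counts = np.real(np.fft.ifft2(np.conj(Fa) * FB[b]))
            p_ab = np.maximum(np.round(counts), 0.0) / n
            contrib = np.zeros_like(mi)
            nz = p_ab > 0
            contrib[nz] = p_ab[nz] * np.log(p_ab[nz] / (pa[a] * pb[b]))
            mi += contrib
    return mi  # mi[dy, dx] = mutual information at displacement (dy, dx)

In this simplified setting, a global translation estimate would follow from np.unravel_index(np.argmax(mi), mi.shape); the alignment method described in the abstract extends this idea to transformation models with few degrees of freedom (e.g., rigid) by evaluating the CMIF over a set of candidate rotations as well.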

