Reconstructing controllable faces from brain activity with hierarchical multiview representations

NEURAL NETWORKS (2023)

Journal

NEURAL NETWORKS

Volume 166, Issue -, Pages 487-500

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.neunet.2023.07.016

Keywords

Neural decoding; fMRI; Face reconstruction; Hierarchical multiview representations; Feature disentanglement; StyleGAN

Categories

Computer Science, Artificial Intelligence; Neurosciences

AI summary

The study proposes a novel neural decoding framework called VSPnet, which uses hierarchical encoding and decoding networks with disentangled latent representations to recover visual stimuli in finer detail. Experimental results on public datasets show that the proposed method achieves higher reconstruction accuracy than existing approaches and greatly improves the identifiability of different reconstructed faces.

Abstract

Reconstructing visual experience from brain responses measured by functional magnetic resonance imaging (fMRI) is a challenging yet important research topic in brain decoding, and visually similar stimuli such as faces have proved especially difficult to decode. Although facial attributes are known to be key to face recognition, most existing methods pay little attention to decoding them precisely in perceived face reconstruction, which often leads to indistinguishable reconstructed faces. To solve this problem, we propose a novel neural decoding framework called VSPnet (voxel2style2pixel), which establishes hierarchical encoding and decoding networks with disentangled latent representations as the medium, so as to recover visual stimuli in finer detail. We also design a hierarchical visual encoder (HVE) to pre-extract features containing both high-level semantic knowledge and low-level visual details from the stimuli. The proposed VSPnet consists of two networks: a multi-branch cognitive encoder and a style-based image generator. The encoder network is built from multiple linear regression branches that map brain signals to the latent space provided by the pre-extracted visual features, yielding representations whose hierarchical information is consistent with the corresponding stimuli. The generator network is inspired by StyleGAN and is designed to untangle the complexity of fMRI representations and generate images. The HVE network is composed of a standard feature pyramid over a ResNet backbone. Extensive experiments on the latest public datasets demonstrate that the reconstruction accuracy of our method exceeds that of state-of-the-art approaches and that the identifiability of different reconstructed faces is greatly improved. In particular, we achieve feature editing for several facial attributes in the fMRI domain based on the multiview (i.e., visual stimuli and evoked fMRI) latent representations. © 2023 Elsevier Ltd. All rights reserved.
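
The abstract describes an encoder built from multiple linear regression branches that map fMRI signals to hierarchical latent features. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes closed-form ridge regression for each branch (the abstract says only "linear regression"), and all function names, array sizes, and the synthetic data are hypothetical, chosen for illustration.

```python
import numpy as np


def fit_ridge_branch(voxels, latents, alpha=1.0):
    """Fit one linear branch with closed-form ridge regression.

    Solves (X^T X + alpha * I) W = X^T Y for W.
    The ridge penalty is an assumption of this sketch.
    """
    n_voxels = voxels.shape[1]
    gram = voxels.T @ voxels + alpha * np.eye(n_voxels)
    return np.linalg.solve(gram, voxels.T @ latents)


def fit_multibranch_encoder(voxels, latent_levels, alpha=1.0):
    """Fit one independent branch per level of the latent hierarchy."""
    return [fit_ridge_branch(voxels, z, alpha) for z in latent_levels]


def predict_latents(voxels, weights):
    """Map voxel responses to a predicted latent code at every level."""
    return [voxels @ w for w in weights]


# Synthetic stand-in data (hypothetical sizes, not taken from the paper).
rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 50
latent_dims = [8, 16, 32]  # e.g. coarse, medium, and fine feature levels

voxels = rng.standard_normal((n_trials, n_voxels))
# Each latent level is a noisy linear function of the voxel responses.
latent_levels = [
    voxels @ rng.standard_normal((n_voxels, d))
    + 0.1 * rng.standard_normal((n_trials, d))
    for d in latent_dims
]

weights = fit_multibranch_encoder(voxels, latent_levels, alpha=1.0)
predicted = predict_latents(voxels, weights)

for level, (pred, target) in enumerate(zip(predicted, latent_levels)):
    r = np.corrcoef(pred.ravel(), target.ravel())[0, 1]
    print(f"level {level}: predicted shape {pred.shape}, correlation {r:.3f}")
```

In a full pipeline of the kind the abstract outlines, the target latents would come from a pretrained visual feature extractor and the predicted codes would be passed to an image generator; both of those stages are outside the scope of this sketch.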
