Article

Dense-view synthesis for three-dimensional light-field display based on unsupervised learning

Journal

OPTICS EXPRESS
Volume 27, Issue 17, Pages 24624-24641

Publisher

Optical Society of America
DOI: 10.1364/OE.27.024624

Funding

  1. 973 Program [2017YFB1002900]
  2. Fundamental Research Funds for the Central Universities [2018PTB-00-01]
  3. Fund of State Key Laboratory of Information Photonics and Optical Communications [IPOC2017ZZ02]

Three-dimensional (3D) light-field display, a promising future display method, has attracted considerable attention. However, certain issues remain to be addressed, especially the capture of dense views of real 3D scenes. Using sparse cameras together with a view-synthesis algorithm has become a practical approach. Supervised convolutional neural networks (CNNs) have been used to synthesize virtual views, but the large number of target views required for training is sometimes difficult to obtain, and the training positions are relatively fixed. Novel views can also be synthesized by the unsupervised network MPVN, but that method imposes strict requirements on capturing multiple uniform horizontal viewpoints, which is impractical. Here, a dense-view synthesis method based on unsupervised learning is presented, which can synthesize arbitrary virtual views from multiple freely posed views captured in a real 3D scene. The posed views are reprojected to the target position and input into the neural network. The network outputs a color tower and a selection tower that indicate the scene distribution along the depth direction; a single image is produced by the weighted summation of the two towers. The network is trained end-to-end without supervision by minimizing the reconstruction errors of the posed views. A virtual view can be predicted with high quality by reprojecting the posed views to the desired position, and a sequence of dense virtual views can be generated for 3D light-field display by repeated prediction. Experimental results demonstrate the validity of the proposed network: the PSNR of synthesized views is around 30 dB and the SSIM is over 0.90. Since the cameras can be placed at free poses, there are no strict physical constraints, and the proposed method can be used flexibly for real-scene capture. We believe this approach will contribute to wide application of 3D light-field display in the future. (C) 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement.
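The weighted summation of the color tower and the selection tower described above can be sketched as follows. The abstract does not give the exact tensor layout or normalization used by the authors, so the shapes, function names, and the softmax normalization of the selection tower below are assumptions for illustration; a PSNR helper matching the reported evaluation metric is also included.

```python
import numpy as np

def synthesize_view(color_tower, selection_tower):
    """Combine a color tower and a selection tower into one image.

    Assumed (hypothetical) tensor layout:
      color_tower:     (D, H, W, 3) candidate colors, one per depth plane
      selection_tower: (D, H, W)    per-plane scores from the network
    """
    # Normalize the selection tower over the depth dimension with a
    # numerically stable softmax, so each pixel's plane weights sum to 1.
    logits = selection_tower - selection_tower.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted sum of candidate colors along depth yields the single image.
    return (weights[..., None] * color_tower).sum(axis=0)

def psnr(pred, ref, peak=1.0):
    """Peak signal-to-noise ratio (dB) between images with values in [0, peak]."""
    mse = np.mean((pred - ref) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because the softmax weights form a convex combination at every pixel, the synthesized color always lies within the range of the candidate colors along the depth direction, which keeps the output well-behaved during unsupervised training.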
