☆ 4.8 Article

Synthesizing Supervision for Learning Deep Saliency Network without Human Annotation

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2020)

Journal

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

Volume 42, Issue 7, Pages 1755-1769

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TPAMI.2019.2900649

Keywords

Object detection; Detectors; Training; Knowledge engineering; Task analysis; Semantics; Feature extraction; Salient object detection; supervision synthesis; annotation-free; weakly supervised semantic segmentation

Funding

National Key RAMP
D Program of China [2017YFB0502904]
National Science Foundation of China [61876140]
China Postdoctoral Support Scheme for Innovative Talents [BX20180236]
Australian Research Council Future Fellowship [FT180100116]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

Recently, the research field of salient object detection is undergoing a rapid and remarkable development along with the wide usage of deep neural networks. Being trained with a large number of images annotated with strong pixel-level ground-truth masks, the deep salient object detectors have achieved the state-of-the-art performance. However, it is expensive and time-consuming to provide the pixel-level ground-truth masks for each training image. To address this problem, this paper proposes one of the earliest frameworks to learn deep salient object detectors without requiring any human annotation. The supervisory signals used in our learning framework are generated through a novel supervision synthesis scheme, in which the key insights are knowledge source transition and supervision by fusion. Specifically, in the proposed learning framework, both the external knowledge source and the internal knowledge source are explored dynamically to provide informative cues for synthesizing supervision required in our approach, while a two-stream fusion mechanism is also established to implement the supervision synthesis process. Comprehensive experiments on four benchmark datasets demonstrate that the deep salient object detector trained by our newly proposed learning framework often works well without requiring any human annotated masks, which even approaches to its upper-bound obtained under the fully supervised learning fashion (within only 3 percent performance gap). Besides, we also apply the salient object detector learnt with our annotation-free learning framework to assist the weakly supervised semantic segmentation task, which demonstrates that our approach can also alleviate the heavy supplementary supervision required in the existing weakly supervised semantic segmentation framework.

Synthesizing Supervision for Learning Deep Saliency Network without Human Annotation

Journal

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

Publisher

IEEE COMPUTER SOC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Synthesizing Supervision for Learning Deep Saliency Network without Human Annotation

Journal

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

Publisher

IEEE COMPUTER SOC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper