4.7 Article

Automatic Spoken Language Acquisition Based on Observation and Dialogue

Journal

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING
Volume 16, Issue 6, Pages 1480-1492

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSTSP.2022.3189279

Keywords

Vocabulary; Speech recognition; Grounding; Reinforcement learning; Uniform resource locators; Signal processing algorithms; Routing; Autonomous agent; reinforcement learning; self-supervised learning; spoken language acquisition; unsupervised learning

Funding

  1. Toray Science Foundation
  2. JSPS KAKENHI [JP22K12069]

Researchers propose spoken language acquisition agents that simulate the process of human language learning. By integrating multiple learning types, the agents successfully acquire spoken language from scratch and improve learning efficiency.
Human babies are born without knowledge of any specific language. They acquire language directly from observation and dialogue without being limited by the availability of labeled data. We propose spoken language acquisition agents that simulate this process. Such an ability requires multiple types of learning, including 1) word discovery, 2) symbol grounding, 3) message generation, and 4) pronunciation generation. Several studies have targeted one learning type or a combination of them, to elucidate human intelligence and to equip spoken dialogue systems with human-like flexible language learning ability. However, the language ability of those systems lacked some of these components. Our agents are the first to integrate them all. Our key concept is an architecture that integrates unsupervised, self-supervised, and reinforcement learning to exploit clues that exist naturally in raw sensory signals and to drive learning by the agent's intrinsic motivation. Experimental results show that the agents successfully acquire spoken language from scratch by interacting with an environment in which they act by speaking. Our proposed focusing mechanism significantly improves learning efficiency. We also demonstrate that our agents can learn a neural vocoder and the concept of logical negation as part of language acquisition.
