Article

Intention Understanding in Human-Robot Interaction Based on Visual-NLP Semantics

Journal

FRONTIERS IN NEUROROBOTICS
Volume 14, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fnbot.2020.610139

Keywords

human-robot interaction; intention estimation; scene understanding; visual-NLP; semantics

Funding

  1. Shenzhen Science and Technology Innovation Commission [JCYJ20170410172100520, JCYJ20170818104502599]
  2. Shenzhen Institute of Artificial Intelligence and Robotics for Society [2019-INT020]


This research proposes a novel task-based framework that enables robots to understand human intentions using visual semantics information and satisfy human intentions based on natural language instructions, further advancing human-robot interaction.
With the rapid development of robotic and AI technology in recent years, human-robot interaction has advanced greatly and made a practical social impact. Verbal commands are one of the most direct and frequently used means of human-robot interaction. Current technology enables robots to execute pre-defined tasks based on simple, direct, and explicit language instructions, e.g., certain keywords must be used and detected. However, this is not the natural way for humans to communicate. In this paper, we propose a novel task-based framework that enables a robot to comprehend human intentions using visual semantics information, such that the robot can satisfy human intentions based on natural language instructions (three types in total, namely clear, vague, and feeling, are defined and tested). The proposed framework includes a language semantics module to extract the keywords regardless of the explicitness of the command instruction, a visual object recognition module to identify the objects in front of the robot, and a similarity computation algorithm to infer the intention for the given task. The task is then translated into commands for the robot accordingly. Experiments are performed and validated on a humanoid robot with a defined task: to pick the desired item out of multiple objects on a table and hand it over to the desired user among multiple human participants. The results show that our algorithm can handle different types of instructions, even those with unseen sentence structures.
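The abstract does not include the authors' implementation. As a minimal illustrative sketch of the similarity-computation step it describes, the snippet below matches keywords extracted from an instruction against the attribute words of visually recognized objects, using a simple bag-of-words cosine similarity in place of the paper's semantic embeddings. All names, keywords, and object attributes here are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(words_a, words_b):
    """Cosine similarity between two bag-of-words vectors."""
    va, vb = Counter(words_a), Counter(words_b)
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def infer_target(instruction_keywords, detected_objects):
    """Pick the detected object whose attribute words best match
    the keywords extracted from the natural language instruction."""
    scored = [(cosine_similarity(instruction_keywords, attrs), label)
              for label, attrs in detected_objects.items()]
    return max(scored)[1]

# Hypothetical example: keywords from a vague instruction
# such as "hand me something to drink".
keywords = ["drink", "cold"]
objects = {
    "water_bottle": ["drink", "water", "bottle"],
    "apple": ["fruit", "food", "red"],
}
print(infer_target(keywords, objects))  # → water_bottle
```

In the actual framework, the keyword vectors would come from a learned language semantics module and the object attributes from a visual recognition module; only the matching logic is sketched here.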

