Audio-Visual Integration for Human-Robot Interaction in Multi-Person Scenarios

Authors
Quang Nguyen; Choi, JongSuk; Yun, Sang-Seok
Issue Date
2014-09
Publisher
IEEE
Citation
19th IEEE International Conference on Emerging Technology and Factory Automation (ETFA)
Abstract
This paper presents the integration of audio-visual perception components for human-robot interaction in the Robot Operating System (ROS). Vision-based nodes consist of skeleton tracking and gesture recognition using a depth camera, and face recognition using an RGB camera. Auditory perception is based on sound source localization using a microphone array. We present an integration framework for these nodes using a top-down hierarchical messaging protocol. At the top of the integration, a message carries information about the number of persons and their corresponding states (who, what, where), which is updated from the low-level perception nodes. This top-level message is passed to a planning node, which selects the robot's reaction according to its perception of the surrounding people. The paper demonstrates human-robot interaction in a multi-person scenario in which the robot directs its attention to the person who is speaking or waving a hand. Moreover, this modular architecture enables the modules to be reused in other applications. To validate the approach, two sound source localization algorithms are evaluated in real time, with ground-truth localization provided by the face recognition module.
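The hierarchical messaging scheme described in the abstract can be pictured as a top-level message that aggregates per-person states (who, what, where) from the low-level perception nodes, which a planning step then uses to choose an attention target. The sketch below illustrates that idea in plain Python; all names, fields, and activity labels (PersonState, select_attention_target, "speaking", "waving") are hypothetical illustrations, not identifiers from the paper or from any ROS package.

# Minimal sketch, assuming a per-person state with identity, activity, and
# direction, and a simple priority rule (speaking first, then waving).
# All names and fields here are hypothetical, not taken from the paper.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PersonState:
    person_id: int        # "who": identity, e.g. from face recognition
    activity: str         # "what": e.g. "speaking", "waving", "idle"
    azimuth_deg: float    # "where": direction, e.g. from skeleton tracking or SSL

def select_attention_target(people: List[PersonState]) -> Optional[PersonState]:
    """Prefer a speaking person, then a waving person, otherwise no target."""
    for wanted in ("speaking", "waving"):
        for person in people:
            if person.activity == wanted:
                return person
    return None

if __name__ == "__main__":
    people = [
        PersonState(person_id=1, activity="idle", azimuth_deg=-30.0),
        PersonState(person_id=2, activity="waving", azimuth_deg=15.0),
    ]
    target = select_attention_target(people)
    if target is not None:
        print(f"Attend to person {target.person_id} at {target.azimuth_deg} deg")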
URI
https://pubs.kist.re.kr/handle/201004/115327
Appears in Collections:
KIST Conference Paper > 2014