Monocular 3D object detection for an indoor robot environment

Authors
Kim, Jiwon; Lee, GiJae; Kim, Jun-Sik; Kim, Hyunwoo J.; Kim, KangGeon
Issue Date
2020-08
Publisher
IEEE
Citation
29th IEEE International Conference on Robot and Human Interactive Communication (IEEE RO-MAN), pp. 438-445
Abstract
For a service robot to assist humans, it must interact with objects of varying sizes and shapes in an indoor environment. 3D object detection is a prerequisite for this goal, since it gives the robot the ability to perceive visual information. Most existing methods are anchor-based and select, from multiple candidates, the bounding box closest to the ground truth. However, computing Intersection over Union (IoU) and Non-Maximum Suppression (NMS) for each anchor box is complex. Therefore, we propose keypoint-based monocular 3D object detection, in which only each object's center location is needed to reproduce the predicted 3D bounding box, without extra computation over anchor boxes. Our 3D object detection also works well when images are rotated in accordance with the robot's head movement. To train our network properly, the object center is based on the projected 3D location rather than a 2D one, to take advantage of the exact center position of the object. Furthermore, we apply data augmentation using a perspective transformation, which makes it easy to add a small random perturbation to the camera rotation angle. We use the SUN RGB-D dataset, which contains images of indoor scenes taken with camera rotations, for the training and test sets. In particular, our approach reduces the errors in object center location estimated from a single image by 15.4% and 24.2%, respectively, compared to the method without data augmentation.
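The two geometric ideas in the abstract, using the projected 3D center as the keypoint and augmenting with a perspective transformation that mimics a small camera rotation, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an ideal pinhole camera with a known intrinsic matrix K and relies on the standard result that a pure camera rotation R maps image points through the homography K R K^-1. All function names and numeric values below are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    """Multiply a 3x3 matrix by a length-3 vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def project_center(K, center_3d):
    """Pinhole projection of a 3D point (camera coordinates, z > 0) to pixels."""
    u, v, w = matvec(K, center_3d)
    return (u / w, v / w)


def pitch_rotation(angle_rad):
    """Rotation about the camera x-axis by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def intrinsics_inverse(K):
    """Closed-form inverse of K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]


def rotation_homography(K, R):
    """Homography H = K R K^-1 induced on the image by a pure camera rotation R."""
    return matmul(matmul(K, R), intrinsics_inverse(K))


def warp_point(H, pixel):
    """Apply homography H to a pixel and dehomogenize."""
    x, y, w = matvec(H, [pixel[0], pixel[1], 1.0])
    return (x / w, y / w)


if __name__ == "__main__":
    # Hypothetical intrinsics and object center, for illustration only.
    K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
    center_3d = [0.2, -0.1, 2.0]

    keypoint = project_center(K, center_3d)

    # Small random perturbation of the camera rotation angle (assumed range).
    angle = random.uniform(-0.05, 0.05)
    H = rotation_homography(K, pitch_rotation(angle))

    # The keypoint label moves with the same homography that warps the image.
    print(keypoint, warp_point(H, keypoint))
```

In this sketch the keypoint label is warped by the same homography that would be applied to the image, so the augmented label stays consistent with projecting the rotated 3D center directly.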
ISSN
1944-9445
URI
https://pubs.kist.re.kr/handle/201004/113594
Appears in Collections:
KIST Conference Paper > 2020

