Monocular 3D object detection for an indoor robot environment

Keywords
monocular 3D object detection; keypoint-based object detection; data augmentation
2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
For a service robot to assist humans, it must interact with objects of varying sizes and shapes in an indoor environment. 3D object detection is a prerequisite for this goal, since it gives the robot the ability to perceive visual information. Most existing methods are anchor-based: they predict the bounding box closest to the ground truth from among multiple candidates, but computing Intersection over Union (IoU) and Non-Maximum Suppression (NMS) for every anchor box is costly. We therefore propose keypoint-based monocular 3D object detection, in which only each object's center location is needed to reconstruct the predicted 3D bounding box, with no extra computation over anchor boxes. Our 3D object detection also works well even when images are rotated by the robot's head movement. To train the network properly, the object center is defined as the projected 3D location rather than the 2D box center, which captures the exact center position of the object. Furthermore, we apply data augmentation using a perspective transformation, which adds a small random perturbation to the camera rotation angle. We train and evaluate on the SUN RGB-D dataset, whose indoor-scene images include camera rotations. In particular, our approach reduces the object-center location errors based on a single image by 15.4% and 24.2%, respectively, compared to the method without data augmentation.
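The perspective-transform augmentation described above can be sketched as warping the image (and the projected 3D-center keypoints) with the homography H = K R K⁻¹ induced by a small, random pure rotation R of the camera. The following is a minimal illustrative sketch, not the paper's implementation; the intrinsics `K`, the ±10° perturbation range, and all function names are assumptions for the example.

```python
import numpy as np

def rotation_homography(K, angle_deg, axis="z"):
    """Homography H = K @ R @ K^-1 relating pixels of the original image
    to pixels seen by the same camera after a pure rotation of
    `angle_deg` about the given camera axis (no translation)."""
    a = np.deg2rad(angle_deg)
    c, s = np.cos(a), np.sin(a)
    if axis == "z":    # roll: in-plane rotation, e.g. robot head tilt
        R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    elif axis == "x":  # pitch
        R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    else:              # yaw
        R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return K @ R @ np.linalg.inv(K)

def warp_points(H, pts):
    """Apply homography H to an (N, 2) array of pixel coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous
    out = (H @ pts_h.T).T
    return out[:, :2] / out[:, 2:3]                   # back to pixels

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# Small random roll perturbation (range is an assumption, e.g. +/- 10 deg).
angle = np.random.uniform(-10.0, 10.0)
H = rotation_homography(K, angle, axis="z")

# The same H warps both the image and the projected 3D-center keypoints,
# so the ground-truth centers stay consistent with the augmented image.
centers = np.array([[300.0, 200.0], [400.0, 260.0]])
warped_centers = warp_points(H, centers)
```

Because the warp is a pure camera rotation, the principal point is a fixed point of the roll homography, and applying the identical transform to the keypoint labels keeps the supervision aligned with the augmented image.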
Appears in Collections:
KIST Publication > Conference Paper

