Real time instruction classification system

Authors
이상훈권동진
Issue Date
2024-08
Publisher
한국인터넷방송통신학회
Citation
The International Journal of Internet, Broadcasting and Communication, v.16, no.3, pp.212 - 220
Abstract
A recently the advancement of society, AI technology has made significant strides, especially in the fields of computer vision and voice recognition. This study introduces a system that leverages these technologies to recognize users through a camera and relay commands within a vehicle based on voice commands. The system uses the YOLO (You Only Look Once) machine learning algorithm, widely used for object and entity recognition, to identify specific users. For voice command recognition, a machine learning model based on spectrogram voice analysis is employed to identify specific commands. This design aims to enhance security and convenience by preventing unauthorized access to vehicles and IoT devices by anyone other than registered users. We converts camera input data into YOLO system inputs to determine if it is a person, Additionally, it collects voice data through a microphone embedded in the device or computer, converting it into time-domain spectrogram data to be used as input for the voice recognition machine learning system. The input camera image data and voice data undergo inference tasks through pre-trained models, enabling the recognition of simple commands within a limited space based on the inference results. This study demonstrates the feasibility of constructing a device management system within a confined space that enhances security and user convenience through a simple real-time system model. Finally our work aims to provide practical solutions in various application fields, such as smart homes and autonomous vehicles.
Keywords
Computer Vision; Speech Recognition; Machine Learning; Detection
ISSN
2288-4920
URI
https://pubs.kist.re.kr/handle/201004/150727
DOI
10.7236/IJIBC.2024.16.3.212
Appears in Collections:
KIST Article > 2024
Files in This Item:
There are no files associated with this item.
Export
RIS (EndNote)
XLS (Excel)
XML

qrcode

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

BROWSE