DSpace at KIST: Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, Jungho	-
dc.contributor.author	Lee, Minhyeok	-
dc.contributor.author	Cho, Suhwan	-
dc.contributor.author	Woo, Sungmin	-
dc.contributor.author	Jang, Sungjun	-
dc.contributor.author	Lee, Sangyoun	-
dc.date.accessioned	2024-04-11T05:00:25Z	-
dc.date.available	2024-04-11T05:00:25Z	-
dc.date.created	2024-04-11	-
dc.date.issued	2023-10	-
dc.identifier.issn	1550-5499	-
dc.identifier.uri	https://pubs.kist.re.kr/handle/201004/149640	-
dc.description.abstract	Skeleton-based action recognition has attracted considerable attention due to its compact representation of the human body's skeletal structure. Many recent methods have achieved remarkable performance using graph convolutional networks (GCNs) and convolutional neural networks (CNNs), which extract spatial and temporal features, respectively. Although spatial and temporal dependencies in the human skeleton have been explored separately, spatio-temporal dependency is rarely considered. In this paper, we propose the Spatio-Temporal Curve Network (STC-Net) to effectively leverage the spatio-temporal dependency of the human skeleton. Our proposed network consists of two novel elements: 1) The Spatio-Temporal Curve (STC) module; and 2) Dilated Kernels for Graph Convolution (DK-GC). The STC module dynamically adjusts the receptive field by identifying meaningful node connections between every adjacent frame and generating spatio-temporal curves based on the identified node connections, providing an adaptive spatio-temporal coverage. In addition, we propose DK-GC to consider long-range dependencies, which results in a large receptive field without any additional parameters by applying an extended kernel to the given adjacency matrices of the graph. Our STC-Net combines these two modules and achieves state-of-the-art performance on four skeleton-based action recognition benchmarks. Code is available at https://github.com/Jho-Yonsei/STC-Net.	-
dc.language	English	-
dc.publisher	IEEE COMPUTER SOC	-
dc.title	Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition	-
dc.type	Conference	-
dc.identifier.doi	10.1109/ICCV51070.2023.00941	-
dc.description.journalClass	1	-
dc.identifier.bibliographicCitation	IEEE/CVF International Conference on Computer Vision (ICCV), pp.10221 - 10230	-
dc.citation.title	IEEE/CVF International Conference on Computer Vision (ICCV)	-
dc.citation.startPage	10221	-
dc.citation.endPage	10230	-
dc.citation.conferencePlace	US	-
dc.citation.conferencePlace	Paris, FRANCE	-
dc.citation.conferenceDate	2023-10-02	-
dc.relation.isPartOf	2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023)	-
dc.identifier.wosid	001169499002062	-
dc.identifier.scopusid	2-s2.0-85183678707	-

Appears in Collections: KIST Conference Paper > 2023
