Pedestrian Attribute Recognition Using Hierarchical Transformers
- Authors
- Lohani, Lalit; Thakare, Kamalakar Vijay; Nayak, Kamakshya Prasad; Dogra, Debi Prosad; Choi, Heeseung; Jung, Hyungjoo; Kim, Ig-Jae
- Issue Date
- 2025-12
- Publisher
- SPRINGER INTERNATIONAL PUBLISHING AG
- Citation
- 27th International Conference on Pattern Recognition-ICPR-Annual, v.15316, pp. 78-93
- Abstract
- The goal of pedestrian attribute recognition (PAR) is to detect and classify a wide range of pedestrian attributes, such as gender, carried objects, clothing style, body posture, and age group. It plays a vital role in computer vision applications such as behaviour analysis, public safety monitoring, and video surveillance. However, existing PAR approaches fall short of strong performance owing to several factors. First, the same attribute can take many different appearances, which confuses the models. Second, adverse weather and lighting conditions restrict model generalization capability. To mitigate these challenges, this paper proposes a new evaluation baseline that uses Vision Transformer (ViT) blocks for hierarchical feature modelling. The approach categorizes attributes into different spatial granularity levels and employs diverse patch formations to extract discriminative features. Furthermore, we introduce an enhanced loss function for stable training in the re-formulated granularity scenario, in which a novel attribute-aware granularity factor influences the loss. The proposed baseline has been extensively evaluated on three popular PAR datasets: RAP, PA100K, and PETA.
- ISSN
- 0302-9743
- URI
- https://pubs.kist.re.kr/handle/201004/153902
- DOI
- 10.1007/978-3-031-78444-6_6
- Appears in Collections:
- KIST Conference Paper > 2025