Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bhin, Hyeonuk | - |
dc.contributor.author | Choi, Jongsuk | - |
dc.date.accessioned | 2025-08-20T05:04:38Z | - |
dc.date.available | 2025-08-20T05:04:38Z | - |
dc.date.created | 2025-08-20 | - |
dc.date.issued | 2025-07 | - |
dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/152971 | - |
dc.description.abstract | Personality is a fundamental psychological trait that exerts a long-term influence on human behavior patterns and social interactions. Automatic personality recognition (APR) has gained increasing importance across various domains, including Human-Robot Interaction (HRI), personalized services, and psychological assessment. In this study, we propose a multimodal personality recognition model that classifies the Big Five personality traits by extracting features from three heterogeneous sources: audio processed with Wav2Vec2, video represented as skeleton-landmark time series, and text encoded through Bidirectional Encoder Representations from Transformers (BERT) and Doc2Vec embeddings. Each modality is handled by an independent Self-Attention block that highlights salient temporal information, and these representations are then summarized and integrated through a late fusion approach to effectively capture both inter-modal complementarity and cross-modal interactions. Compared to traditional recurrent neural network (RNN)-based multimodal models and unimodal classifiers, the proposed model improves the F1-score by up to 12 percent. It also maintains high prediction accuracy and robustness under limited input conditions. Furthermore, a visualization based on t-distributed Stochastic Neighbor Embedding (t-SNE) demonstrates clear distributional separation across the personality classes, enhancing the interpretability of the model and providing insights into the structural characteristics of its latent representations. To support real-time deployment, a lightweight thread-based processing architecture is implemented, ensuring computational efficiency. By leveraging deep learning-based feature extraction and the Self-Attention mechanism, we present a novel personality recognition framework that balances performance with interpretability. The proposed approach establishes a strong foundation for practical applications in HRI, counseling, education, and other interactive systems that require personalized adaptation. | - |
dc.language | English | - |
dc.publisher | MDPI AG | - |
dc.title | Multimodal Personality Recognition Using Self-Attention-Based Fusion of Audio, Visual, and Text Features | - |
dc.type | Article | - |
dc.identifier.doi | 10.3390/electronics14142837 | - |
dc.description.journalClass | 1 | - |
dc.identifier.bibliographicCitation | Electronics (Basel), v.14, no.14 | - |
dc.citation.title | Electronics (Basel) | - |
dc.citation.volume | 14 | - |
dc.citation.number | 14 | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.identifier.wosid | 001539795400001 | - |
dc.identifier.scopusid | 2-s2.0-105011649703 | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Physics, Applied | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Physics | - |
dc.type.docType | Article | - |
dc.subject.keywordPlus | TRAITS | - |
dc.subject.keywordPlus | PREDICTION | - |
dc.subject.keywordAuthor | automatic personality recognition | - |
dc.subject.keywordAuthor | multimodal fusion | - |
dc.subject.keywordAuthor | attention mechanism | - |
dc.subject.keywordAuthor | Big Five traits classifier modeling | - |
dc.subject.keywordAuthor | human-robot interaction | - |
dc.subject.keywordAuthor | real-time affective computing | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
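The abstract's pipeline (per-modality self-attention over temporal features, summarization, then late fusion by concatenation) can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the feature dimensions, the identity query/key/value projections, and the mean-pooling summarizer are all assumptions for illustration, and the downstream Big Five classifier head is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(seq, d_k):
    # single-head scaled dot-product self-attention; identity Q/K/V
    # projections are an illustrative simplification
    scores = seq @ seq.T / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ seq

def summarize(seq):
    # mean-pool the attended sequence into one vector per modality
    # (a stand-in for the paper's summarization step)
    return seq.mean(axis=0)

rng = np.random.default_rng(0)
audio = rng.normal(size=(50, 16))  # stand-in for Wav2Vec2 frame features
video = rng.normal(size=(30, 16))  # stand-in for skeleton-landmark time series
text = rng.normal(size=(20, 16))   # stand-in for BERT/Doc2Vec embeddings

# late fusion: attend within each modality, pool, then concatenate;
# a classifier over the fused vector would predict the Big Five traits
fused = np.concatenate([summarize(self_attention(m, 16))
                        for m in (audio, video, text)])
print(fused.shape)  # (48,)
```

Because attention and pooling run independently per modality, a missing or truncated stream only shortens the fused vector's source sequences rather than breaking the pipeline, which is consistent with the robustness under limited input conditions that the abstract reports.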