Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bhin, Hyeonuk | - |
dc.contributor.author | Choi, Jongsuk | - |
dc.date.accessioned | 2025-08-20T05:04:38Z | - |
dc.date.available | 2025-08-20T05:04:38Z | - |
dc.date.created | 2025-08-20 | - |
dc.date.issued | 2025-07 | - |
dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/152971 | - |
dc.description.abstract | Personality is a fundamental psychological trait that exerts a long-term influence on human behavior patterns and social interactions. Automatic personality recognition (APR) has gained increasing importance across various domains, including Human-Robot Interaction (HRI), personalized services, and psychological assessment. In this study, we propose a multimodal personality recognition model that classifies the Big Five personality traits by extracting features from three heterogeneous sources: audio processed with Wav2Vec2, video represented as skeleton-landmark time series, and text encoded through Bidirectional Encoder Representations from Transformers (BERT) and Doc2Vec embeddings. Each modality is handled by an independent Self-Attention block that highlights salient temporal information, and these representations are then summarized and integrated through a late fusion approach to effectively capture both inter-modal complementarity and cross-modal interactions. Compared to traditional recurrent neural network (RNN)-based multimodal models and unimodal classifiers, the proposed model improves the F1-score by up to 12 percent. It also maintains high prediction accuracy and robustness under limited input conditions. Furthermore, a visualization based on t-distributed Stochastic Neighbor Embedding (t-SNE) demonstrates clear distributional separation across the personality classes, enhancing the interpretability of the model and providing insights into the structural characteristics of its latent representations. To support real-time deployment, a lightweight thread-based processing architecture is implemented, ensuring computational efficiency. By leveraging deep learning-based feature extraction and the Self-Attention mechanism, we present a novel personality recognition framework that balances performance with interpretability. The proposed approach establishes a strong foundation for practical applications in HRI, counseling, education, and other interactive systems that require personalized adaptation. | - |
dc.language | English | - |
dc.publisher | MDPI AG | - |
dc.title | Multimodal Personality Recognition Using Self-Attention-Based Fusion of Audio, Visual, and Text Features | - |
dc.type | Article | - |
dc.identifier.doi | 10.3390/electronics14142837 | - |
dc.description.journalClass | 1 | - |
dc.identifier.bibliographicCitation | Electronics (Basel), v.14, no.14 | - |
dc.citation.title | Electronics (Basel) | - |
dc.citation.volume | 14 | - |
dc.citation.number | 14 | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.identifier.wosid | 001539795400001 | - |
dc.identifier.scopusid | 2-s2.0-105011649703 | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Physics, Applied | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Physics | - |
dc.type.docType | Article | - |
dc.subject.keywordPlus | TRAITS | - |
dc.subject.keywordPlus | PREDICTION | - |
dc.subject.keywordAuthor | automatic personality recognition | - |
dc.subject.keywordAuthor | multimodal fusion | - |
dc.subject.keywordAuthor | attention mechanism | - |
dc.subject.keywordAuthor | Big Five traits classifier modeling | - |
dc.subject.keywordAuthor | human-robot interaction | - |
dc.subject.keywordAuthor | real-time affective computing | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
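The abstract's pipeline (per-modality self-attention over temporal features, summarization, then late fusion by concatenation) can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the feature dimensions, the identity query/key/value projections, and the mean-pooling summarizer are all assumptions for illustration, and the downstream Big Five classifier head is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(seq, d_k):
    # single-head scaled dot-product self-attention; identity Q/K/V
    # projections are an illustrative simplification
    scores = seq @ seq.T / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ seq

def summarize(seq):
    # mean-pool the attended sequence into one vector per modality
    # (a stand-in for the paper's summarization step)
    return seq.mean(axis=0)

rng = np.random.default_rng(0)
audio = rng.normal(size=(50, 16))  # stand-in for Wav2Vec2 frame features
video = rng.normal(size=(30, 16))  # stand-in for skeleton-landmark time series
text = rng.normal(size=(20, 16))   # stand-in for BERT/Doc2Vec embeddings

# late fusion: attend within each modality, pool, then concatenate;
# a classifier over the fused vector would predict the Big Five traits
fused = np.concatenate([summarize(self_attention(m, 16))
                        for m in (audio, video, text)])
print(fused.shape)  # (48,)
```

Because attention and pooling run independently per modality, a missing or truncated stream only shortens the fused vector's source sequences rather than breaking the pipeline, which is consistent with the robustness under limited input conditions that the abstract reports.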