Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Hyeongmo | - |
| dc.contributor.author | Kang, Sohyun | - |
| dc.contributor.author | Choi, Yerin | - |
| dc.contributor.author | Ji, Seung Yeon | - |
| dc.contributor.author | Woo, Junhyuk | - |
| dc.contributor.author | Chung, Hyunsuk | - |
| dc.contributor.author | Han, Soyeon Caren | - |
| dc.contributor.author | Han, Kyungreem | - |
| dc.date.accessioned | 2026-03-04T08:00:06Z | - |
| dc.date.available | 2026-03-04T08:00:06Z | - |
| dc.date.created | 2026-01-22 | - |
| dc.date.issued | 2026-01-25 | - |
| dc.identifier.uri | https://pubs.kist.re.kr/handle/201004/154397 | - |
| dc.description.abstract | The term “algorithmic fairness” is used to evaluate whether AI models operate fairly in both comparative contexts (where fairness is understood as formal equality, such as “treat like cases alike”) and non-comparative contexts (where unfairness arises from a model’s inaccuracy, arbitrariness, or inscrutability). Recent advances in multimodal large language models (MLLMs) are breaking new ground in multimodal understanding, reasoning, and generation; however, we argue that inconspicuous distortions arising from complex multimodal interaction dynamics can lead to systematic bias. The purpose of this position paper is twofold: first, to acquaint AI researchers with phenomenological explainable approaches that rely on the physical entities the machine experiences during training/inference, as opposed to traditional cognitivist symbolic accounts or metaphysical approaches; second, to argue that this phenomenological doctrine is practically useful for tackling algorithmic fairness issues in MLLMs. We develop a surrogate physics-based model that describes transformer dynamics (i.e., semantic network structure and self-/cross-attention) to analyze the dynamics of cross-modal bias in MLLMs, which are not fully captured by conventional embedding- or representation-level analyses. We support this position through multi-input diagnostic experiments: 1) perturbation-based analyses of emotion classification using Qwen2.5-Omni and Gemma 3n, and 2) dynamical analysis of Lorenz chaotic time-series prediction through the physical surrogate. Across two architecturally distinct MLLMs, we show that multimodal inputs can reinforce modality dominance rather than mitigate it, as revealed by structured error-attractor patterns under systematic label perturbation, complemented by dynamical analysis. | - |
| dc.publisher | AAAI | - |
| dc.title | Physics-based phenomenological characterization of cross-modal bias in multimodal models | - |
| dc.type | Conference | - |
| dc.description.journalClass | 1 | - |
| dc.identifier.bibliographicCitation | 40th Annual AAAI Conference on Artificial Intelligence, v.1 | - |
| dc.citation.title | 40th Annual AAAI Conference on Artificial Intelligence | - |
| dc.citation.volume | 1 | - |
| dc.citation.conferencePlace | SI | - |
| dc.citation.conferencePlace | Singapore EXPO | - |
| dc.citation.conferenceDate | 2026-01 | - |
| dc.relation.isPartOf | Bias in multimodal AI (to be published) | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.