Dual Transformers With Latent Amplification for Multivariate Time Series Anomaly Detection

Authors
Choi, Yeji; Sohn, Kwanghoon; Kim, Ig-Jae
Issue Date
2025-07
Publisher
Institute of Electrical and Electronics Engineers Inc.
Citation
IEEE Access, v.13, pp.136433 - 136445
Abstract
Anomaly detection in multivariate time series is crucial for applications such as industrial monitoring, cybersecurity, and healthcare. Transformer-based reconstruction methods have recently shown strong performance but often suffer from overgeneralization, in which anomalies are reconstructed too accurately, reducing the separability between normal and abnormal patterns. Prior works have attempted to mitigate this with two-stage frameworks or external memory modules that explicitly store normal patterns and amplify deviations from them. However, such approaches increase model complexity and incur additional computational overhead. In this paper, we propose Dual Transformers with Latent Amplification (DT-LA), a novel framework designed to mitigate overgeneralization within a unified architecture. The core idea of DT-LA is to enhance anomaly separability by jointly leveraging both input- and latent-space reconstructions, rather than merely improving reconstruction fidelity. First, we propose the Modified Reverse Huber (MRH) loss, which amplifies meaningful deviations in the latent space by applying inverse scaling. This allows the model to retain informative discrepancies that would otherwise be suppressed, improving its ability to detect subtle anomalies. Second, we incorporate sparse self-attention with entropy-based regularization to capture essential inter-sensor relationships and suppress redundancy. Third, we refine the anomaly scoring process with a scaled-softmax function that balances relative and absolute deviations to reduce softmax-induced bias. Extensive experiments on four benchmark datasets (SMAP, MSL, PSM, and SMD) show that DT-LA achieves state-of-the-art performance, with F1-scores of 97.02% on SMAP and 98.42% on PSM, highlighting its robustness and practical competitiveness as a single-stage framework.
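To make the latent-amplification idea concrete, here is a minimal sketch of the loss family the abstract describes. The record does not give the exact MRH formulation, so the code shows the classic reverse Huber (berHu) loss the name suggests it modifies, plus a hypothetical inverse-scaling weight; the threshold `c`, the weight `w`, and `eps` are illustrative assumptions, not the authors' parameters:

```python
import torch

def berhu(residual: torch.Tensor, c: float = 0.2) -> torch.Tensor:
    """Classic reverse Huber (berHu): L1 for small residuals, scaled L2 above c."""
    abs_r = residual.abs()
    quad = (abs_r.pow(2) + c ** 2) / (2.0 * c)
    return torch.where(abs_r <= c, abs_r, quad)

def mrh_like(residual: torch.Tensor, c: float = 0.2, eps: float = 1e-3) -> torch.Tensor:
    """Hypothetical inverse-scaled variant: dividing by the residual magnitude
    up-weights small latent deviations so informative-but-subtle discrepancies
    are not suppressed. The authors' actual MRH loss may differ in form."""
    w = 1.0 / (residual.abs().detach() + eps)  # inverse scaling (assumed form)
    return (w * berhu(residual, c)).mean()
```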
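Similarly, the entropy-regularized sparse self-attention can be pictured as standard scaled dot-product attention whose row entropies are penalized: minimizing the penalty pushes each query (sensor) to attend to only a few keys. The function name and `reg_weight` below are illustrative, not taken from the paper:

```python
import torch

def sparse_attention_with_entropy(q, k, v, reg_weight: float = 1e-3):
    """Scaled dot-product attention plus a row-entropy penalty.
    Low row entropy means each query concentrates on few keys, i.e. a
    sparser set of inter-sensor dependencies. A generic sketch, not the
    exact DT-LA formulation."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    attn = scores.softmax(dim=-1)
    # Shannon entropy of each attention row; the small constant avoids log(0).
    entropy = -(attn * (attn + 1e-9).log()).sum(dim=-1).mean()
    return attn @ v, reg_weight * entropy  # add the penalty term to the training loss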
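Finally, the scaled-softmax scoring step targets a known bias: a plain softmax over per-channel errors sums to one regardless of how large the errors actually are, so an all-normal window and an all-anomalous one can receive similar relative weights. One plausible reading of "balancing relative and absolute deviations" is sketched below; the temperature `tau` and the exact combination are assumptions:

```python
import torch

def scaled_softmax_score(errors: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """errors: per-channel reconstruction errors at one time step, shape (C,).
    Softmax captures the *relative* deviation across channels; multiplying the
    weights back by the raw errors re-injects the *absolute* scale that a
    plain softmax discards. Illustrative only."""
    weights = torch.softmax(errors / tau, dim=-1)
    return (weights * errors).sum(dim=-1)
```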
Keywords
Anomaly detection; multivariate time series; sparse self-attention; transformer
URI
https://pubs.kist.re.kr/handle/201004/153163
DOI
10.1109/ACCESS.2025.3594473
Appears in Collections:
KIST Article > Others