Dual Transformers With Latent Amplification for Multivariate Time Series Anomaly Detection

Authors
Choi, Yeji; Sohn, Kwanghoon; Kim, Ig-Jae
Issue Date
2025-07
Publisher
Institute of Electrical and Electronics Engineers Inc.
Citation
IEEE Access, v.13, pp.136433 - 136445
Abstract
Anomaly detection in multivariate time series is crucial for applications such as industrial monitoring, cybersecurity, and healthcare. Transformer-based reconstruction methods have recently shown strong performance but often suffer from overgeneralization, in which anomalies are reconstructed too accurately, reducing the separability between normal and abnormal patterns. Prior works have attempted to mitigate this with two-stage frameworks or external memory modules that explicitly store normal patterns and amplify deviations from them. However, such approaches increase model complexity and incur additional computational overhead. In this paper, we propose Dual Transformers with Latent Amplification (DT-LA), a novel framework designed to mitigate overgeneralization within a unified architecture. The core idea of DT-LA is to enhance anomaly separability by jointly leveraging both input- and latent-space reconstructions, rather than merely improving reconstruction fidelity. First, we propose the Modified Reverse Huber (MRH) loss, which amplifies meaningful deviations in the latent space by applying inverse scaling. This allows the model to retain informative discrepancies that would otherwise be suppressed, improving its ability to detect subtle anomalies. Second, we incorporate sparse self-attention with entropy-based regularization to capture essential inter-sensor relationships and suppress redundancy. Third, we refine the anomaly scoring process with a scaled-softmax function that balances relative and absolute deviations to reduce softmax-induced bias. Extensive experiments on four benchmark datasets (SMAP, MSL, PSM, and SMD) show that DT-LA achieves state-of-the-art performance, with F1-scores of 97.02% on SMAP and 98.42% on PSM, highlighting its robustness and practical competitiveness as a single-stage framework.
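To make the latent-amplification idea concrete, here is a minimal sketch of the loss family the abstract describes. The record does not give the exact MRH formulation, so the code shows the classic reverse Huber (berHu) loss the name suggests it modifies, plus a hypothetical inverse-scaling weight; the threshold `c`, the weight `w`, and `eps` are illustrative assumptions, not the authors' parameters:

```python
import torch

def berhu(residual: torch.Tensor, c: float = 0.2) -> torch.Tensor:
    """Classic reverse Huber (berHu): L1 for small residuals, scaled L2 above c."""
    abs_r = residual.abs()
    quad = (abs_r.pow(2) + c ** 2) / (2.0 * c)
    return torch.where(abs_r <= c, abs_r, quad)

def mrh_like(residual: torch.Tensor, c: float = 0.2, eps: float = 1e-3) -> torch.Tensor:
    """Hypothetical inverse-scaled variant: dividing by the residual magnitude
    up-weights small latent deviations so informative-but-subtle discrepancies
    are not suppressed. The authors' actual MRH loss may differ in form."""
    w = 1.0 / (residual.abs().detach() + eps)  # inverse scaling (assumed form)
    return (w * berhu(residual, c)).mean()
```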
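Similarly, the entropy-regularized sparse self-attention can be pictured as standard scaled dot-product attention whose row entropies are penalized: minimizing the penalty pushes each query (sensor) to attend to only a few keys. The function name and `reg_weight` below are illustrative, not taken from the paper:

```python
import torch

def sparse_attention_with_entropy(q, k, v, reg_weight: float = 1e-3):
    """Scaled dot-product attention plus a row-entropy penalty.
    Low row entropy means each query concentrates on few keys, i.e. a
    sparser set of inter-sensor dependencies. A generic sketch, not the
    exact DT-LA formulation."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    attn = scores.softmax(dim=-1)
    # Shannon entropy of each attention row; the small constant avoids log(0).
    entropy = -(attn * (attn + 1e-9).log()).sum(dim=-1).mean()
    return attn @ v, reg_weight * entropy  # add the penalty term to the training loss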
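Finally, the scaled-softmax scoring step targets a known bias: a plain softmax over per-channel errors sums to one regardless of how large the errors actually are, so an all-normal window and an all-anomalous one can receive similar relative weights. One plausible reading of "balancing relative and absolute deviations" is sketched below; the temperature `tau` and the exact combination are assumptions:

```python
import torch

def scaled_softmax_score(errors: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """errors: per-channel reconstruction errors at one time step, shape (C,).
    Softmax captures the *relative* deviation across channels; multiplying the
    weights back by the raw errors re-injects the *absolute* scale that a
    plain softmax discards. Illustrative only."""
    weights = torch.softmax(errors / tau, dim=-1)
    return (weights * errors).sum(dim=-1)
```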
Keywords
Anomaly detection; multivariate time series; sparse self-attention; transformer
URI
https://pubs.kist.re.kr/handle/201004/153163
DOI
10.1109/ACCESS.2025.3594473
Appears in Collections:
KIST Article > Others