Spike-based Q-learning in a non-von Neumann architecture
- Authors
- Shin, Donghyuk; Jo, Hyeongcheol; Jang, Hyeseung; Jeong, Yoo Ho; Jeong, Yeonjoo; Kwak, Joon Young; Park, Jongkil; Lee, Suyoun; Kim, Inho; Park, Jong-Keuk; Park, Seongsik; Jang, Hyun Jae; Lee, Hyung-Min; Kim, Jaewook
- Issue Date
- 2026-02
- Publisher
- Frontiers Media S.A.
- Citation
- Frontiers in Neuroscience, v.20
- Abstract
- Non-von Neumann architectures overcome the memory-compute separation of von Neumann systems by distributing computation and memory locally, thereby reducing data-transfer bottlenecks and power consumption. These features are particularly advantageous for reinforcement learning (RL) workloads that rely on frequent value-function updates across large state-action spaces. When combined with event-driven spiking neural networks (SNNs), non-von Neumann architectures can further improve computational efficiency by exploiting the sparsity of spike-based processing. In this study, we propose a hardware-feasible, SNN-based non-von Neumann architecture that performs Q-learning, one of the most widely known RL algorithms. The proposed architecture maps states and actions to individual neurons using one-hot encoding and stores each state-action pair's Q-value locally in the corresponding synapse. Each synapse must update its local Q-value using the maximum Q-value of the next state, which is stored in other synapses. To make this possible, a group of neurons connected through lateral inhibition produces the maximum Q-value, which is then transmitted globally to all synapses. A delay circuit aligns the next-state and current-state values in time so that updates are temporally consistent. Each synapse locally generates a learning selection signal and combines it with the globally transmitted signals so that only the target synapse is updated. The proposed architecture was validated through simulations on the cart-pole benchmark, showing stable learning under low bit precision and accuracy comparable to software-based Q-learning when the bit precision is sufficient.
- Keywords
- IMPLEMENTATION; non-von Neumann architecture; neuromorphic architecture; SNN; reinforcement learning; Q-learning; cart-pole
- ISSN
- 1662-4548
- URI
- https://pubs.kist.re.kr/handle/201004/154526
- DOI
- 10.3389/fnins.2026.1738140
- Appears in Collections:
- KIST Article > 2026
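The update scheme summarized in the abstract can be illustrated with a minimal software sketch. This is not the authors' implementation: it is plain tabular Q-learning in which the table entries stand in for synapses, `winner_take_all` stands in for the lateral-inhibition neuron group, and `quantize` mimics low-bit synaptic storage. The toy chain environment (`step_chain`), the table sizes, the hyperparameters, and all function names are assumptions made for illustration. The paper itself uses the cart-pole benchmark.

```python
import random

# Toy sizes and hyperparameters: illustrative assumptions, not values from the paper.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def quantize(value, bits=8, q_max=1.0):
    """Clip to [-q_max, q_max] and round to a signed fixed-point grid."""
    levels = 2 ** (bits - 1) - 1
    clipped = max(-q_max, min(q_max, value))
    return round(clipped * levels / q_max) * q_max / levels


def winner_take_all(values):
    """Software stand-in for the lateral-inhibition group: only the maximum survives."""
    return max(values)


def update_synapses(q, state, action, reward, max_next_q, bits):
    """Send the same signals to every synapse; only the selected one updates."""
    target = reward + GAMMA * max_next_q  # globally transmitted signal
    for s in range(N_STATES):
        for a in range(N_ACTIONS):
            # Local learning-selection signal: true only for the one-hot (s, a) pair.
            selected = (s == state) and (a == action)
            if selected:
                q[s][a] = quantize(q[s][a] + ALPHA * (target - q[s][a]), bits)


def step_chain(state, action):
    """Hypothetical toy environment: action 1 moves right, action 0 moves left.
    Reaching the rightmost state gives reward 1 and ends the episode."""
    if action == 1:
        next_state = min(N_STATES - 1, state + 1)
    else:
        next_state = max(0, state - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, bits=8, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # one "synapse" per (s, a)
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # step cap per episode
            if rng.random() < EPSILON:
                action = rng.randrange(N_ACTIONS)
            else:
                best = winner_take_all(q[state])
                ties = [a for a in range(N_ACTIONS) if q[state][a] == best]
                action = rng.choice(ties)  # break ties randomly
            next_state, reward, done = step_chain(state, action)
            # The (state, action) pair is held until the next-state maximum is
            # available, which is the role the delay circuit plays in hardware.
            max_next_q = 0.0 if done else winner_take_all(q[next_state])
            update_synapses(q, state, action, reward, max_next_q, bits)
            if done:
                break
            state = next_state
    return q


if __name__ == "__main__":
    for row in train():
        print([round(v, 3) for v in row])
```

The loop over every table entry in `update_synapses` is deliberately redundant in software. It mirrors the idea that all synapses receive the same transmitted signals and each decides locally whether to learn. Lowering `bits` is one way to explore how coarser Q-value storage affects learning in this toy setting.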