Jeong, Dayena; Park, Jaewoo; Jo, Jeonghee; Park, Jongkil; Kim, Jae wook; Jang, Hyun Jae; Lee, Su youn; Park, Seongsik
2025-02-21T01:00:27Z 2025-02-21T01:00:27Z 2025-02-11 2024-08-03
https://pubs.kist.re.kr/handle/201004/151779

Objective: Recent deep neural networks (DNNs), such as diffusion models [1], have high computational demands. Spiking neural networks (SNNs) have therefore attracted considerable attention as energy-efficient neural networks. However, conventional spiking neurons, such as leaky integrate-and-fire neurons, cannot accurately represent complex non-linear activation functions such as Swish [2]. Few spikes (FS) neurons were proposed to approximate activation functions with spiking neurons [3], but their approximation performance has been limited by the lack of training methods tailored to these neurons. We therefore propose tendency-based parameter initialization (TBPI), which improves the approximation of activation functions with FS neurons by exploiting temporal dependencies when initializing the trainable parameters.

Method: Our method considers the temporal dependencies of the parameters in FS neurons when setting their initial values for training. Motivated by the fact that spiking neurons should operate continuously over time, we initialize the parameters so that they have a temporal dependency. The method consists of three main steps. The first step is pre-training, in which locally optimal parameters are obtained by training from random initial values. The parameters h(t), d(t), and T(t) (Fig. 1) represent the reset value subtracted from the membrane potential, the weight of the output spikes, and the threshold, respectively. The second step is function fitting, which captures the temporal relationship of the pre-trained parameters, as shown in Fig. 2. The third step extracts the initial value of each parameter from the fitted function at each time step t. The parameters are then trained starting from this initialization.
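The FS-neuron parameters and the fitting and extraction steps of the Method can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the update rule follows the usual FS-neuron formulation (spike when the membrane potential exceeds T(t), subtract h(t), accumulate d(t)), the function names fs_neuron and tendency_init are hypothetical, and a polynomial is used as the fitted function only because the abstract does not name the function family. Pre-training (step one) is not shown; the "pre-trained" values below are synthetic stand-ins.

```python
import numpy as np

def fs_neuron(x, h, d, T):
    """Run an FS neuron for K = len(T) time steps on inputs x.

    Assumed update rule (standard FS-neuron formulation):
      z(t) = 1 if v(t) > T(t) else 0
      v(t+1) = v(t) - h(t) * z(t)
      output = sum_t d(t) * z(t)
    """
    v = np.array(x, dtype=float)      # membrane potential starts at the input
    out = np.zeros_like(v)
    for h_t, d_t, T_t in zip(h, d, T):
        z = (v > T_t).astype(float)   # spike where the threshold is exceeded
        out += d_t * z                # weighted output spike
        v -= h_t * z                  # reset by subtraction
    return out

def tendency_init(pretrained, degree=2):
    """Steps two and three: fit a function of t to pre-trained values,
    then read the initial value for each time step off the fitted curve.
    The polynomial family and its degree are illustrative assumptions."""
    pretrained = np.asarray(pretrained, dtype=float)
    t = np.arange(len(pretrained))
    coeffs = np.polyfit(t, pretrained, degree)
    return np.polyval(coeffs, t)

# Sanity check of the neuron: with h = d = T = powers of two, the FS neuron
# performs a binary decomposition, so 5.3 and 12.7 decode to 5 and 12.
K = 4
binary = 2.0 ** np.arange(K - 1, -1, -1)   # [8, 4, 2, 1]
print(fs_neuron([5.3, 12.7], binary, binary, binary))

# Synthetic stand-in for noisy pre-trained thresholds (not real results).
rng = np.random.default_rng(0)
pretrained_T = binary + rng.normal(0.0, 0.3, size=K)
print(tendency_init(pretrained_T, degree=2))
```

The same tendency_init call would be applied separately to h(t), d(t), and T(t), and the returned arrays would then serve as the starting point for the final training run.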
Results: We validated the proposed method on the approximation of Swish at the neuron level and on a diffusion model at the network level. For a fair comparison, we compared the approximation performance of three initializations: random initial values, pre-trained values with added Gaussian noise, and our method. We also evaluated network-level performance on a diffusion model using the neurons trained with each method. In our experiments, TBPI approximates the Swish activation more accurately at the neuron level (Tab. 1), which leads to improved performance of the diffusion model (Tab. 2).

Conclusion: TBPI improves generalization in the training of FS neurons through parameter initialization and shows potential for other non-linear activation functions, such as GELU, which is used in Transformer architectures. It can therefore help pave the way to energy-efficient artificial intelligence by enabling various deep learning models to be implemented as deep SNNs.

English IJCAI A More Accurate Approximation of Activation Function with Few Spikes Neurons Conference 1 International Joint Conference on Artificial Intelligence, v.00, no.00 International Joint Conference on Artificial Intelligence 00 00 KO ICC Jeju 2024-08 Workshop on Human Brain and Artificial Intelligence