Conference paper
EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference
Author(s) | Gao, Chang; Ríos Navarro, José Antonio; Chen, Xi; Delbruck, Tobi; Liu, Shih-Chii |
Department | Universidad de Sevilla. Departamento de Arquitectura y Tecnología de Computadores |
Publication date | 2020-09 |
Deposit date | 2023-04-03 |
ISBN/ISSN | 978-1-7281-4923-3 (print); 978-1-7281-4922-6 (online) |
Abstract | This paper presents a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) accelerator called EdgeDRNN designed for portable edge computing. EdgeDRNN adopts the spiking neural network inspired delta network algorithm to exploit temporal sparsity in RNNs. It reduces off-chip memory access by a factor of up to 10x with tolerable accuracy loss. Experimental results on a 10 million parameter 2-layer GRU-RNN, with weights stored in DRAM, show that EdgeDRNN computes them in under 0.5 ms. With 2.42 W wall plug power on an entry-level USB-powered FPGA board, it achieves latency comparable with a 92 W Nvidia 1080 GPU. It outperforms NVIDIA Jetson Nano, Jetson TX2 and Intel Neural Compute Stick 2 in latency by 6X. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall plug power efficiency that is over 4X higher than all other platforms. |
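The delta network algorithm mentioned in the abstract skips computation on input and state elements whose value has not changed by more than a threshold since the last update, so matrix-vector products touch only the changed entries. A minimal NumPy sketch of that idea follows; the function name, threshold value, and interface are illustrative, not taken from the paper or its hardware implementation:

```python
import numpy as np

def delta_matvec(W, x, state, theta=0.1):
    """Incrementally maintain m = W @ x_hat, where x_hat tracks x
    only through changes larger than the delta threshold theta.
    state is (x_prev, m); theta=0.1 is a hypothetical value."""
    x_prev, m = state
    diff = x - x_prev
    # zero out sub-threshold changes: these entries cost no memory access
    delta = np.where(np.abs(diff) >= theta, diff, 0.0)
    m = m + W @ delta          # only nonzero columns of delta contribute
    x_prev = x_prev + delta    # remember the last transmitted values
    return m, (x_prev, m)
```

In hardware, the sparsity of `delta` is what cuts off-chip weight fetches: columns of `W` matching zero entries are never read, at the cost of the approximation error bounded by `theta`.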
Citation | Gao, C., Ríos Navarro, J.A., Chen, X., Delbruck, T. & Liu, S. (2020). EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference. In 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2020) (pp. 41-45), Genoa, Italy: IEEE Xplore. |
Files | Size | Format | Description
---|---|---|---
EdgeDRNN_Enabling_Low-latency_ ... | 210.0 KB | PDF | |