Presentation
EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference
Author/s | Gao, Chang; Ríos Navarro, José Antonio; Chen, Xi; Delbruck, Tobi; Liu, Shih-Chii |
Department | Universidad de Sevilla. Departamento de Arquitectura y Tecnología de Computadores |
Publication Date | 2020-09 |
Deposit Date | 2023-04-03 |
Published in | 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2020) |
ISBN/ISSN | 978-1-7281-4923-3 (print) 978-1-7281-4922-6 (online) |
Abstract | This paper presents a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) accelerator called EdgeDRNN designed for portable edge computing. EdgeDRNN adopts the spiking neural network inspired delta network algorithm to exploit temporal sparsity in RNNs. It reduces off-chip memory access by a factor of up to 10X with tolerable accuracy loss. Experimental results on a 10 million parameter 2-layer GRU-RNN, with weights stored in DRAM, show that EdgeDRNN computes an inference in under 0.5 ms. With 2.42 W wall plug power on an entry-level USB powered FPGA board, it achieves latency comparable with a 92 W Nvidia 1080 GPU. It outperforms NVIDIA Jetson Nano, Jetson TX2 and Intel Neural Compute Stick 2 in latency by 6X. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall plug power efficiency that is over 4X higher than all other platforms. |
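The delta network algorithm mentioned in the abstract skips computation on input components that have changed little since the last update, which is what produces the temporal sparsity EdgeDRNN exploits. The following is a minimal NumPy sketch of that idea, not the EdgeDRNN implementation; the threshold value, array sizes, and function name are illustrative assumptions.

```python
import numpy as np

def delta_update(x, x_mem, threshold=0.1):
    """Illustrative delta-network step (not the EdgeDRNN hardware pipeline).

    Only components whose change since the last *transmitted* value exceeds
    the threshold are propagated; the rest are zeroed, so the corresponding
    columns of the weight matrix need not be fetched or multiplied.
    """
    delta = x - x_mem
    mask = np.abs(delta) >= threshold          # components that changed enough
    delta_sparse = np.where(mask, delta, 0.0)  # zeros mark skippable columns
    x_mem = np.where(mask, x, x_mem)           # update memory only where sent
    return delta_sparse, x_mem

# Incremental matrix-vector product: W @ x1 is approximated by
# y_prev + W @ delta_sparse, skipping columns where delta_sparse == 0.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
x0 = rng.standard_normal(8)
x1 = x0 + rng.normal(0.0, 0.05, 8)  # slowly varying input, so most deltas are small

y = W @ x0                           # full product once at start
x_mem = x0.copy()
d, x_mem = delta_update(x1, x_mem, threshold=0.1)
y = y + W @ d                        # cheap update; error bounded by the threshold
print(np.count_nonzero(d), "of", d.size, "columns computed")
```

Because every skipped component differs from its stored value by less than the threshold, the per-row error of `y` versus the exact `W @ x1` is bounded by the threshold times the row's absolute weight sum, which is the "tolerable accuracy loss" traded for the reduced memory access.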
Citation | Gao, C., Ríos Navarro, J.A., Chen, X., Delbruck, T. and Liu, S. (2020). EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference. In 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2020) (pp. 41-45), Genoa (Italy): IEEE Xplore. |
Files | Size | Format | View | Description |
---|---|---|---|---|
EdgeDRNN_Enabling_Low-latency_ ... | 210.0Kb | PDF | View | |