Show simple item record

Article

dc.creator | Aimar, Alessandro | es
dc.creator | Mostafa, Hesham | es
dc.creator | Calabrese, Enrico | es
dc.creator | Ríos Navarro, José Antonio | es
dc.creator | Tapiador Morales, Ricardo | es
dc.creator | Lungu, Iulia-Alexandra | es
dc.creator | Milde, Moritz B. | es
dc.creator | Corradi, Federico | es
dc.creator | Linares Barranco, Alejandro | es
dc.creator | Liu, Shih-Chii | es
dc.creator | Delbruck, Tobi | es
dc.date.accessioned | 2020-01-31T11:54:40Z
dc.date.available | 2020-01-31T11:54:40Z
dc.date.issued | 2019
dc.identifier.citation | Aimar, A., Mostafa, H., Calabrese, E., Ríos Navarro, J.A., Tapiador Morales, R., Lungu, I., ..., Delbruck, T. (2019). NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps. IEEE Transactions on Neural Networks and Learning Systems, 30 (3), 644-656.
dc.identifier.issn | 2162-237X | es
dc.identifier.uri | https://hdl.handle.net/11441/92660
dc.description.abstract | Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though graphics processing units (GPUs) are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture called NullHop that implements SOA CNNs useful for low-power and low-latency application scenarios. NullHop exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across kernel sizes ranging from 1×1 to 7×7. NullHop can process up to 128 input and 128 output feature maps per layer in a single pass. We implemented the proposed architecture on a Xilinx Zynq FPGA platform and present results showing how our implementation reduces external memory transfers and compute time in five different CNNs ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. Post-synthesis simulations using Mentor ModelSim in a 28 nm process with a clock frequency of 500 MHz show that the VGG19 network achieves over 450 GOp/s. By exploiting sparsity, NullHop achieves an efficiency of 368%, maintains over 98% utilization of the MAC units, and achieves a power efficiency of over 3 TOp/s/W in a core area of 6.3 mm². As further proof of NullHop's usability, we interfaced its FPGA implementation with a neuromorphic event camera for real-time interactive demonstrations. | es
dc.format | application/pdf | es
dc.language.iso | eng | es
dc.publisher | IEEE Computer Society | es
dc.relation.ispartof | IEEE Transactions on Neural Networks and Learning Systems, 30 (3), 644-656.
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 Internacional
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject | Convolutional Neural Networks (CNN) | es
dc.subject | VLSI | es
dc.subject | FPGA | es
dc.subject | Computer vision | es
dc.subject | Artificial intelligence | es
dc.title | NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps | es
dc.type | info:eu-repo/semantics/article | es
dcterms.identifier | https://ror.org/03yxnpp24
dc.type.version | info:eu-repo/semantics/submittedVersion | es
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Arquitectura y Tecnología de Computadores | es
dc.relation.publisherversion | https://ieeexplore.ieee.org/document/8421093 | es
dc.identifier.doi | 10.1109/TNNLS.2018.2852335 | es
dc.contributor.group | Universidad de Sevilla. TEP-108: Robótica y Tecnología de Computadores Aplicada a la Rehabilitación | es
idus.format.extent | 13 | es
dc.journaltitle | IEEE Transactions on Neural Networks and Learning Systems | es
dc.publication.volumen | 30 | es
dc.publication.issue | 3 | es
dc.publication.initialPage | 644 | es
dc.publication.endPage | 656 | es
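
The abstract above describes the core idea behind NullHop: CNN activations (e.g. after ReLU) are largely zero, so storing only the non-zero values of a feature map reduces external memory traffic, and skipping multiply-accumulates on zeros raises effective throughput. As a rough, software-only illustration of that idea, here is a minimal NumPy sketch that uses a per-pixel non-zero mask plus a packed list of non-zero values. The encoding and the function names below (compress_feature_map, sparse_conv2d, etc.) are illustrative assumptions for this sketch, not NullHop's actual on-chip format or pipeline.

```python
# Illustrative sketch only: a software analogue of the sparsity idea described
# in the abstract. The per-pixel bitmask + packed non-zero values encoding is a
# generic sparse-map scheme, NOT a claim about NullHop's exact hardware format.
import numpy as np


def compress_feature_map(fmap):
    """Encode a 2-D feature map as (non-zero mask, packed non-zero values)."""
    mask = fmap != 0            # conceptually one bit per pixel
    values = fmap[mask]         # only non-zero activations are stored
    return mask, values


def decompress_feature_map(mask, values):
    """Reconstruct the dense feature map from the sparse encoding."""
    fmap = np.zeros(mask.shape, dtype=values.dtype)
    fmap[mask] = values
    return fmap


def sparse_conv2d(mask, values, kernel, fmap_shape):
    """Valid 2-D correlation that issues MACs only for non-zero activations."""
    kh, kw = kernel.shape
    oh, ow = fmap_shape[0] - kh + 1, fmap_shape[1] - kw + 1
    out = np.zeros((oh, ow))
    # Each non-zero activation is scattered to every output position it
    # contributes to; zero activations cost no multiply-accumulates at all.
    for (r, c), v in zip(np.argwhere(mask), values):
        for i in range(kh):
            for j in range(kw):
                orow, ocol = r - i, c - j
                if 0 <= orow < oh and 0 <= ocol < ow:
                    out[orow, ocol] += v * kernel[i, j]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy feature map with roughly 70% zeros, mimicking post-ReLU sparsity.
    fmap = rng.random((8, 8)) * (rng.random((8, 8)) > 0.7)
    kernel = rng.random((3, 3))

    mask, values = compress_feature_map(fmap)
    assert np.array_equal(decompress_feature_map(mask, values), fmap)

    # Dense reference (valid correlation) versus the zero-skipping version.
    dense = np.array([[np.sum(fmap[r:r + 3, c:c + 3] * kernel)
                       for c in range(6)] for r in range(6)])
    sparse = sparse_conv2d(mask, values, kernel, fmap.shape)
    assert np.allclose(dense, sparse)
    print("stored values:", values.size, "of", fmap.size)
```

Under this toy encoding, a map that is 70% zero stores only about 30% of its activation words plus one mask bit per pixel, and the convolution loop performs MACs only for the stored values; this is the general kind of memory-transfer and compute reduction the abstract attributes to exploiting activation sparsity.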

Files | Size | Format | View | Description
NullHop.pdf | 5.487 MB | PDF | View/Open |

This item appears in the following collection(s)

Attribution-NonCommercial-NoDerivatives 4.0 International
Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International