Conference paper

dc.creator: Ríos Navarro, José Antonio [es]
dc.creator: Tapiador Morales, Ricardo [es]
dc.creator: Jiménez Fernández, Ángel Francisco [es]
dc.creator: Domínguez Morales, Manuel Jesús [es]
dc.creator: Amaya Rodríguez, Claudio Antonio [es]
dc.creator: Linares Barranco, Alejandro [es]
dc.date.accessioned: 2020-01-29T08:50:20Z
dc.date.available: 2020-01-29T08:50:20Z
dc.date.issued: 2018
dc.identifier.citation: Rios Navarro, A., Tapiador Morales, R., Jiménez Fernández, Á.F., Domínguez Morales, M.J., Amaya Rodríguez, C.A. and Linares Barranco, A. (2018). Performance evaluation over HW/SW co-design SoC memory transfers for a CNN accelerator. In IEEE-NANO 2018: 18th International Conference on Nanotechnology. Cork, Ireland: IEEE Computer Society.
dc.identifier.isbn: 978-1-5386-5336-4 [es]
dc.identifier.issn: 1944-9380 [es]
dc.identifier.uri: https://hdl.handle.net/11441/92448
dc.description.abstract: Many FPGA vendors have recently included embedded processors in their devices, like Xilinx with ARM Cortex-A cores, together with programmable logic cells. These devices are known as Programmable System on Chip (PSoC). Their ARM cores (embedded in the processing system, or PS) communicate with the programmable logic cells (PL) using ARM-standard AXI buses. In this paper we analyze the performance of exhaustive data transfers between PS and PL for a Xilinx Zynq FPGA in a real co-design scenario for a Convolutional Neural Network (CNN) accelerator, which processes, in dedicated hardware, a stream of visual information from a neuromorphic visual sensor for classification. On the PS side, a Linux operating system is running, which collects visual events from the neuromorphic sensor into a normalized frame, transfers these frames to the multi-layer CNN accelerator, and reads back the results, using an AXI-DMA bus on a per-layer basis. As this kind of accelerator tries to process information as quickly as possible, data bandwidth becomes critical, and maintaining a well-balanced data throughput rate requires some considerations. We present and evaluate several data partitioning techniques to improve the balance between RX and TX transfers, and two different ways of managing transfers: through a polling routine at the user level of the OS, and through a dedicated interrupt-based kernel-level driver. We demonstrate that, for sufficiently long packets, the kernel-level driver solution achieves better timing when computing a CNN classification example. The main advantages of using a kernel-level driver are safer solutions and OS task scheduling to manage other important processes for our application, such as frame collection from sensors and their normalization. [es]
dc.description.sponsorship: Ministerio de Economía y Competitividad TEC2016-77785-P [es]
dc.format: application/pdf [es]
dc.language.iso: eng [es]
dc.publisher: IEEE Computer Society [es]
dc.relation.ispartof: IEEE-NANO 2018: 18th International Conference on Nanotechnology (2018),
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.title: Performance evaluation over HW/SW co-design SoC memory transfers for a CNN accelerator [es]
dc.type: info:eu-repo/semantics/conferenceObject [es]
dcterms.identifier: https://ror.org/03yxnpp24
dc.type.version: info:eu-repo/semantics/submittedVersion [es]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es]
dc.contributor.affiliation: Universidad de Sevilla. Departamento de Arquitectura y Tecnología de Computadores [es]
dc.relation.projectID: TEC2016-77785-P [es]
dc.relation.publisherversion: https://ieeexplore.ieee.org/document/8626313 [es]
dc.identifier.doi: 10.1109/NANO.2018.8626313 [es]
dc.contributor.group: Universidad de Sevilla. TEP-108: Robótica y Tecnología de Computadores Aplicada a la Rehabilitación [es]
idus.format.extent: 5 [es]
dc.eventtitle: IEEE-NANO 2018: 18th International Conference on Nanotechnology [es]
dc.eventinstitution: Cork, Ireland [es]
dc.relation.publicationplace: New York, USA [es]

File: Performance evaluation over HW-SW ... | Size: 949.4 Kb | Format: PDF

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International