dc.creator | Tapiador Morales, Ricardo | es |
dc.creator | Ríos Navarro, José Antonio | es |
dc.creator | Linares Barranco, Alejandro | es |
dc.creator | Kim, Minkyu | es |
dc.creator | Kadetotad, Deepak | es |
dc.creator | Seo, Jae-sun | es |
dc.date.accessioned | 2020-02-13T11:45:20Z | |
dc.date.available | 2020-02-13T11:45:20Z | |
dc.date.issued | 2017 | |
dc.identifier.citation | Tapiador Morales, R., Rios Navarro, A., Linares Barranco, A., Kim, M., Kadetotad, D. and Seo, J. (2017). Comprehensive Evaluation of OpenCL-Based CNN Implementations for FPGAs. In IWANN 2017: 14th International Work-Conference on Artificial Neural Networks (271-282), Cadiz, Spain: Springer. | |
dc.identifier.isbn | 978-3-319-59146-9 | es |
dc.identifier.issn | 0302-9743 | es |
dc.identifier.uri | https://hdl.handle.net/11441/93008 | |
dc.description.abstract | Deep learning has significantly advanced the state of the
art in artificial intelligence, gaining wide popularity from both industry
and academia. Special interest is around Convolutional Neural Networks
(CNN), which take inspiration from the hierarchical structure
of the visual cortex, to form deep layers of convolutional operations,
along with fully connected classifiers. Hardware implementations of these
deep CNN architectures are challenged by memory bottlenecks, because their
many convolutional and fully connected layers demand a large amount of
communication for parallel computation. Multi-core CPU-based
solutions have demonstrated their inadequacy for this problem
due to the memory wall and low parallelism. Many-core GPU architectures
show superior performance but they consume high power and also
have memory constraints due to inconsistencies between cache and main
memory. OpenCL is commonly used to describe these architectures for
their execution on GPGPUs or FPGAs. FPGA design solutions are also
actively being explored, which allow implementing the memory hierarchy
using embedded parallel BlockRAMs. This boosts the parallel use
of shared memory elements between multiple processing units, avoiding
data replicability and inconsistencies. This makes FPGAs potentially
powerful solutions for real-time classification with CNNs. In this
paper, the OpenCL co-design frameworks adopted by both Altera and Xilinx
for pseudo-automatic development are evaluated. A comprehensive
evaluation and comparison for a 5-layer deep CNN is presented.
Hardware resources, temporal performance and the OpenCL architecture
for CNNs are discussed. Xilinx demonstrates faster synthesis, better
FPGA resource utilization and more compact boards. Altera provides
multi-platform tools, a mature design community and better execution
times. | es |
dc.description.sponsorship | Ministerio de Economía y Competitividad TEC2016-77785-P | es |
dc.format | application/pdf | es |
dc.language.iso | eng | es |
dc.publisher | Springer | es |
dc.relation.ispartof | IWANN 2017: 14th International Work-Conference on Artificial Neural Networks (2017), p 271-282 | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Deep learning | es |
dc.subject | Convolutional Neural Networks (CNN) | es |
dc.subject | Hardware Acceleration | es |
dc.subject | OpenCL | es |
dc.subject | FPGA | es |
dc.subject | Caffe | es |
dc.subject | Xilinx | es |
dc.subject | Altera | es |
dc.title | Comprehensive Evaluation of OpenCL-Based CNN Implementations for FPGAs | es |
dc.type | info:eu-repo/semantics/conferenceObject | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/submittedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Arquitectura y Tecnología de Computadores | es |
dc.relation.projectID | TEC2016-77785-P | es |
dc.relation.publisherversion | https://link.springer.com/chapter/10.1007/978-3-319-59147-6_24 | es |
dc.identifier.doi | 10.1007/978-3-319-59147-6_24 | es |
dc.contributor.group | Universidad de Sevilla. TEP-108: Robótica y Tecnología de Computadores Aplicada a la Rehabilitación | es |
idus.format.extent | 12 | es |
dc.publication.initialPage | 271 | es |
dc.publication.endPage | 282 | es |
dc.eventtitle | IWANN 2017: 14th International Work-Conference on Artificial Neural Networks | es |
dc.eventinstitution | Cadiz, Spain | es |
dc.relation.publicationplace | Berlin | es |