Show simple item record

Article

dc.creator: Rendón Segador, Fernando José [es]
dc.creator: Álvarez García, Juan Antonio [es]
dc.creator: Enríquez de Salamanca Ros, Fernando [es]
dc.creator: Deniz, Oscar [es]
dc.date.accessioned: 2021-09-08T09:41:22Z
dc.date.available: 2021-09-08T09:41:22Z
dc.date.issued: 2021
dc.identifier.citation: Rendón Segador, F.J., Álvarez García, J.A., Enríquez de Salamanca Ros, F. y Deniz, O. (2021). ViolenceNet: Dense Multi-Head Self-Attention with Bidirectional Convolutional LSTM for Detecting Violence. Electronics, 10 (13), 1-16.
dc.identifier.issn: 2079-9292 [es]
dc.identifier.uri: https://hdl.handle.net/11441/125566
dc.description.abstract: Introducing efficient automatic violence detection into video surveillance or audiovisual content monitoring systems would greatly facilitate the work of closed-circuit television (CCTV) operators, rating agencies, and those in charge of monitoring social network content. In this paper we present a new deep learning architecture that combines an adapted three-dimensional version of DenseNet, a multi-head self-attention layer and a bidirectional convolutional long short-term memory (LSTM) module to encode relevant spatio-temporal features and determine whether a video is violent or not. Furthermore, an ablation study of the input frames, comparing dense optical flow against adjacent-frame subtraction and measuring the influence of the attention layer, shows that the combination of optical flow and the attention mechanism improves accuracy by up to 4.4%. Experiments conducted on four of the most widely used datasets for this problem match or, in some cases, exceed the state-of-the-art results while reducing the number of network parameters needed (4.5 million) and increasing efficiency, both in test accuracy (from 95.6% on the most complex dataset to 100% on the simplest one) and in inference time (less than 0.3 s for the longest clips). Finally, to check whether the generated model is able to generalize violence, a cross-dataset analysis is performed, which shows the complexity of this approach: training on three datasets and testing on the remaining one, accuracy drops to 70.08% in the worst case and 81.51% in the best case, which points to future work oriented towards anomaly detection on new datasets. [es]
dc.description.sponsorship: Ministerio de Economía y Competitividad TIN2017-82113-C2-1-R [es]
dc.description.sponsorship: Ministerio de Economía y Competitividad TIN2017-82113-C2-2-R [es]
dc.format: application/pdf [es]
dc.format.extent: 16 [es]
dc.language.iso: eng [es]
dc.publisher: MDPI [es]
dc.relation.ispartof: Electronics, 10 (13), 1-16.
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: violence detection [es]
dc.subject: fight detection [es]
dc.subject: deep learning [es]
dc.subject: dense net [es]
dc.subject: bidirectional ConvLSTM [es]
dc.title: ViolenceNet: Dense Multi-Head Self-Attention with Bidirectional Convolutional LSTM for Detecting Violence [es]
dc.type: info:eu-repo/semantics/article [es]
dcterms.identifier: https://ror.org/03yxnpp24
dc.type.version: info:eu-repo/semantics/publishedVersion [es]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es]
dc.contributor.affiliation: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos [es]
dc.relation.projectID: TIN2017-82113-C2-1-R [es]
dc.relation.projectID: TIN2017-82113-C2-2-R [es]
dc.relation.publisherversion: https://www.mdpi.com/2079-9292/10/13/1601/htm [es]
dc.identifier.doi: 10.3390/electronics10131601 [es]
dc.contributor.group: Universidad de Sevilla. TIC-134: Sistemas Informáticos [es]
dc.journaltitle: Electronics [es]
dc.publication.volumen: 10 [es]
dc.publication.issue: 13 [es]
dc.publication.initialPage: 1 [es]
dc.publication.endPage: 16 [es]
dc.contributor.funder: Ministerio de Economía y Competitividad (MINECO). España [es]
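The abstract above describes a pipeline in which dense optical flow feeds an adapted 3D DenseNet, a multi-head self-attention layer and a bidirectional ConvLSTM. As a rough illustration only, here is a minimal sketch in Python (TensorFlow/Keras and OpenCV); it is not the authors' implementation. The backbone is reduced to two plain Conv3D blocks (the real network is a roughly 4.5-million-parameter adapted DenseNet), and the clip length, layer sizes and function names (dense_optical_flow, build_sketch_model) are illustrative assumptions.

import numpy as np
import cv2
from tensorflow.keras import layers, models

def dense_optical_flow(frames):
    # Farneback dense optical flow between consecutive grayscale frames.
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    flows = []
    for frame in frames[1:]:
        nxt = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)      # (H, W, 2): horizontal and vertical motion
        prev = nxt
    return np.stack(flows)      # (T-1, H, W, 2)

def build_sketch_model(t=20, h=64, w=64):
    inp = layers.Input(shape=(t, h, w, 2))   # a clip of optical-flow fields
    # Stand-in for the adapted 3D DenseNet backbone (simplified to two
    # plain Conv3D blocks; the real backbone uses dense connectivity).
    x = layers.Conv3D(32, 3, padding="same", activation="relu")(inp)
    x = layers.MaxPooling3D(pool_size=(1, 2, 2))(x)
    x = layers.Conv3D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling3D(pool_size=(1, 2, 2))(x)
    # Multi-head self-attention over the temporal axis: every time step
    # of the clip attends to every other time step.
    _, t2, h2, w2, c = x.shape
    seq = layers.Reshape((t2, h2 * w2 * c))(x)
    att = layers.MultiHeadAttention(num_heads=4, key_dim=64)(seq, seq)
    x = layers.Reshape((t2, h2, w2, c))(att)
    # Bidirectional ConvLSTM encodes the clip in both time directions.
    x = layers.Bidirectional(
        layers.ConvLSTM2D(32, 3, padding="same", return_sequences=False))(x)
    x = layers.GlobalAveragePooling2D()(x)
    out = layers.Dense(1, activation="sigmoid")(x)   # violent vs. non-violent
    return models.Model(inp, out)

Per the ablation reported in the abstract, it is the combination of optical-flow input (rather than adjacent-frame subtraction) with the attention layer that accounts for the accuracy gain of up to 4.4%.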

Files: ViolenceNet Dense Multi-Head ... (PDF, 4.547 MB)

Attribution-NonCommercial-NoDerivatives 4.0 International
Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International