CrimeNet: Neural Structured Learning using Vision Transformer for violence detection

Rendón Segador, Fernando José; Álvarez García, Juan Antonio; Salazar González, José Luis; Tommasi, Tatiana

doi:https://doi.org/10.1016/j.neunet.2023.01.048

Article

dc.creator	Rendón Segador, Fernando José	es
dc.creator	Álvarez García, Juan Antonio	es
dc.creator	Salazar González, José Luis	es
dc.creator	Tommasi, Tatiana	es
dc.date.accessioned	2024-02-16T10:20:47Z
dc.date.available	2024-02-16T10:20:47Z
dc.date.issued	2023-04-01
dc.identifier.issn	1879-2782	es
dc.identifier.uri	https://hdl.handle.net/11441/155298
dc.description.abstract	The state of the art in violence detection in videos has improved in recent years thanks to deep learning models, but it remains below 90% average precision on the most complex datasets. This may lead to frequent false alarms in video surveillance environments and may cause security guards to disable the artificial intelligence system. In this study, we propose a new neural network based on Vision Transformer (ViT) and Neural Structured Learning (NSL) with adversarial training. This network, called CrimeNet, outperforms previous works by a large margin and reduces false positives to practically zero. Our tests on the four most challenging violence-related datasets (binary and multi-class) show the effectiveness of CrimeNet, improving the state of the art by 9.4 to 22.17 percentage points in ROC AUC depending on the dataset. In addition, we present a generalisation study of our model by training and testing it on different datasets. The results show that CrimeNet improves over competing methods by between 12.39 and 25.22 percentage points, showing remarkable robustness.	es
dc.description.sponsorship	Ministerio de Ciencia e Innovación PID2021-126359OB-I00	es
dc.description.sponsorship	Unión Europea, Ministerio de Ciencia e Innovación PDC2021-121197	es
dc.format	application/pdf	es
dc.format.extent	12	es
dc.language.iso	eng	es
dc.publisher	Elsevier	es
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Deep learning	es
dc.subject	Neural Structured Learning	es
dc.subject	Vision Transformer	es
dc.subject	Violence detection	es
dc.subject	Adversarial Learning	es
dc.title	CrimeNet: Neural Structured Learning using Vision Transformer for violence detection	es
dc.type	info:eu-repo/semantics/article	es
dc.type.version	info:eu-repo/semantics/publishedVersion	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.contributor.affiliation	Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos	es
dc.relation.publisherversion	https://doi.org/10.1016/j.neunet.2023.01.048	es
dc.identifier.doi	https://doi.org/10.1016/j.neunet.2023.01.048	es
dc.contributor.group	Universidad de Sevilla. TIC134: Sistemas Informáticos	es
dc.journaltitle	Neural Networks	es
dc.publication.volumen	161	es
dc.publication.initialPage	318	es
dc.publication.endPage	329	es

Files	Size	Format	View	Description
1-s2.0-S0893608023000606-main ...	1.916Mb	[PDF]	View/Open

This record appears in the following collections

Articles (Lenguajes y Sistemas Informáticos)

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International