dc.creator | Rendón Segador, Fernando José | es |
dc.creator | Álvarez García, Juan Antonio | es |
dc.creator | Salazar González, José Luis | es |
dc.creator | Tommasi, Tatiana | es |
dc.date.accessioned | 2024-02-16T10:20:47Z | |
dc.date.available | 2024-02-16T10:20:47Z | |
dc.date.issued | 2023-04-01 | |
dc.identifier.issn | 1879-2782 | es |
dc.identifier.uri | https://hdl.handle.net/11441/155298 | |
dc.description.abstract | The state of the art in violence detection in videos has improved in recent years thanks to deep learning models, but average precision remains below 90% on the most complex datasets. This can lead to frequent false alarms in video surveillance environments and may cause security guards to disable the artificial intelligence system.
In this study, we propose a new neural network based on Vision Transformer (ViT) and Neural Structured Learning (NSL) with adversarial training. This network, called CrimeNet, outperforms previous works by a large margin and reduces false positives to practically zero. Our tests on the four most challenging violence-related datasets (binary and multi-class) show the effectiveness of CrimeNet, which improves the state of the art by between 9.4 and 22.17 percentage points in ROC AUC, depending on the dataset. In addition, we present a generalisation study of our model by training and testing it on different datasets. The results show that CrimeNet improves over competing methods by between 12.39 and 25.22 percentage points, demonstrating remarkable robustness. | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación PID2021-126359OB-I00 | es |
dc.description.sponsorship | Unión Europea, Ministerio de Ciencia e Innovación PDC2021-121197 | es |
dc.format | application/pdf | es |
dc.format.extent | 12 | es |
dc.language.iso | eng | es |
dc.publisher | Elsevier | es |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Deep learning | es |
dc.subject | Neural Structured Learning | es |
dc.subject | Vision Transformer | es |
dc.subject | Violence detection | es |
dc.subject | Adversarial Learning | es |
dc.title | CrimeNet: Neural Structured Learning using Vision Transformer for violence detection | es |
dc.type | info:eu-repo/semantics/article | es |
dc.type.version | info:eu-repo/semantics/publishedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos | es |
dc.relation.publisherversion | https://doi.org/10.1016/j.neunet.2023.01.048 | es |
dc.identifier.doi | https://doi.org/10.1016/j.neunet.2023.01.048 | es |
dc.contributor.group | Universidad de Sevilla. TIC134: Sistemas Informáticos | es |
dc.journaltitle | Neural Networks | es |
dc.publication.volumen | 161 | es |
dc.publication.initialPage | 318 | es |
dc.publication.endPage | 329 | es |