Article
CrimeNet: Neural Structured Learning using Vision Transformer for violence detection
Author(s) | Rendón Segador, Fernando José; Álvarez García, Juan Antonio; Salazar González, Jose Luis; Tommasi, Tatiana |
Department | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos |
Publication date | 2023-02-02 |
Deposit date | 2023-02-09 |
Published in | Neural Networks |
Abstract | The state of the art in violence detection in videos has improved in recent years thanks to deep learning models, but average precision remains below 90% on the most complex datasets. This can lead to frequent false alarms in video surveillance environments and may cause security guards to disable the artificial intelligence system. In this study, we propose a new neural network based on Vision Transformer (ViT) and Neural Structured Learning (NSL) with adversarial training. This network, called CrimeNet, outperforms previous works by a large margin and reduces false positives practically to zero. Our tests on the four most challenging violence-related datasets (binary and multi-class) show the effectiveness of CrimeNet, which improves the state of the art by 9.4 to 22.17 percentage points in ROC AUC, depending on the dataset. In addition, we present a generalisation study of our model, training and testing it on different datasets. The results show that CrimeNet improves over competing methods by between 12.39 and 25.22 percentage points, showing remarkable robustness. |
Funding agencies | Ministerio de Ciencia e Innovación (MICIN). España; European Union (EU) |
Project identifier | DISARM project - Grant n. PDC2021-121197; HORUS project - Grant n. PID2021-126359OB-I00 |
Citation | Rendón Segador, F.J., Álvarez García, J.A., Salazar González, J.L. and Tommasi, T. (2023). CrimeNet: Neural Structured Learning using Vision Transformer for violence detection. Neural Networks, February 2023, 1-24. https://doi.org/10.1016/j.neunet.2023.01.048. |
Files | Size | Format | View | Description |
---|---|---|---|---|
1-s2.0-S0893608023000606-main.pdf | 1.528Mb | [PDF] | View/ | |
This document is protected by intellectual and industrial property rights. Without prejudice to existing legal exemptions, its reproduction, distribution, public communication, or transformation is prohibited without the authorization of the rights holder, unless otherwise indicated.