Conference paper
Spam hierarchical clustering for campaigns spotting and topic-based classification [Poster]
Author(s) | Jáñez Martino, Francisco; Carofilis, Andrés; Alaiz Rodríguez, Rocío; González Castro, Víctor; Fidalgo, Eduardo; Alegre, Enrique |
Coordinator/Director | Varela Vaca, Ángel Jesús; Ceballos Guerrero, Rafael; Reina Quintero, Antonia María |
Publication date | 2024 |
Deposit date | 2024-08-27 |
Published in |
|
ISBN/ISSN | 978-84-09-62140-8 |
Abstract | This article focuses on the creation of multi-class classification systems for spam email in cybersecurity organizations, to prevent cyber-attacks and spam campaigns. We introduce two new subsets, SPEMC-15K-E and SPEMC-15K-S, comprising 14479 and 14992 spam emails in English and Spanish, respectively. Each subset is divided into eleven classes, defined using agglomerative hierarchical clustering. We evaluated sixteen pipelines, combining four text representation techniques (TF-IDF, Bag of Words, Word2Vec, and BERT) with four classifiers (Support Vector Machine, Naïve Bayes, Random Forest, and Logistic Regression). TF-IDF with Logistic Regression (LR) achieved the best results for English, with an F1-score of 0.953 and 94.6% accuracy. Similarly, TF-IDF with Naïve Bayes achieved the best results for Spanish, with an F1-score of 0.945 and 98.5% accuracy. Finally, TF-IDF with LR had the shortest processing time, classifying each email in an average of 2 ms for English and 2.2 ms for Spanish. |
Citation | Jáñez Martino, F., Carofilis, A., Alaiz Rodríguez, R., González Castro, V., Fidalgo, E. and Alegre, E. (2024). Spam hierarchical clustering for campaigns spotting and topic-based classification [Poster]. In Jornadas Nacionales de Investigación en Ciberseguridad (JNIC) (9th, 2024, Sevilla) (pp. 490-491). Sevilla: Universidad de Sevilla, Escuela Técnica Superior de Ingeniería Informática. |
Files | Size | Format | View | Description |
---|---|---|---|---|
JNIC24_508.pdf | 2.645Mb | [PDF] | View/ | |
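
The abstract states that the eleven classes were defined with agglomerative hierarchical clustering, but it does not say which features, distance, or linkage were used. The following is only a minimal sketch of the general technique, assuming TF-IDF vectors, cosine distance, and average linkage. The function names and the four toy emails are hypothetical and are not taken from the SPEMC-15K subsets.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised {term: weight} dict per non-empty document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption), so no term gets a zero weight.
        vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_distance(a, b):
    # Vectors are unit length, so the dot product is the cosine similarity.
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering over document indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)


# Hypothetical toy emails: two lottery-themed, two pharmacy-themed.
emails = [
    "win a free lottery prize now",
    "claim your free lottery prize today",
    "cheap pills online pharmacy discount",
    "discount pharmacy pills cheap order",
]
print(agglomerative(tfidf_vectors(emails), n_clusters=2))  # → [[0, 1], [2, 3]]
```

Each resulting group of indices can then be inspected and given a topic label, which is one plausible way clusters could become classes for a supervised pipeline such as TF-IDF with Logistic Regression.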