
Conference paper

dc.contributor.editor: Varela Vaca, Ángel Jesús [es]
dc.contributor.editor: Ceballos Guerrero, Rafael [es]
dc.contributor.editor: Reina Quintero, Antonia María [es]
dc.creator: Jáñez Martino, Francisco [es]
dc.creator: Carofilis, Andrés [es]
dc.creator: Alaiz Rodríguez, Rocío [es]
dc.creator: González Castro, Víctor [es]
dc.creator: Fidalgo, Eduardo [es]
dc.creator: Alegre, Enrique [es]
dc.date.accessioned: 2024-08-27T10:58:40Z
dc.date.available: 2024-08-27T10:58:40Z
dc.date.issued: 2024
dc.identifier.citation: Jáñez Martino, F., Carofilis, A., Alaiz Rodríguez, R., González Castro, V., Fidalgo, E. and Alegre, E. (2024). Spam hierarchical clustering for campaigns spotting and topic-based classification [Poster]. In Jornadas Nacionales de Investigación en Ciberseguridad (JNIC) (9ª.2024. Sevilla) (490-491), Sevilla: Universidad de Sevilla. Escuela Técnica Superior de Ingeniería Informática.
dc.identifier.isbn: 978-84-09-62140-8 [es]
dc.identifier.uri: https://hdl.handle.net/11441/162068
dc.description.abstract: This article focuses on the creation of multi-classification systems for spam email in cybersecurity organizations to prevent cyber-attacks and spam campaigns. We introduce two new subsets, SPEMC-15K-E and SPEMC-15K-S, comprising 14479 and 14992 spam emails in English and Spanish, respectively. These are divided into eleven classes, defined using agglomerative hierarchical clustering. We evaluated sixteen pipelines, combining text representation techniques (TF-IDF, Bag of Words, Word2Vec, and BERT) and classifiers (Support Vector Machine, Naïve Bayes, Random Forest, and Logistic Regression). TF-IDF with Logistic Regression (LR) achieved the best results for English, with an F1-score of 0.953 and 94.6% accuracy. Similarly, TF-IDF with Naïve Bayes achieved the best results for Spanish, with an F1-score of 0.945 and 98.5% accuracy. Finally, TF-IDF with LR had the shortest processing time, completing the classification in an average of 2 ms and 2.2 ms per email in English and Spanish, respectively. [es]
dc.format: application/pdf [es]
dc.format.extent: 2 [es]
dc.language.iso: eng [es]
dc.publisher: Universidad de Sevilla. Escuela Técnica Superior de Ingeniería Informática [es]
dc.relation.ispartof: Jornadas Nacionales de Investigación en Ciberseguridad (JNIC) (9ª.2024. Sevilla) (2024), pp. 490-491.
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Spam detection [es]
dc.subject: Multi-classification [es]
dc.subject: Image based spam [es]
dc.subject: Text classification [es]
dc.title: Spam hierarchical clustering for campaigns spotting and topic-based classification [Poster] [es]
dc.type: info:eu-repo/semantics/conferenceObject [es]
dc.type.version: info:eu-repo/semantics/publishedVersion [es]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es]
dc.publication.initialPage: 490 [es]
dc.publication.endPage: 491 [es]
dc.eventtitle: Jornadas Nacionales de Investigación en Ciberseguridad (JNIC) (9ª.2024. Sevilla) [es]
dc.eventinstitution: Sevilla [es]
dc.relation.publicationplace: Sevilla [es]
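
Illustrative note: the abstract reports that TF-IDF combined with Naïve Bayes gave the best results on the Spanish subset. The sketch below shows one way such a pipeline could look, using only the Python standard library. It is not the authors' implementation. The function names, the whitespace tokenizer, the smoothed IDF formula, the Laplace smoothing value, the four toy emails and the two topic labels ("pharmacy", "lottery") are all assumptions made for illustration; the record does not name the eleven classes or describe preprocessing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not given.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """Map a document to {term: tf * idf}; terms unseen in training are dropped."""
    tf = Counter(tokenize(doc))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def train(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on TF-IDF weights (Laplace smoothing alpha)."""
    idf = fit_idf(docs)
    weights = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        weights[label].update(tfidf(doc, idf))  # accumulate per-class term weights
    label_counts = Counter(labels)
    model = {}
    for label, term_weights in weights.items():
        total = sum(term_weights.values()) + alpha * len(idf)
        model[label] = {
            "prior": math.log(label_counts[label] / len(docs)),
            "loglik": {t: math.log((term_weights[t] + alpha) / total) for t in idf},
        }
    return idf, model


def predict(text, idf, model):
    """Return the label with the highest log-posterior score."""
    vec = tfidf(text, idf)
    scores = {
        label: params["prior"] + sum(w * params["loglik"][t] for t, w in vec.items())
        for label, params in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from SPEMC-15K-E or SPEMC-15K-S.
TRAIN_DOCS = [
    "cheap pills online pharmacy discount",
    "buy pills without prescription pharmacy",
    "you won the lottery claim your prize",
    "lottery winner claim prize money now",
]
TRAIN_LABELS = ["pharmacy", "pharmacy", "lottery", "lottery"]

IDF, MODEL = train(TRAIN_DOCS, TRAIN_LABELS)
print(predict("discount pharmacy pills", IDF, MODEL))
print(predict("claim your lottery prize", IDF, MODEL))
```

A real evaluation would replace the toy data with the labelled subsets and would likely use an established library rather than this hand-written version.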

Files: JNIC24_508.pdf (size: 2.645 Mb; format: PDF)



Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International