Scientific output repository of the Universidad de Sevilla

Data Cleansing Meets Feature Selection: A Supervised Machine Learning Approach



Authors: Tallón Ballesteros, Antonio Javier
Riquelme Santos, José Cristóbal
Department: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos
Date: 2015
Published in: Bioinspired Computation in Artificial Systems: International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2015, Elche, Spain, June 1-5, 2015, Proceedings, Part II. Lecture Notes in Computer Science, v. 9108
ISBN/ISSN: 978-3-319-18832-4
Document type: Book chapter
Abstract: This paper presents a novel procedure that applies, in sequence, two data preparation techniques of different natures: data cleansing and feature selection. For the former, we experimented with a partial removal of outliers via the inter-quartile range; for the latter, we selected relevant attributes with two widespread feature subset selectors, CFS (Correlation-based Feature Selection) and CNS (Consistency-based Feature Selection), which are founded on correlation and consistency measures, respectively. Empirical results are outlined on seven difficult binary and multi-class data sets, that is, data sets with a test error rate of at least 10% (measured by accuracy) with the C4.5 or 1-nearest neighbour classifiers without any prior data pre-processing. Non-parametric statistical tests assert that combining the two aforementioned data preparation strategies, using a correlation measure for feature selection with the C4.5 algorithm, is significantly better...
Size: 239.3 KB
Format: PDF
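
The abstract describes a two-stage pipeline: outlier removal via the inter-quartile range, followed by feature selection. The following is a minimal Python sketch of that ordering, under stated assumptions rather than the authors' implementation. The function names, the fence multiplier `k=1.5`, the `threshold=0.5` cut-off and the toy data are all hypothetical. The sketch drops every flagged instance, whereas the paper applies a partial removal whose details are not given in this record. The second stage is a simple Pearson-correlation filter used as a stand-in; it is not CFS, which scores attribute subsets rather than single attributes.

```python
from math import sqrt
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return the fences (Q1 - k*IQR, Q3 + k*IQR) for one attribute."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, labels, k=1.5):
    """Drop every instance with at least one attribute outside its fences.

    Assumption: full removal of flagged instances (the paper's removal is partial).
    """
    bounds = [iqr_bounds(col, k) for col in zip(*rows)]
    kept = [
        (row, label)
        for row, label in zip(rows, labels)
        if all(lo <= v <= hi for v, (lo, hi) in zip(row, bounds))
    ]
    return [r for r, _ in kept], [l for _, l in kept]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def select_features(rows, labels, threshold=0.5):
    """Return indices of attributes whose |correlation| with the class
    exceeds the threshold (a stand-in filter, not CFS)."""
    return [
        i for i, col in enumerate(zip(*rows))
        if abs(pearson(col, labels)) > threshold
    ]


# Hypothetical toy data: two attributes, binary class, one extreme instance.
rows = [
    [1.0, 5.0], [2.0, 3.0], [3.0, 6.0], [4.0, 2.0],
    [5.0, 5.5], [6.0, 2.5], [7.0, 4.0], [100.0, 4.5],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Stage 1: data cleansing. Stage 2: feature selection on the cleaned data.
clean_rows, clean_labels = remove_outliers(rows, labels)
selected = select_features(clean_rows, clean_labels)
```

On this toy data the instance with the value 100.0 falls outside the fences of the first attribute and is removed before any attribute is scored, which is the point of running the cleansing step first. A classifier such as C4.5 or 1-nearest neighbour would then be trained on the selected attributes only.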




This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
