Scientific production repository of the Universidad de Sevilla

Data Cleansing Meets Feature Selection: A Supervised Machine Learning Approach

Open access

Authors: Tallón Ballesteros, Antonio Javier
Riquelme Santos, José Cristóbal
Department: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos
Date: 2015
Published in: Bioinspired Computation in Artificial Systems: International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2015, Elche, Spain, June 1-5, 2015, Proceedings, Part II. Lecture Notes in Computer Science, v. 9108
ISBN/ISSN: 978-3-319-18832-4
Document type: Book chapter
Abstract: This paper presents a novel procedure for applying, sequentially, two data preparation techniques of different natures: data cleansing and feature selection. For the former, we experimented with a partial removal of outliers via the inter-quartile range, whereas for the latter we chose relevant attributes with two widespread feature subset selectors, CFS (Correlation-based Feature Selection) and CNS (Consistency-based Feature Selection), which are founded on correlation and consistency measures, respectively. Empirical results are outlined on seven difficult binary and multi-class data sets, that is, data sets with a test error rate of at least 10% (measured via accuracy) with C4.5 or 1-nearest neighbour classifiers without any kind of prior data pre-processing. Non-parametric statistical tests assert that the meeting of the aforementioned two data preparation strategies using a correlation measure for feature selection with C4.5 algorithm is significantly better...
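The sequence described in the abstract (outlier removal via the inter-quartile range, followed by correlation-based feature subset selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how the "partial" removal works, so this sketch assumes that every row with a value outside the usual 1.5 × IQR fences is dropped. The selector is a simplified greedy search over the CFS merit score using absolute Pearson correlation, which is also an assumption. All function names are hypothetical.

```python
import math
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) fences q1 - k*IQR and q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, labels, k=1.5):
    """Drop each row with at least one feature outside its fences.

    Assumption: the paper's 'partial' removal may keep some flagged rows;
    this sketch removes all of them.
    """
    n_features = len(rows[0])
    bounds = [iqr_bounds([r[j] for r in rows], k) for j in range(n_features)]
    kept = [(r, y) for r, y in zip(rows, labels)
            if all(lo <= r[j] <= hi for j, (lo, hi) in enumerate(bounds))]
    return [r for r, _ in kept], [y for _, y in kept]


def abs_pearson(a, b):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return abs(cov / math.sqrt(va * vb))


def cfs_merit(subset, cols, labels):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    r_cf = mean([abs_pearson(cols[j], labels) for j in subset])
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = mean([abs_pearson(cols[a], cols[b]) for a, b in pairs]) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(rows, labels):
    """Forward selection: add the best feature while the merit improves."""
    cols = list(zip(*rows))
    selected, best = [], 0.0
    remaining = list(range(len(cols)))
    while remaining:
        score, j = max((cfs_merit(selected + [j], cols, labels), j)
                       for j in remaining)
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return sorted(selected)


# Toy data (made up for illustration): the last row is an outlier in feature 1.
rows = [[1, 5], [2, 3], [1, 4], [2, 6], [8, 4], [9, 6], [8, 5], [9, 3], [5, 100]]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 0]
clean_rows, clean_labels = remove_outliers(rows, labels)
selected = greedy_cfs(clean_rows, clean_labels)
```

In this toy example the cleansing step runs first, so the correlations used by the selector are computed only on the rows that survive it. The reduced data would then be passed to a classifier such as C4.5 or 1-nearest neighbour, which is outside the scope of this sketch.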
Size: 239.3Kb
Format: PDF

URI: http://hdl.handle.net/11441/42752

DOI: http://dx.doi.org/10.1007/978-3-319-18833-1_39



This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
