dc.creator | Benítez Peña, Sandra | es |
dc.creator | Blanquero Bravo, Rafael | es |
dc.creator | Carrizosa Priego, Emilio José | es |
dc.creator | Ramírez Cobo, Josefa | es |
dc.date.accessioned | 2018-04-09T08:58:14Z | |
dc.date.available | 2018-04-09T08:58:14Z | |
dc.date.issued | 2018-03 | |
dc.identifier.citation | Benítez Peña, S., Blanquero Bravo, R., Carrizosa Priego, E.J. and Ramírez Cobo, J. (2018). Cost-sensitive feature selection for support vector machines. Computers and Operations Research | |
dc.identifier.issn | 0305-0548 | es |
dc.identifier.issn | 1873-765X | es |
dc.identifier.uri | https://hdl.handle.net/11441/72191 | |
dc.description.abstract | Feature Selection (FS) is a crucial procedure in Data Science tasks such as Classification, since it identifies the relevant variables, thus making classification procedures more interpretable and more effective by reducing noise and overfitting. The relevance of features in a classification procedure is linked to the fact that misclassification costs are frequently asymmetric, since false positive and false negative cases may have very different consequences. However, off-the-shelf FS procedures seldom take such cost-sensitivity of errors into account. In this paper we propose a mathematical-optimization-based FS procedure embedded in one of the most popular classification procedures, namely Support Vector Machines (SVM), accommodating asymmetric misclassification costs. The key idea is to replace the traditional margin maximization by minimizing the number of features selected, while imposing upper bounds on the false positive and false negative rates. The problem is written as an integer linear problem plus a quadratic convex problem for SVM with both linear and radial kernels. The reported numerical experience demonstrates the usefulness of the proposed FS procedure. Indeed, our results on benchmark data sets show that a substantial decrease in the number of features is obtained, whilst the desired trade-off between false positive and false negative rates is achieved. | es |
dc.format | application/pdf | es |
dc.language.iso | eng | es |
dc.publisher | Elsevier | es |
dc.relation.ispartof | Computers and Operations Research | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Classification | es |
dc.subject | Data science | es |
dc.subject | Support vector machines | es |
dc.subject | Feature selection | es |
dc.subject | Integer programming | es |
dc.subject | Sparsity | es |
dc.title | Cost-sensitive feature selection for support vector machines | es |
dc.type | info:eu-repo/semantics/article | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/submittedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Estadística e Investigación Operativa | es |
dc.relation.publisherversion | https://ac.els-cdn.com/S0305054818300741/1-s2.0-S0305054818300741-main.pdf?_tid=93af2337-b467-49cd-ba99-2e7e90a03885&acdnat=1523263712_8863726be4ae6466dd0596c5f9d3043b | es |
dc.identifier.doi | 10.1016/j.cor.2018.03.005 | es |
dc.contributor.group | Universidad de Sevilla. FQM329: Optimización | es |
idus.format.extent | 25 p. | es |
dc.journaltitle | Computers and Operations Research | es |