dc.creator | Carrizosa Priego, Emilio José | es |
dc.creator | Olivares Nadal, Alba Victoria | es |
dc.creator | Ramírez Cobo, Josefa | es |
dc.date.accessioned | 2021-02-01T12:31:23Z | |
dc.date.available | 2021-02-01T12:31:23Z | |
dc.date.issued | 2020-04-01 | |
dc.identifier.citation | Carrizosa Priego, E.J., Olivares Nadal, A.V. and Ramírez Cobo, J. (2020). Integer constraints for enhancing interpretability in linear regression. SORT. Statistics and Operations Research Transactions, 44 (1), 67-98. | |
dc.identifier.issn | 2013-8830 | es |
dc.identifier.uri | https://hdl.handle.net/11441/104390 | |
dc.description.abstract | One of the main challenges researchers face is to identify the most relevant features in a prediction model. As a consequence, many regularized methods seeking sparsity have flourished. Although sparse, their solutions may not be interpretable in the presence of spurious coefficients and correlated features. In this paper we aim to enhance interpretability in linear regression in the presence of multicollinearity by: (i) forcing the sign of the estimated coefficients to be consistent with the sign of the correlations between predictors, and (ii) avoiding spurious coefficients so that only significant features are represented in the model. This is addressed by modelling constraints and adding them to an optimization problem expressing some estimation procedure, such as ordinary least squares or the lasso. The resulting constrained regression models become Mixed Integer Quadratic Problems. The numerical experiments carried out on real and simulated datasets show that tightening the search space of some standard linear regression models by adding the constraints modelling (i) and/or (ii) helps to improve the sparsity and interpretability of the solutions with competitive predictive quality. | es |
dc.format | application/pdf | es |
dc.format.extent | 28 p. | es |
dc.language.iso | eng | es |
dc.publisher | Institut d'Estadística de Catalunya (Idescat) | es |
dc.relation.ispartof | SORT. Statistics and Operations Research Transactions, 44 (1), 67-98. | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Linear regression | es |
dc.subject | Multicollinearity | es |
dc.subject | Sparsity | es |
dc.subject | Cardinality constraint | es |
dc.subject | Mixed Integer Non Linear Programming | es |
dc.title | Integer constraints for enhancing interpretability in linear regression | es |
dc.type | info:eu-repo/semantics/article | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/publishedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Estadística e Investigación Operativa | es |
dc.relation.publisherversion | https://doi.org/10.2436/20.8080.02.95 | es |
dc.identifier.doi | 10.2436/20.8080.02.95 | es |
dc.journaltitle | SORT. Statistics and Operations Research Transactions | es |
dc.publication.volumen | 44 | es |
dc.publication.issue | 1 | es |
dc.publication.initialPage | 67 | es |
dc.publication.endPage | 98 | es |