Article

dc.creator | Carrizosa Priego, Emilio José | es
dc.creator | Hvas Mortensen, Laust | es
dc.creator | Romero Morales, María Dolores | es
dc.creator | Sillero Denamiel, María Remedios | es
dc.date.accessioned | 2022-06-30T07:08:46Z
dc.date.available | 2022-06-30T07:08:46Z
dc.date.issued | 2022-05-04
dc.identifier.citation | Carrizosa Priego, E.J., Hvas Mortensen, L., Romero Morales, M.D. and Sillero Denamiel, M.R. (2022). The tree based linear regression model for hierarchical categorical variables. Expert Systems with Applications, 203, 117423-1-117423-13.
dc.identifier.issn | 0957-4174 | es
dc.identifier.uri | https://hdl.handle.net/11441/134809
dc.description.abstract | Many real-life applications consider nominal categorical predictor variables that have a hierarchical structure, e.g. economic activity data in Official Statistics. In this paper, we focus on linear regression models built in the presence of this type of nominal categorical predictor variables, and study the consolidation of their categories to have a better tradeoff between interpretability and fit of the model to the data. We propose the so-called Tree based Linear Regression (TLR) model that optimizes both the accuracy of the reduced linear regression model and its complexity, measured as a cost function of the level of granularity of the representation of the hierarchical categorical variables. We show that finding non-dominated outcomes for this problem boils down to solving Mixed Integer Convex Quadratic Problems with Linear Constraints, and small to medium size instances can be tackled using off-the-shelf solvers. We illustrate our approach in two real-world datasets, as well as a synthetic one, where our methodology finds a much less complex model with a very mild worsening of the accuracy. | es
dc.format | application/pdf | es
dc.format.extent | 13 p. | es
dc.language.iso | eng | es
dc.publisher | Elsevier | es
dc.relation.ispartof | Expert Systems with Applications, 203, 117423-1-117423-13.
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | *
dc.subject | Hierarchical categorical variables | es
dc.subject | Linear regression models | es
dc.subject | Accuracy vs. model complexity | es
dc.subject | Mixed integer convex quadratic problem with linear constraints | es
dc.title | The tree based linear regression model for hierarchical categorical variables | es
dc.type | info:eu-repo/semantics/article | es
dc.type.version | info:eu-repo/semantics/publishedVersion | es
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Estadística e Investigación Operativa | es
dc.relation.publisherversion | doi.org/10.1016/j.eswa.2022.117423 | es
dc.identifier.doi | 10.1016/j.eswa.2022.117423 | es
dc.contributor.group | Universidad de Sevilla. FQM329: Optimizacion | es
dc.journaltitle | Expert Systems with Applications | es
dc.publication.volumen | 203 | es
dc.publication.initialPage | 117423-1 | es
dc.publication.endPage | 117423-13 | es

Files | Size | Format | View | Description
The tree based linear regression ... | 2.658Mb | PDF | View/Open |

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International