Article

dc.creator: Hernández Salmerón, Inmaculada Concepción [es]
dc.creator: Rivero, Carlos R. [es]
dc.creator: Ruiz Cortés, David [es]
dc.creator: Corchuelo Gil, Rafael [es]
dc.date.accessioned: 2017-11-22T11:28:27Z
dc.date.available: 2017-11-22T11:28:27Z
dc.date.issued: 2014
dc.identifier.citation: Hernández Salmerón, I.C., Rivero, C.R., Ruiz Cortés, D. and Corchuelo Gil, R. (2014). CALA: An unsupervised URL-based web page classification system. Knowledge-Based Systems, 57 (February 2014), 168-180.
dc.identifier.issn: 0950-7051 [es]
dc.identifier.uri: http://hdl.handle.net/11441/66444
dc.description.abstract: Unsupervised web page classification refers to the problem of clustering the pages in a web site so that each cluster includes a set of web pages that can be classified using a unique class. The existing proposals to perform web page classification do not fulfill a number of requirements that would make them suitable for enterprise web information integration, namely: to be based on a lightweight crawling, so as to avoid interfering with the normal operation of the web site, to be unsupervised, which avoids the need for a training set of pre-classified pages, or to use features from outside the page to be classified, which avoids having to download it. In this article, we propose CALA, a new automated proposal to generate URL-based web page classifiers. Our proposal builds a number of URL patterns that represent the different classes of pages in a web site, so further pages can be classified by matching their URLs to the patterns. Its salient features are that it fulfills all of the previous requirements, and it has been validated by a number of experiments using real-world, top-visited web sites. Our validation proves that CALA is very effective and efficient in practice. [es]
dc.description.sponsorship: Ministerio de Educación y Ciencia TIN2007-64119 [es]
dc.description.sponsorship: Junta de Andalucía P07-TIC-2602 [es]
dc.description.sponsorship: Junta de Andalucía P08-TIC-4100 [es]
dc.description.sponsorship: Ministerio de Ciencia e Innovación TIN2008-04718-E [es]
dc.description.sponsorship: Ministerio de Ciencia e Innovación TIN2010-21744 [es]
dc.description.sponsorship: Ministerio de Ciencia e Innovación TIN2010-09809-E [es]
dc.description.sponsorship: Ministerio de Ciencia e Innovación TIN2010-10811-E [es]
dc.description.sponsorship: Ministerio de Ciencia e Innovación TIN2010-09988-E [es]
dc.description.sponsorship: Ministerio de Economía y Competitividad TIN2011-15497-E [es]
dc.format: application/pdf [es]
dc.language.iso: eng [es]
dc.publisher: Elsevier [es]
dc.relation.ispartof: Knowledge-Based Systems, 57 (February 2014), 168-180.
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Web Page Classification [es]
dc.subject: URL Classification [es]
dc.subject: URL Patterns [es]
dc.subject: Enterprise web information integration [es]
dc.subject: Web Page Clustering [es]
dc.title: CALA: An unsupervised URL-based web page classification system [es]
dc.type: info:eu-repo/semantics/article [es]
dcterms.identifier: https://ror.org/03yxnpp24
dc.type.version: info:eu-repo/semantics/submittedVersion [es]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es]
dc.contributor.affiliation: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos [es]
dc.relation.projectID: TIN2007-64119 [es]
dc.relation.projectID: P07-TIC-2602 [es]
dc.relation.projectID: P08-TIC-4100 [es]
dc.relation.projectID: TIN2008-04718-E [es]
dc.relation.projectID: TIN2010-21744 [es]
dc.relation.projectID: TIN2010-09809-E [es]
dc.relation.projectID: TIN2010-10811-E [es]
dc.relation.projectID: TIN2010-09988-E [es]
dc.relation.projectID: TIN2011-15497-E [es]
dc.relation.publisherversion: http://www.sciencedirect.com/science/article/pii/S0950705113003997 [es]
dc.identifier.doi: 10.1016/j.knosys.2013.12.019 [es]
dc.contributor.group: Universidad de Sevilla. TIC134: Sistemas Informáticos [es]
idus.format.extent: 13 [es]
dc.journaltitle: Knowledge-Based Systems [es]
dc.publication.volumen: 57 [es]
dc.publication.issue: February 2014 [es]
dc.publication.initialPage: 168 [es]
dc.publication.endPage: 180 [es]
dc.identifier.sisius: 20649208 [es]
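
The abstract in this record states that pages are classified by matching their URLs to a set of URL patterns. The sketch below illustrates only that matching step, assuming the patterns are already available. It is not CALA's pattern-building procedure, which this record does not describe. The site www.example.com, the class labels, the regular expressions, and the names URL_PATTERNS and classify_url are all hypothetical and chosen for illustration.

```python
import re
from typing import Optional

# Hypothetical, hand-written patterns for an imaginary site. According to the
# abstract, CALA generates its patterns automatically; these are stand-ins.
URL_PATTERNS = {
    "product": re.compile(r"^https?://www\.example\.com/products/\d+$"),
    "category": re.compile(r"^https?://www\.example\.com/category/[a-z-]+$"),
    "search": re.compile(r"^https?://www\.example\.com/search\?q=.+$"),
}


def classify_url(url: str) -> Optional[str]:
    """Return the label of the first pattern that matches the URL, else None."""
    for label, pattern in URL_PATTERNS.items():
        if pattern.match(url):
            return label
    return None


if __name__ == "__main__":
    for url in (
        "http://www.example.com/products/42",
        "https://www.example.com/category/home-garden",
        "https://www.example.com/about",
    ):
        print(url, "->", classify_url(url))
```

Because only the URL string is inspected, a page can be assigned a class without downloading it, which is the property the abstract highlights.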

File: Cala.pdf
Size: 1.873 MB
Format: PDF

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International