dc.creator | Hernández Salmerón, Inmaculada Concepción | es |
dc.creator | Rivero, Carlos R. | es |
dc.creator | Ruiz Cortés, David | es |
dc.creator | Corchuelo Gil, Rafael | es |
dc.date.accessioned | 2017-11-10T10:18:21Z | |
dc.date.available | 2017-11-10T10:18:21Z | |
dc.date.issued | 2012 | |
dc.identifier.citation | Hernández Salmerón, I.C., Rivero, C.R., Ruiz Cortés, D. and Corchuelo Gil, R. (2012). A Statistical Approach to URL-Based Web Page Clustering. In WWW 2012: 21st International Conference on World Wide Web (525-526), Lyon, France: ACM. | |
dc.identifier.isbn | 978-1-4503-1230-1 | es |
dc.identifier.uri | http://hdl.handle.net/11441/65918 | |
dc.description.abstract | Most web page classifiers use features from the page content, which means that a page has to be downloaded before it can be classified. We propose a technique to cluster web pages by means of their URLs exclusively. In contrast to other proposals, we analyse features that are outside the page; hence, we do not need to download a page to classify it. The technique is also unsupervised, requiring little intervention from the user. Furthermore, we do not need to crawl a site extensively to build a classifier for that site; a small subset of its pages suffices. We have performed an experiment over 21 highly visited websites to evaluate the performance of our classifier, obtaining good precision and recall results. | es |
dc.description.sponsorship | Junta de Andalucía P08-TIC-4100 | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación TIN2010-21744 | es |
dc.format | application/pdf | es |
dc.language.iso | eng | es |
dc.publisher | ACM | es |
dc.relation.ispartof | WWW 2012: 21st International Conference on World Wide Web (2012), pp. 525-526 | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | URL Classification | es |
dc.subject | URL Patterns | es |
dc.subject | Web Page Clustering | es |
dc.title | A Statistical Approach to URL-Based Web Page Clustering | es |
dc.type | info:eu-repo/semantics/conferenceObject | es |
dc.type.version | info:eu-repo/semantics/submittedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos | es |
dc.relation.projectID | P08-TIC-4100 | es |
dc.relation.projectID | TIN2010-21744 | es |
dc.relation.publisherversion | https://dl.acm.org/citation.cfm?id=2188109 | es |
dc.identifier.doi | 10.1145/2187980.2188109 | es |
dc.contributor.group | Universidad de Sevilla. TIC134: Sistemas Informáticos | es |
idus.format.extent | 2 | es |
dc.publication.initialPage | 525 | es |
dc.publication.endPage | 526 | es |
dc.eventtitle | WWW 2012: 21st International Conference on World Wide Web | es |
dc.eventinstitution | Lyon, France | es |
dc.relation.publicationplace | New York, USA | es |
dc.contributor.funder | Junta de Andalucía | |
dc.contributor.funder | Ministerio de Ciencia e Innovación (MICIN). España | |