Scientific Production Repository of the Universidad de Sevilla

An Experiment to Test URL Features for Web Page Classification

 

 
Open Access
Author: Hernández Salmerón, Inmaculada Concepción
Rivero, Carlos R.
Ruiz Cortés, David
Arjona Fernández, José Luis
Department: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos
Date: 2012
Published in: PAAMS 2012: 10th International Conference on Practical Applications of Agents and Multi-Agent Systems (2012), pp. 109-116
ISBN/ISSN: 978-3-642-28794-7
1865-1348
Document type: Presentation
Abstract: Web page classification has been extensively researched, using different types of features that are extracted from the page content, the page structure, or other pages that link to that page. Using features from the page itself means downloading it before it can be classified. We present an experiment to prove that URL tokens contain enough information to extract features for classifying web pages. A classifier based on these features can classify a web page without first downloading it, avoiding unnecessary downloads.
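As a rough illustration of the idea in the abstract, the sketch below splits a URL into tokens and counts them, producing the kind of input a classifier could use without fetching the page. The tokenisation rule and the function names `url_tokens` and `token_counts` are assumptions made for illustration; this record does not describe the feature set actually used in the paper.

```python
import re
from collections import Counter
from urllib.parse import urlsplit


def url_tokens(url: str) -> list[str]:
    """Split a URL into lowercase tokens from its host, path and query.

    Hypothetical helper: the splitting rule is an assumption, not the
    paper's definition of a URL token.
    """
    parts = urlsplit(url)
    # Join the components that usually carry meaning; the scheme is ignored.
    text = " ".join([parts.netloc, parts.path, parts.query])
    # Any run of non-alphanumeric characters acts as a separator.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def token_counts(url: str) -> Counter:
    """Bag-of-tokens representation of a URL (illustrative feature vector)."""
    return Counter(url_tokens(url))
```

For example, `url_tokens("http://example.org/products/item?id=42")` yields the tokens `example`, `org`, `products`, `item`, `id` and `42`, all obtained from the URL string alone.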
Cite: Hernández Salmerón, I.C., Rivero, C.R., Ruiz Cortés, D. and Arjona Fernández, J.L. (2012). An Experiment to Test URL Features for Web Page Classification. In PAAMS 2012: 10th International Conference on Practical Applications of Agents and Multi-Agent Systems (109-116), Salamanca, Spain: Springer.
Size: 320.8Kb
Format: PDF

URI: http://hdl.handle.net/11441/66174

DOI: 10.1007/978-3-642-28795-4_13

See publisher's version

This work is under a Creative Commons License: 
Attribution-NonCommercial-NoDerivatives 4.0 International
