Presentation
An Experiment to Test URL Features for Web Page Classification
Author/s | Hernández Salmerón, Inmaculada Concepción; Rivero, Carlos R.; Ruiz Cortés, David; Arjona Fernández, José Luis |
Department | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos |
Publication Date | 2012 |
Deposit Date | 2017-11-17 |
ISBN/ISSN | 978-3-642-28794-7 (ISBN); 1865-1348 (ISSN) |
Abstract | Web page classification has been extensively researched using different types of features, extracted from the page content, the page structure, or other pages that link to the page. Using features from the page itself means the page must be downloaded before it can be classified. We present an experiment to prove that URL tokens contain enough information to extract features for classifying web pages. A classifier based on these features can classify a web page without downloading it first, avoiding unnecessary downloads. |
Project ID. | TIN2007-64119; P07-TIC-2602; P08-TIC-4100; TIN2008-04718-E; TIN2010-21744; TIN2010-09809-E; TIN2010-10811-E; TIN2010-09988-E |
Citation | Hernández Salmerón, I.C., Rivero, C.R., Ruiz Cortés, D. and Arjona Fernández, J.L. (2012). An Experiment to Test URL Features for Web Page Classification. In PAAMS 2012: 10th International Conference on Practical Applications of Agents and Multi-Agent Systems (109-116), Salamanca, Spain: Springer. |
Files | Size | Format | View | Description |
---|---|---|---|---|
An Experiment to Test.pdf | 320.8Kb | PDF | | |