Conference paper
Mining Web Pages Using Features of Rendering HTML Elements in the Web Browser
Author(s) | Fernández, F. J.; Álvarez, José L.; Abad, Pedro J.; Jiménez Aguirre, Patricia |
Department | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos |
Publication date | 2011 |
Deposit date | 2022-04-08 |
Published in | |
ISBN/ISSN | 978-3-642-19930-1; 1867-5662 |
Abstract | The Web is the largest repository of useful information available to human users, but Web pages usually do not provide an API for accessing their information automatically. In order to solve this problem, Information Extractors are developed. We present a new methodology to induce Information Extractors from the Web. It is based on rendering HTML elements in the Web browser. The methodology uses a KDD process to mine a dataset of features of the elements in the Web page. An experiment over 10 web sites was carried out, and the results show the effectiveness of the methodology. |
Funding agencies | Ministerio de Ciencia y Tecnología (MCYT), España; Junta de Andalucía |
Project identifier | TIN2007-64119; P07-TIC-02602; P08-TIC-4100 |
Citation | Fernández, F.J., Álvarez, J.L., Abad, P.J. and Jiménez Aguirre, P. (2011). Mining Web Pages Using Features of Rendering HTML Elements in the Web Browser. In PAAMS 2011: 9th International Conference on Practical Applications of Agents and Multi-Agent Systems (161-168), Salamanca, Spain: Springer. |
Files | Size | Format | View | Description |
---|---|---|---|---|
Fernández2011_Chapter_MiningWe ... | 212.3Kb | [PDF] | View | |