Presentation
Mining Web Pages Using Features of Rendering HTML Elements in the Web Browser
Author/s | Fernández, F. J.; Álvarez, José L.; Abad, Pedro J.; Jiménez Aguirre, Patricia |
Department | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos |
Publication Date | 2011 |
Deposit Date | 2022-04-08 |
Published in |
|
ISBN/ISSN | 978-3-642-19930-1 1867-5662 |
Abstract | The Web is the largest repository of useful information available to human users, but Web pages usually do not provide an API for accessing their information automatically. Information Extractors are developed to solve this problem. We present a new methodology to induce Information Extractors from the Web. It is based on rendering HTML elements in the Web browser. The methodology uses a KDD process to mine a dataset of features of the elements in the Web page. An experiment over 10 web sites was carried out, and the results show the effectiveness of the methodology. |
Funding agencies | Ministerio de Ciencia y Tecnología (MCYT), España; Junta de Andalucía |
Project ID. | TIN2007-64119; P07-TIC-02602; P08-TIC-4100 |
Citation | Fernández, F.J., Álvarez, J.L., Abad, P.J. and Jiménez Aguirre, P. (2011). Mining Web Pages Using Features of Rendering HTML Elements in the Web Browser. In PAAMS 2011: 9th International Conference on Practical Applications of Agents and Multi-Agent Systems (161-168), Salamanca, Spain: Springer. |
Files | Size | Format | View | Description |
---|---|---|---|---|
Fernández2011_Chapter_MiningWe ... | 212.3Kb | | | |