idUS - Search

Showing items 1-4 of 4

Article

TOMATE: A heuristic-based approach to extract data from HTML tables

Roldán Salvador, Juan Carlos; Jiménez Aguirre, Patricia; Szekely, Pedro; Corchuelo Gil, Rafael (Elsevier, 2021)

Extracting data from user-friendly HTML tables is difficult because of their different layouts, formats, and encoding problems. In this article, we present a new proposal that first applies several pre-processing heuristics ...

Article

A clustering approach to extract data from HTML tables

Jiménez Aguirre, Patricia; Roldán Salvador, Juan Carlos; Corchuelo Gil, Rafael (Elsevier, 2021)

HTML tables have become pervasive on the Web. Extracting their data automatically is difficult because finding the relationships between their cells is not trivial due to the many different layouts, encodings, and formats ...

Article

A coral-reef approach to extract information from HTML tables

Jiménez Aguirre, Patricia; Roldán Salvador, Juan Carlos; Corchuelo Gil, Rafael (Elsevier, 2022)

This article presents Coraline, which is a new table-understanding proposal. Its novelty lies in a coral-reef optimisation algorithm that addresses the problem of feature selection in synchrony with a clustering technique ...

Article

A hybrid quantum approach to leveraging data from HTML tables

Jiménez Aguirre, Patricia; Roldán Salvador, Juan Carlos; Corchuelo Gil, Rafael (Springer, 2022)

The Web provides many data that are encoded using HTML tables. This facilitates rendering them, but obfuscates their structure and makes it difficult for automated business processes to leverage them. This has motivated ...
