Showing items 1-4 of 4
Article
TOMATE: A heuristic-based approach to extract data from HTML tables
(Elsevier, 2021)
Extracting data from user-friendly HTML tables is difficult because of their different layouts, formats, and encoding problems. In this article, we present a new proposal that first applies several pre-processing heuristics ...
Article
A clustering approach to extract data from HTML tables
(Elsevier, 2021)
HTML tables have become pervasive on the Web. Extracting their data automatically is difficult because finding the relationships between their cells is not trivial due to the many different layouts, encodings, and formats ...
Article
On exploring data lakes by finding compact, isolated clusters
(Elsevier, 2022)
Data engineers are very interested in data lake technologies due to the incredible abundance of datasets. They typically use clustering to understand the structure of the datasets before applying other methods to infer ...
Article
A hybrid quantum approach to leveraging data from HTML tables
(Springer, 2022)
The Web provides many data that are encoded using HTML tables. This facilitates rendering them, but obfuscates their structure and makes it difficult for automated business processes to leverage them. This has motivated ...