Showing items 1-10 of 12
Article
TOMATE: A heuristic-based approach to extract data from HTML tables
(Elsevier, 2021)
Extracting data from user-friendly HTML tables is difficult because of their different layouts, formats, and encoding problems. In this article, we present a new proposal that first applies several pre-processing heuristics ...
Article
On learning context-aware rules to link RDF datasets
(Oxford University Press, 2020-09-15)
Integrating RDF datasets has become a relevant problem for both researchers and practitioners. In the literature, there are many genetic proposals that learn rules for linking the resources that refer to the same ...
Article
A clustering approach to extract data from HTML tables
(Elsevier, 2021)
HTML tables have become pervasive on the Web. Extracting their data automatically is difficult because finding the relationships between their cells is not trivial due to the many different layouts, encodings, and formats ...
Article
On Extracting Data from Tables that are Encoded using HTML
(Elsevier, 2020)
Tables are a common means to display data in human-friendly formats. Many authors have worked on proposals to extract those data back since this has many interesting applications. In this article, we summarise and compare ...
Article
A deep-learning approach to mining conditions
(ScienceDirect, 2020-04)
A condition is a constraint that determines when a consequent holds. Mining them in text is paramount to understand many sentences properly. In the literature, there are a few pattern-based proposals that fall short regarding ...
Article
An encoder–decoder approach to mine conditions for engineering textual data
(ScienceDirect, 2020-05)
Data engineering seeks to support artificial intelligence processes that extract knowledge from raw data. Many such data are rendered in natural language from which entity-relation extractors extract facts and opinion ...
Article
Torii: An aspect-based sentiment analysis system that can mine conditions
(John Wiley and Sons, 2020-01)
Aspect-based sentiment analysis systems are a kind of text-mining system that specializes in summarizing the sentiment that a collection of reviews conveys regarding some aspects of an item. There are many cases in which ...
Article
On the synthesis of metadata tags for HTML files
(Wiley, 2020)
RDFa, JSON-LD, Microdata, and Microformats make it possible to endow the data in HTML files with metadata tags that help software agents understand them. Unfortunately, there are many HTML files that do not have any metadata tags, which ...
Article
On validating web information extraction proposals
(Elsevier, 2022)
Many people who have to make informed decisions in today’s always-on culture use information extractors to feed their systems with information that comes from human-friendly documents. Unfortunately, many proposals that ...
Article
On exploring data lakes by finding compact, isolated clusters
(Elsevier, 2022)
Data engineers are very interested in data lake technologies due to the incredible abundance of datasets. They typically use clustering to understand the structure of the datasets before applying other methods to infer ...