idUS - Search

Presentation

On Mining Conditions using Encoder-decoder Networks

Ortega Gallego, Fernando; Corchuelo Gil, Rafael (SciTePress, 2019)

A condition is a constraint that determines when something holds. Mining them is paramount to understanding many sentences properly. There are a few pattern-based approaches that fall short because the patterns must be ...

Article

TOMATE: A heuristic-based approach to extract data from HTML tables

Roldán Salvador, Juan Carlos; Jiménez Aguirre, Patricia; Szekely, Pedro; Corchuelo Gil, Rafael (Elsevier, 2021)

Extracting data from user-friendly HTML tables is difficult because of their different layouts, formats, and encoding problems. In this article, we present a new proposal that first applies several pre-processing heuristics ...

Presentation

A Novel Approach to Web Information Extraction

Reina Quintero, Antonia María; Jiménez Aguirre, Patricia; Corchuelo Gil, Rafael (Springer, 2015)

Business Intelligence requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers. The Web is the largest source of information nowadays. ...

Presentation

An Unsupervised Technique to Extract Information from Semi-structured Web Pages

Sleiman, Hassan A.; Corchuelo Gil, Rafael (Springer, 2012-11)

We propose a technique that takes two or more web pages generated by the same server-side template and tries to learn a regular expression that represents it and helps extract relevant information from similar pages. Our ...

Article

ARIEX: Automated ranking of information extractors

Jiménez Aguirre, Patricia; Corchuelo Gil, Rafael; Sleiman, Hassan A. (Elsevier, 2016)

Information extractors are used to transform the user-friendly information in a web document into structured information that can be used to feed a knowledge-based system. Researchers are interested in ranking them to ...

Presentation

On Feeding Business Systems with Linked Resources from the Web of Data

Cimmino Arriaga, Andrea Jesús; Corchuelo Gil, Rafael (Springer, 2018-07)

Business systems that are fed with data from the Web of Data require transparent interoperability. The Linked Data principles establish that different resources that represent the same real-world entities must be linked ...

Article

On learning context-aware rules to link RDF datasets

Cimmino Arriaga, Andrea Jesús; Corchuelo Gil, Rafael (Oxford University Press, 2020-09-15)

Integrating RDF datasets has become a relevant problem for both researchers and practitioners. In the literature, there are many genetic proposals that learn rules that allow to link the resources that refer to the same ...

Article

A clustering approach to extract data from HTML tables

Jiménez Aguirre, Patricia; Roldán Salvador, Juan Carlos; Corchuelo Gil, Rafael (Elsevier, 2021)

HTML tables have become pervasive on the Web. Extracting their data automatically is difficult because finding the relationships between their cells is not trivial due to the many different layouts, encodings, and formats ...

Article

On Extracting Data from Tables that are Encoded using HTML

Roldán Salvador, Juan Carlos; Jiménez Aguirre, Patricia; Corchuelo Gil, Rafael (Elsevier, 2020)

Tables are a common means to display data in human-friendly formats. Many authors have worked on proposals to extract those data back since this has many interesting applications. In this article, we summarise and compare ...

Presentation

A Novel Approach to Web Information Extraction

Reina Quintero, Antonia María; Jiménez Aguirre, Patricia; Corchuelo Gil, Rafael (Springer International Publishing AG, 2015-06)

Business Intelligence requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers. The Web is the largest source of information nowadays. ...
