dc.creator | Reina Quintero, Antonia María | es |
dc.creator | Jiménez Aguirre, Patricia | es |
dc.creator | Corchuelo Gil, Rafael | es |
dc.date.accessioned | 2023-03-21T13:12:27Z | |
dc.date.available | 2023-03-21T13:12:27Z | |
dc.date.issued | 2015-06 | |
dc.identifier.citation | Reina Quintero, A.M., Jiménez Aguirre, P. and Corchuelo Gil, R. (2015). A Novel Approach to Web Information Extraction. In 18th International Conference: Business Information Systems (BIS 2015) (152-161), Poznań, Poland: Springer International Publishing AG. | |
dc.identifier.isbn | 978-3-319-19026-6 (print) | es |
dc.identifier.isbn | 978-3-319-19027-3 (online) | es |
dc.identifier.issn | 1865-1348 (print) | es |
dc.identifier.issn | 1865-1356 (online) | es |
dc.identifier.uri | https://hdl.handle.net/11441/143500 | |
dc.description.abstract | Business Intelligence requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers. The Web is the largest source of information nowadays. Unfortunately, the information it provides is available in semi-structured, human-friendly formats, which makes it difficult for automated business processes to handle. Classical propositional and ILP machine-learning techniques have been applied for this purpose. However, the former lack sufficient expressive power, whereas the latter are more expressive but intractable with large datasets. Propositionalisation was devised as a means to provide propositional techniques with more expressive power, enabling them to exploit structural information in a propositional way that keeps them efficient. In this paper, we present a proposal to extract information from semi-structured web documents that uses this approach. It leverages a classical propositional machine-learning technique and enhances it with the ability to learn from an unbounded context, which helps increase its precision and recall. Our experiments prove that our proposal outperforms other state-of-the-art techniques in the literature. | es |
dc.description.sponsorship | Ministerio de Ciencia y Tecnología TIN2007-64119 | es |
dc.description.sponsorship | Junta de Andalucía P07-TIC-2602 | es |
dc.description.sponsorship | Junta de Andalucía P08-TIC-4100 | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación TIN2008-04718-E | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación TIN2010-21744 | es |
dc.description.sponsorship | Ministerio de Economía, Industria y Competitividad TIN2010-09809-E | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación TIN2010-10811-E | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación TIN2010-09988-E | es |
dc.description.sponsorship | Ministerio de Economía y Competitividad TIN2011-15497-E | es |
dc.description.sponsorship | Ministerio de Economía y Competitividad TIN2013-40848-R | es |
dc.format | application/pdf | es |
dc.format.extent | 10 | es |
dc.language.iso | eng | es |
dc.publisher | Springer International Publishing AG | es |
dc.relation.ispartof | 18th International Conference: Business Information Systems (BIS 2015) (2015), pp. 152-161. | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.title | A Novel Approach to Web Information Extraction | es |
dc.type | info:eu-repo/semantics/conferenceObject | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/publishedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos | es |
dc.relation.projectID | TIN2007-64119 | es |
dc.relation.projectID | P07-TIC-2602 | es |
dc.relation.projectID | P08-TIC-4100 | es |
dc.relation.projectID | TIN2008-04718-E | es |
dc.relation.projectID | TIN2010-21744 | es |
dc.relation.projectID | TIN2010-09809-E | es |
dc.relation.projectID | TIN2010-10811-E | es |
dc.relation.projectID | TIN2010-09988-E | es |
dc.relation.projectID | TIN2011-15497-E | es |
dc.relation.projectID | TIN2013-40848-R | es |
dc.relation.publisherversion | https://link.springer.com/chapter/10.1007/978-3-319-19027-3_13 | es |
dc.identifier.doi | 10.1007/978-3-319-19027-3_13 | es |
dc.publication.initialPage | 152 | es |
dc.publication.endPage | 161 | es |
dc.eventtitle | 18th International Conference: Business Information Systems (BIS 2015) | es |
dc.eventinstitution | Poznań, Poland | es |
dc.relation.publicationplace | Switzerland | es |
dc.contributor.funder | Ministerio de Ciencia y Tecnología (MCYT). España | es |
dc.contributor.funder | Junta de Andalucía | es |
dc.contributor.funder | Ministerio de Ciencia e Innovación (MICIN). España | es |
dc.contributor.funder | Ministerio de Economía, Industria y Competitividad | es |
dc.contributor.funder | Ministerio de Economía y Competitividad (MINECO). España | es |