Article

dc.creator: Ayala Hernández, Daniel (es)
dc.creator: Hernández Salmerón, Inmaculada Concepción (es)
dc.creator: Ruiz Cortés, David (es)
dc.creator: Rahm, Erhard (es)
dc.date.accessioned: 2022-04-29T08:49:23Z
dc.date.available: 2022-04-29T08:49:23Z
dc.date.issued: 2022
dc.identifier.citation: Ayala Hernández, D., Hernández Salmerón, I.C., Ruiz Cortés, D. and Rahm, E. (2022). Multi-source dataset of e-commerce products with attributes for property matching. Data in Brief, 41 (April 2022, art. no. 107884)
dc.identifier.issn: 2352-3409 (es)
dc.identifier.uri: https://hdl.handle.net/11441/132888
dc.description.abstract: Schema/ontology matching consists in finding matches between types, properties and entities in heterogeneous data sources in order to integrate them, a task that has become increasingly relevant with the development of web technologies and open data initiatives. One of the tasks involved is the matching of data properties, which attempts to find correspondences between the attributes of entities. This is challenging because equivalent properties sometimes have different names. Furthermore, some properties may not be equivalent but still match in 1..n relationships. These difficulties create the need for varied evaluation datasets for two reasons. First, they are needed to evaluate existing techniques in a variety of scenarios. Second, they enable the training of supervised techniques that may even become context-independent if trained with data from sufficiently diverse contexts. To support the evaluation and training of data property matching techniques, we present a collection of datasets consisting of product records from four different contexts. These datasets are the result of transforming two existing datasets. In one of them, some properties were filtered out for being too noisy. The resulting processed dataset consists of JSON files with a listing of the product records and their properties, and a separate grouping of the properties that determines which ones match. It contains information about 2860 entities, with 4386 properties and 13350 pairwise matches. (es)
dc.description.sponsorship: Ministerio de Ciencia, Innovación y Universidades PID2019–105471RB-I00 (es)
dc.description.sponsorship: Junta de Andalucía P18-RT-1060 (es)
dc.description.sponsorship: Junta de Andalucía US-1380565 (es)
dc.format: application/pdf (es)
dc.format.extent: 6 (es)
dc.language.iso: eng (es)
dc.publisher: Elsevier (es)
dc.relation.ispartof: Data in Brief, 41 (April 2022, art. no. 107884)
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (*)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ (*)
dc.subject: Property matching (es)
dc.subject: Data integration (es)
dc.subject: Ontology (es)
dc.subject: Data engineering (es)
dc.title: Multi-source dataset of e-commerce products with attributes for property matching (es)
dc.type: info:eu-repo/semantics/article (es)
dcterms.identifier: https://ror.org/03yxnpp24
dc.type.version: info:eu-repo/semantics/publishedVersion (es)
dc.rights.accessRights: info:eu-repo/semantics/openAccess (es)
dc.contributor.affiliation: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos (es)
dc.relation.projectID: PID2019–105471RB-I00 (es)
dc.relation.projectID: P18-RT-1060 (es)
dc.relation.projectID: US-1380565 (es)
dc.relation.publisherversion: https://www.sciencedirect.com/science/article/pii/S2352340922000968?via%3Dihub (es)
dc.identifier.doi: 10.1016/j.dib.2022.107884 (es)
dc.contributor.group: Universidad de Sevilla. TIC134: Sistemas Informáticos (es)
dc.journaltitle: Data in Brief (es)
dc.publication.volumen: 41 (es)
dc.publication.issue: April 2022, art. no. 107884 (es)
dc.contributor.funder: Ministerio de Ciencia, Innovación y Universidades (MICINN). España (es)
dc.contributor.funder: Junta de Andalucía (es)

Files
1-s2.0-S2352340922000968-main.pdf (549.1 Kb, PDF)

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International