Presentation

dc.creator: Rivero, Carlos R. [es]
dc.creator: Ruiz Cortés, David [es]
dc.date.accessioned: 2021-02-17T11:08:50Z
dc.date.available: 2021-02-17T11:08:50Z
dc.date.issued: 2020
dc.identifier.citation: Rivero, C.R. and Ruiz Cortés, D. (2020). Selecting Suitable Configurations for Automated Link Discovery. In SAC 2020: 35th Annual ACM Symposium on Applied Computing (907-914), Brno, Czech Republic: ACM Digital Library.
dc.identifier.isbn: 978-1-4503-6866-7 [es]
dc.identifier.uri: https://hdl.handle.net/11441/105072
dc.description.abstract: Linking individuals in one dataset to the same individuals in existing datasets is a major problem known as link discovery. Existing automated link discovery techniques make users responsible for selecting suitable properties, distances and transformations, a.k.a. configurations, which is challenging for both researchers and practitioners. Furthermore, failing to provide suitable configurations dramatically increases the complexity of link discovery since many configurations need to be evaluated. Current approaches to help users select proper configurations assume datasets are not heterogeneous or require the existence of a schema or ontology, making them less appealing in the context of Linked Data. In this paper, we present an approach to help users select suitable configurations solely based on data, i.e., no schema or ontology is required. We rely on the concepts of universality and uniqueness, i.e., properties that are present in many individuals of the datasets to link (universality) and do not have repeated objects (uniqueness). We use the concept of singularity to focus on configurations in which only a few individuals are very similar while the rest are very dissimilar. We evaluate our approach using eight commonly used scenarios, in which, on average, we only suggest 5% of all the possible configurations. Additionally, selected configurations consistently generate links achieving high precision and recall with respect to a ground truth. Finally, we provide a number of guidelines to apply our approach in additional scenarios. [es]
dc.description.sponsorship: Ministerio de Economía y Competitividad TIN2016-75394-R [es]
dc.format: application/pdf [es]
dc.format.extent: 8 [es]
dc.language.iso: eng [es]
dc.publisher: ACM Digital Library [es]
dc.relation.ispartof: SAC 2020: 35th Annual ACM Symposium on Applied Computing (2020), pp. 907-914.
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Linked data [es]
dc.subject: Link discovery [es]
dc.subject: Data integration [es]
dc.title: Selecting Suitable Configurations for Automated Link Discovery [es]
dc.type: info:eu-repo/semantics/conferenceObject [es]
dcterms.identifier: https://ror.org/03yxnpp24
dc.type.version: info:eu-repo/semantics/submittedVersion [es]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es]
dc.contributor.affiliation: Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos [es]
dc.relation.projectID: TIN2016-75394-R [es]
dc.relation.publisherversion: https://dl.acm.org/doi/abs/10.1145/3341105.3373882 [es]
dc.identifier.doi: 10.1145/3341105.3373882 [es]
dc.publication.initialPage: 907 [es]
dc.publication.endPage: 914 [es]
dc.eventtitle: SAC 2020: 35th Annual ACM Symposium on Applied Computing [es]
dc.eventinstitution: Brno, Czech Republic [es]
dc.relation.publicationplace: New York, USA [es]
dc.contributor.funder: Ministerio de Economía y Competitividad (MINECO). España [es]

Files
Name: Selecting suitable configurations ...
Size: 2.364Mb
Format: PDF

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International