dc.creator | Barba González, Cristóbal | es |
dc.creator | Caballero, Ismael | es |
dc.creator | Varela Vaca, Ángel Jesús | es |
dc.creator | Cruz Lemus, José Antonio | es |
dc.creator | Gómez López, María Teresa | es |
dc.creator | Navas Delgado, Ismael | es |
dc.date.accessioned | 2024-01-02T10:21:50Z | |
dc.date.available | 2024-01-02T10:21:50Z | |
dc.date.issued | 2024 | |
dc.identifier.citation | Barba González, C., Caballero, I., Varela Vaca, Á.J., Cruz Lemus, J.A., Gómez López, M.T. and Navas Delgado, I. (2024). BIGOWL4DQ: Ontology-driven approach for Big Data quality meta-modelling, selection and reasoning. Information and Software Technology, 167 (Article number 107378), 1-16. https://doi.org/10.1016/j.infsof.2023.107378. | |
dc.identifier.issn | 0950-5849 | es |
dc.identifier.uri | https://hdl.handle.net/11441/152875 | |
dc.description | Article number 107378 | es |
dc.description.abstract | Context: Data quality should be at the core of many Artificial Intelligence initiatives from the very first
moment in which data is required for a successful analysis. Measurement and evaluation of the level of quality
are crucial to determining whether data can be used for the tasks at hand. Aware of this importance,
industry and academia have proposed several data quality measurements and assessment frameworks over the
last two decades. Unfortunately, there is no common and shared vocabulary for data quality terms. Thus, it
is difficult and time-consuming to integrate data quality analysis within a (Big) Data workflow for performing
Artificial Intelligence tasks. One of the main reasons is that, apart from a small number of proposals,
the existing vocabularies are neither machine-readable nor machine-processable, so they require manual effort
to be incorporated.
Objective: This paper proposes a unified data quality measurement and assessment information model. This
model can be used in different environments and contexts to describe data quality measurement and evaluation
concerns.
Method: The model has been developed as an ontology to make it interoperable and machine-readable. For
better interoperability and applicability, this ontology, BIGOWL4DQ, has been developed as an extension of a
previously developed ontology for describing knowledge management in Big Data analytics.
Results: Our proposal has been validated with two use cases. First, the semantic proposal has been assessed
using an academic use case. Second, a real-world case study within an Artificial Intelligence workflow
has been conducted to support our work.
Conclusions: This extended ontology provides a data quality measurement and assessment framework required
when designing Artificial Intelligence workflows and integrated reasoning capacities. Thus, BIGOWL4DQ can
be used to describe Big Data analysis and assess the data quality before the analysis. | es |
dc.description.sponsorship | Universidad de Málaga PID2020-112540RB-C41 | es |
dc.description.sponsorship | Universidad de Castilla-La Mancha PID2020-112540RB-C42 | es |
dc.description.sponsorship | Universidad de Sevilla PID2020-112540RB-C44 | es |
dc.format | application/pdf | es |
dc.format.extent | 16 p. | es |
dc.language.iso | eng | es |
dc.publisher | Elsevier B.V. | es |
dc.relation.ispartof | Information and Software Technology, 167 (Article number 107378), 1-16. | |
dc.rights | Attribution 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject | Data quality evaluation and measurement | es |
dc.subject | Data quality information model | es |
dc.subject | Big Data | es |
dc.subject | Ontology | es |
dc.subject | Decision model and notation | es |
dc.title | BIGOWL4DQ: Ontology-driven approach for Big Data quality meta-modelling, selection and reasoning | es |
dc.type | info:eu-repo/semantics/article | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/publishedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos | es |
dc.relation.projectID | SBPLY/21/180501/000061 | es |
dc.relation.projectID | US-1381375 | es |
dc.relation.projectID | PID2020-112540RB-C41 | es |
dc.relation.projectID | PID2020-112540RB-C42 | es |
dc.relation.projectID | PID2020-112540RB-C44 | es |
dc.relation.publisherversion | https://www.sciencedirect.com/search?qs=BIGOWL4DQ%3A%20Ontology-driven%20approach%20for%20Big%20Data%20quality%20meta-modelling%2C%20selection%20and%20reasoning&pub=Information%20and%20Software%20Technology&cid=271539 | es |
dc.identifier.doi | 10.1016/j.infsof.2023.107378 | es |
dc.journaltitle | Information and Software Technology | es |
dc.publication.volumen | 167 | es |
dc.publication.issue | Article number 107378 | es |
dc.publication.initialPage | 1 | es |
dc.publication.endPage | 16 | es |
dc.contributor.funder | Ministerio de Ciencia e Innovación (MICIN). España | es |
dc.contributor.funder | Junta de Andalucía | es |
dc.contributor.funder | Universidad de Málaga | es |
dc.contributor.funder | Consejería de Educación, Cultura y Deportes de la Junta de Comunidades de Castilla-La Mancha | es |