dc.creator | de Haro Olmo, Francisco José | es |
dc.creator | Valencia Parra, Álvaro | es |
dc.creator | Varela Vaca, Ángel Jesús | es |
dc.creator | Álvarez Bermejo, José Antonio | es |
dc.creator | Gómez López, María Teresa | es |
dc.date.accessioned | 2023-10-02T10:36:58Z | |
dc.date.available | 2023-10-02T10:36:58Z | |
dc.date.issued | 2023 | |
dc.identifier.citation | de Haro Olmo, F.J., Valencia Parra, Á., Varela Vaca, Á.J., Álvarez Bermejo, J.A. and Gómez López, M.T. (2023). ELI: an IoT-aware big data pipeline with data curation and data quality. PeerJ Computer Science, 9:e1605, 1-24. https://doi.org/10.7717/peerj-cs.1605. | |
dc.identifier.issn | 2376-5992 | es |
dc.identifier.uri | https://hdl.handle.net/11441/149268 | |
dc.description.abstract | The complexity of analysing data from IoT sensors requires the use of Big Data
technologies, posing challenges such as data curation and data quality assessment. Failing
to address both aspects can lead to erroneous decision-making (i.e., processing
incorrectly treated data, introducing errors into processes, causing damage or increasing
costs). This article presents ELI, an IoT-based Big Data pipeline for developing a data
curation process and assessing the usability of data collected by IoT sensors in both
offline and online scenarios. We propose the use of a pipeline that integrates data
transformation and integration tools and a customisable decision model based on
the Decision Model and Notation (DMN) to evaluate the data quality. Our study
emphasises the importance of data curation and quality for integrating IoT information
by identifying and discarding low-quality data that obstruct meaningful insights and
introduce errors in decision making. We evaluated our approach in a smart farm
scenario using agricultural humidity and temperature data collected from various
types of sensors. Moreover, the proposed model exhibited consistent results in offline
and online (stream data) scenarios. In addition, a performance evaluation was
conducted, demonstrating the effectiveness of the approach. In summary, this article contributes to
the development of a usable and effective IoT-based Big Data pipeline that supports data
curation and data usability assessment in both online and offline scenarios.
Additionally, it introduces customisable decision models for measuring data quality
across multiple dimensions. | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación (MICIN) España AEI/10.13039/501100011033 | es |
dc.format | application/pdf | es |
dc.format.extent | 24 p. | es |
dc.language.iso | eng | es |
dc.publisher | PeerJ | es |
dc.relation.ispartof | PeerJ Computer Science, 9:e1605, 1-24. | |
dc.rights | Attribution 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject | Data curation | es |
dc.subject | Data quality | es |
dc.subject | Big data pipeline | es |
dc.subject | Internet of Things | es |
dc.subject | Sensors | es |
dc.title | ELI: an IoT-aware big data pipeline with data curation and data quality | es |
dc.type | info:eu-repo/semantics/article | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/publishedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos | es |
dc.relation.projectID | AEI/10.13039/501100011033 | es |
dc.relation.projectID | US-1381375 | es |
dc.relation.projectID | P20_01224 | es |
dc.relation.publisherversion | https://peerj.com/articles/cs-1605/ | es |
dc.identifier.doi | 10.7717/peerj-cs.1605 | es |
dc.journaltitle | PeerJ Computer Science | es |
dc.publication.volumen | 9:e1605 | es |
dc.publication.initialPage | 1 | es |
dc.publication.endPage | 24 | es |