PhD Thesis
Improving data preparation for the application of process mining
Author/s | Ramos Gutiérrez, Belén |
Director | Gómez López, María Teresa
Reina Quintero, Antonia María |
Department | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos |
Publication Date | 2023-02-07 |
Deposit Date | 2023-03-30 |
Abstract | Immersed in what is already known as the fourth industrial revolution, automation and data exchange are taking on a particularly relevant role in complex environments, such as industrial manufacturing environments or logistics. This digitisation and transition to the Industry 4.0 paradigm is causing experts to start analysing business processes from other perspectives. Consequently, where management and business intelligence used to dominate, process mining appears as a link, trying to build a bridge between both disciplines to unite and improve them. This new perspective on process analysis helps to improve strategic decision making and competitive capabilities. Process mining brings together data and process perspectives in a single discipline that covers the entire spectrum of process management. Through process mining, and based on observations of their actual operations, organisations can understand the state of their operations, detect deviations, and improve their performance based on what they observe. In this way, process mining is an ally, occupying a large part of current academic and industrial research. However, although this discipline is receiving more and more attention, it presents severe application problems when it is implemented in real environments. The variety of input data in terms of form, content, semantics, and levels of abstraction makes the execution of process mining tasks in industry an iterative, tedious, and manual process, requiring multidisciplinary experts with extensive knowledge of the domain, process management, and data processing. Currently, although there are numerous academic proposals, there are no industrial solutions capable of automating these tasks. 
For this reason, in this thesis by compendium we address the problem of improving business processes in complex environments through a study of the state of the art and a set of proposals that improve relevant aspects of the process life cycle: the creation of logs, log preparation, process quality assessment, and the improvement of business processes. Firstly, a systematic study of the literature was carried out in order to gain in-depth knowledge of the state of the art in this field, as well as of the different challenges faced by this discipline. This analysis has allowed us to detect a number of challenges that have not been addressed or have received insufficient attention, of which three have been selected and presented as the objectives of this thesis. The first challenge is related to the assessment of the quality of the input data, known as event logs, since the decision to apply techniques for improving an event log must be based on the quality level of the initial data. This thesis therefore presents a methodology and a set of metrics that support the expert in selecting which technique to apply to the data according to the quality estimate at each moment, which is another challenge identified in our analysis of the literature. Likewise, the use of a set of metrics to evaluate the quality of the resulting process models is also proposed, with the aim of assessing whether an improvement in the quality of the input data has a direct impact on the final results. The second challenge identified is the need to improve the input data used in the analysis of business processes. As in any data-driven discipline, the quality of the results strongly depends on the quality of the input data, so the second challenge to be addressed is the improvement of the preparation of event logs. 
The contribution in this area is the application of natural language processing techniques to relabel activities from textual descriptions of process activities, as well as the application of clustering techniques to help simplify the results, generating models that are more understandable from a human point of view. Finally, the third challenge detected is related to process optimisation, so we contribute an approach for optimising the resources associated with business processes, which, through the inclusion of decision-making in the creation of flexible processes, enables significant cost reductions. Furthermore, all the proposals made in this thesis have been designed and validated in collaboration with experts from different fields of industry, and have been evaluated through real case studies in public and private projects in collaboration with the aeronautical industry and the logistics sector. |
Citation | Ramos Gutiérrez, B. (2023). Improving data preparation for the application of process mining. (Tesis Doctoral Inédita). Universidad de Sevilla, Sevilla. |
Files | Size | Format | View | Description |
---|---|---|---|---|
Ramos Gutiérrez. Belén Tesis.pdf | 22.13Mb | [PDF] | View/ | |