Bachelor's Thesis (Trabajo Fin de Grado)
Introducción a las Redes Bayesianas
Author(s) | Romero Núñez, Marta |
Advisor | Conde Sánchez, Eduardo |
Department | Universidad de Sevilla. Departamento de Estadística e Investigación Operativa |
Publication date | 2020 |
Deposit date | 2021-07-05 |
Degree | Universidad de Sevilla. Grado en Matemáticas |
Abstract | One of the more relevant purposes of statistical modeling is to describe probabilistic relations among a set of random variables and to show them in a meaningful format. Bayesian Networks (BNs) combine a modular representation of the joint statistical distribution of the random vector under study with a powerful graphical tool that allows statistical dependencies to be identified by direct observation of the network structure. Despite the inherent difficulties of managing large sets of interrelated variables with a huge number of parameters, BNs have been gaining relevance in statistical applications in fields as diverse as medical diagnosis, insurance management tools, decision making and engineering. The computational drawbacks encountered by early BN applications are also being overcome thanks to the numerical capabilities of modern computers. These factors, together with the flexibility of BNs to be integrated into general decision-making support systems, lead us to think that these techniques will continue to spread in the field of statistical and operational research applications. In this study we present the introductory elements of BNs. First, we analyse the structure of the network and the statistical relations induced by its topology. We limit ourselves to the case of discrete random vectors. Although they represent only part of the possible cases, discrete random vectors cover a large number of practical situations in the fields mentioned above. In the first chapter we also discuss the non-uniqueness of the graphical representation of a given joint distribution. In general, there is an equivalence relation on the set of possible networks, and only one graphical representative of the corresponding equivalence class is needed to model the problem. In actual applications, the BN can be given by experts or learned from data.
In the latter case, we distinguish between the topological learning of the structure and the quantitative inference of the parameters that model the conditional probabilities. Different techniques can be used to learn the topological structure of the network, but none of them is fully satisfactory. However, once the structure of the network has been fixed, the statistical estimation of the parameters relies on more consolidated techniques, usually maximum likelihood estimation. The second chapter explains how a BN can be used to implement probabilistic reasoning with evidence. Basically, once one or more random variables have been instantiated, that is, once we have observed realizations of the corresponding variables, the network is used to compute the posterior probabilities given that evidence. After the updating process one can infer the most probable explanation of the evidence. Maximum a posteriori queries are concerned with finding the configuration of the variables in a given set that has the highest posterior probability. This is the basis of the probabilistic reasoning we can develop through a BN. In the third chapter, we show how to apply these concepts in an illustrative academic application in the context of the current COVID-19 pandemic in Spain. We have based our application on the report of the Red Nacional de Vigilancia Epidemiológica (RENAVE) from the middle of last April. We are aware of the difficulties in accessing a reliable and extensive database on the actual status of the disease; hence we used the data collected in that report to develop our experiment. Using the numerical tables of that report, we identified a set of variables concerning different symptoms, treatments, patient features and final outcomes of the disease. This quantitative information was also used to make a subjective estimation of the conditional probabilities, in the same way as an expert could.
Of course, the scope of our application is limited to exemplifying the concepts and procedures previously discussed in this thesis. The resulting BN was used to simulate a data set of realizations of the random vector. These data were taken as the input of the learning procedures in order to show the validity of these methods for rebuilding the target BN. We also carry out tasks of probabilistic explanation of the evidence in our BN. In the last part of this work, we have added some appendices containing definitions of graph elements, the R scripts written for the above numerical application and the technical report used in the study. |
Citation | Romero Núñez, M. (2020). Introducción a las Redes Bayesianas. (Unpublished bachelor's thesis). Universidad de Sevilla, Sevilla. |
Files | Size | Format | View | Description |
---|---|---|---|---|
GM Romero Núñez, Marta.pdf | 2.681Mb | [PDF] | View | |