Final Degree Project
Análisis topológico de datos (Topological Data Analysis)
Author/s | Perera Lago, Javier |
Department | Universidad de Sevilla. Departamento de Álgebra |
Publication Date | 2020-06-01 |
Deposit Date | 2021-07-06 |
Academic Title | Universidad de Sevilla. Doble Grado en Matemáticas y Estadística |
Abstract | Topological data analysis is a branch of computational topology which uses algebra to obtain topological features from a data set. It has many applications, including computer vision, shape description, time series analysis, biomedicine, and drug design. The first step in learning topological information from data is to build a filtration of nested simplicial complexes that estimates the global structure of the data set at different scales of precision. The second step is to study the evolution of simplicial homology along this filtration, using a tool from algebraic topology called persistent homology. When the filtration is indexed by a single parameter, we can describe discrete and complete invariants for persistent homology with coefficients over a field, such as barcodes, persistence diagrams, or persistence landscapes. Unfortunately, there is no simple classification for persistent homology when the filtration is indexed by two or more parameters. The problem of classifying multiparameter persistent homology is being widely studied, and we present some proposals for it. A major problem with large data sets is the presence of noisy, wrong, or incomplete data. Although persistent homology is stable under small perturbations, we still need a statistical framework in order to separate insignificant features from topological signal and to build hypothesis tests applied to topological features. Another common problem with large data sets is high dimensionality. Even if we have captured the main topological characteristics of the data, we are usually unable to fully understand its structure. In order to reduce dimensionality and obtain a better visualization, we present the Mapper algorithm, which groups the data using a filter function and summarizes the set as a graph or a low-dimensional simplicial complex. |
Citation | Perera Lago, J. (2020). Análisis topológico de datos. (Trabajo Fin de Grado Inédito). Universidad de Sevilla, Sevilla. |
Files | Size | Format | View | Description |
---|---|---|---|---|
TFG DGMyE Perera Lago, Javier.pdf | 1.230Mb | [PDF] | View/ | |
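
The two-step pipeline summarized in the abstract (build a filtration, then track homology along it) can be illustrated in its simplest case: connected components (degree-0 homology) of a Vietoris-Rips filtration. The sketch below is not taken from the thesis. The function name `zero_dim_barcode`, the sample points, and the convention that an edge enters the filtration at a scale equal to its length are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def zero_dim_barcode(points):
    """Degree-0 persistence bars (birth, death) of a Vietoris-Rips filtration.

    Assumption: every point is born at scale 0 and an edge appears at a
    scale equal to its Euclidean length (some texts use half that length).
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges sorted by length give the order in which they enter the filtration.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one of them dies here.
            parent[root_i] = root_j
            bars.append((0.0, length))
    if n > 0:
        # One component survives at every scale.
        bars.append((0.0, float("inf")))
    return bars


# Hypothetical sample: two tight pairs of points far apart on a line.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
bars = zero_dim_barcode(points)
print(bars)
```

In this sample, the two short bars correspond to points merging inside each pair, the longer bar to the two pairs merging, and the infinite bar to the single component that remains. Higher-degree homology requires a boundary-matrix reduction and is not covered by this sketch.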