Article
Topology-based representative datasets to reduce neural network training resources
Author/s | González Díaz, Rocío; Gutiérrez Naranjo, Miguel Ángel; Paluzo Hidalgo, Eduardo |
Department | Universidad de Sevilla. Departamento de Matemática Aplicada I (ETSII); Universidad de Sevilla. Departamento de Ciencias de la Computación e Inteligencia Artificial |
Publication Date | 2022 |
Deposit Date | 2022-07-01 |
Published in | Neural Computing and Applications |
Abstract | One of the main drawbacks of the practical use of neural networks is the long time required in the training process. Such a training process consists of an iterative change of parameters trying to minimize a loss function. These changes are driven by a dataset, which can be seen as a set of labeled points in an n-dimensional space. In this paper, we explore the concept of a representative dataset, which is a dataset smaller than the original one, satisfying a nearness condition independent of isometric transformations. Representativeness is measured using persistence diagrams (a computational topology tool) due to their computational efficiency. We theoretically prove that the accuracy of a perceptron evaluated on the original dataset coincides with the accuracy of the neural network evaluated on the representative dataset when the neural network architecture is a perceptron, the loss function is the mean squared error, and certain conditions on the representativeness of the dataset are imposed. These theoretical results, accompanied by experimentation, open a door to reducing the size of the dataset to gain time in the training process of any neural network. |
Funding agencies | Agencia Estatal de Investigación. España; Agencia Andaluza del Conocimiento |
Project ID | PID2019-107339GB-I00; P20-01145 |
Citation | González Díaz, R., Gutiérrez Naranjo, M.Á. and Paluzo Hidalgo, E. (2022). Topology-based representative datasets to reduce neural network training resources. Neural Computing and Applications, May 2022. |
Files | Size | Format |
---|---|---|
Gonzalez-Diaz2022_Article_Topo ... | 2.291Mb | PDF |