dc.creator | Aradillas Jaramillo, José Carlos | es |
dc.creator | Murillo Fuentes, Juan José | es |
dc.creator | Olmos, Pablo M. | es |
dc.date.accessioned | 2021-09-07T15:02:09Z | |
dc.date.available | 2021-09-07T15:02:09Z | |
dc.date.issued | 2021 | |
dc.identifier.citation | Aradillas Jaramillo, J.C., Murillo Fuentes, J.J. and Olmos, P.M. (2021). Boosting offline handwritten text recognition in historical documents with few labeled lines. IEEE Access, 9, Article number 9438636, pp. 76674-76688. | |
dc.identifier.issn | 2169-3536 | es |
dc.identifier.uri | https://hdl.handle.net/11441/125558 | |
dc.description | Article number 9438636 | es |
dc.description.abstract | In this paper we address the problem of offline handwritten text recognition (HTR) in historical documents when few labeled samples are available and some of the training labels contain errors. Our three main contributions are as follows. First, we analyze how to perform transfer learning (TL) from a massive database to a smaller historical database, identifying which layers of the model need fine-tuning. Second, we analyze methods to efficiently combine TL and data augmentation (DA). Finally, we propose an algorithm to mitigate the effects of incorrect labeling in the training set. The methods are evaluated on the ICFHR 2018 competition database and on the Washington and Parzival databases. Combining all these techniques, we demonstrate a remarkable reduction of the character error rate (CER), up to 6 percentage points in some cases, on the test set with little complexity overhead. | es |
dc.format | application/pdf | es |
dc.format.extent | 15 p. | es |
dc.language.iso | eng | es |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | es |
dc.relation.ispartof | IEEE Access, 9, Article number 9438636, pp. 76674-76688. | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Connectionist temporal classification (CTC) | es |
dc.subject | Convolutional neural networks (CNN) | es |
dc.subject | Data augmentation (DA) | es |
dc.subject | Deep neural networks (DNN) | es |
dc.subject | Historical documents | es |
dc.subject | Long-short-term-memory (LSTM) | es |
dc.subject | Offline handwriting text recognition (HTR) | es |
dc.subject | Outlier detection | es |
dc.subject | Transfer learning | es |
dc.title | Boosting offline handwritten text recognition in historical documents with few labeled lines | es |
dc.type | info:eu-repo/semantics/article | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/publishedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Teoría de la Señal y Comunicaciones | es |
dc.relation.publisherversion | https://ieeexplore.ieee.org/document/9438636 | es |
dc.identifier.doi | 10.1109/ACCESS.2021.3082689 | es |
dc.journaltitle | IEEE Access | es |
dc.publication.volumen | 9 | es |
dc.publication.initialPage | 76674 | es |
dc.publication.endPage | 76688 | es |