Article
Improving the Performance of a Tagger Generator in an Information Extraction Application
Authors | Troyano Jiménez, José Antonio; Enríquez de Salamanca Ros, Fernando; Cruz Mata, Fermín; Cañete Valdeón, José Miguel; Ortega Rodríguez, Francisco Javier |
Department | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos |
Publication date | 2007 |
Deposit date | 2020-08-03 |
Abstract | In this paper we present an experience in the extraction of named entities from Spanish texts using stacking. Named Entity Extraction (NEE) is a subtask of Information Extraction that involves the identification of groups of words that make up the name of an entity, and the classification of these names into a set of predefined categories. Our approach is corpus-based: we use a re-trainable tagger generator to obtain a named entity extractor from a set of tagged examples. The main contribution of our work is that we obtain the systems needed in a stacking scheme without making use of any additional training material or tagger generators. Instead, we generate the variability needed in stacking by applying corpus transformations to the original training corpus. Once we have several versions of the training corpus, we generate several extractors and combine them by means of a machine learning algorithm. Experiments show that the combination of corpus transformation and stacking improves the performance of the tagger generator in this kind of natural language processing application. The best of our experiments achieves an improvement of more than six percentage points over the predefined baseline. |
Citation | Troyano Jiménez, J.A., Enríquez de Salamanca Ros, F., Cruz Mata, F., Cañete Valdeón, J.M. and Ortega Rodríguez, F.J. (2007). Improving the Performance of a Tagger Generator in an Information Extraction Application. Journal of Universal Computer Science, 13 (9), 1287-1299. |
Files | Size | Format | View | Description |
---|---|---|---|---|
Improving the Performance of a ... | 150.3Kb | [PDF] | View/ | |