Repositorio de producción científica de la Universidad de Sevilla

Técnicas estadísticas en minería de textos


Access: Open Access
Author: Valero Moreno, Ana Isabel
Director: Pino Mejías, José Luis
Department: Universidad de Sevilla. Departamento de Estadística e Investigación Operativa
Date: 2017-06
Document type: Final Degree Work
Academic Title: Universidad de Sevilla. Grado en Matemáticas
Abstract: This work analyzes several statistical techniques for text mining, including the Semantic Vector Space Model, Latent Semantic Analysis, and Latent Dirichlet Allocation. It explains techniques for analyzing unstructured data, such as data mining, sentiment analysis, feature extraction, clustering, and automatic summarization, along with the stages a text mining project must follow and some of the areas where it is applied. It also lists software for studying text data. Finally, two practical cases apply some of the models introduced to real data. The first is a small application that uses Latent Semantic Analysis to determine which documents a query belongs to. The second is a real application of sentiment analysis that gauges the opinions users hold about a product from their reviews. Both analyses are carried out in the statistical software R.
Size: 724.8Kb
Format: PDF


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.

This item appears in the following Collection(s)