
Article

dc.creator | Cotelo Moya, Juan Manuel | es
dc.creator | Cruz Mata, Fermín | es
dc.creator | Troyano Jiménez, José Antonio | es
dc.creator | Ortega Rodríguez, Francisco Javier | es
dc.date.accessioned | 2020-07-09T07:33:27Z
dc.date.available | 2020-07-09T07:33:27Z
dc.date.issued | 2015
dc.identifier.citation | Cotelo Moya, J.M., Cruz Mata, F., Troyano Jiménez, J.A. and Ortega Rodríguez, F.J. (2015). A modular approach for lexical normalization applied to Spanish tweets. Expert Systems with Applications, 42 (10), 4743-4754.
dc.identifier.issn | 0957-4174 | es
dc.identifier.uri | https://hdl.handle.net/11441/99108
dc.description.abstract | Twitter is a social media platform with widespread success where millions of people continuously express ideas and opinions about a myriad of topics. It is a huge and interesting source of data, but most of these texts are written hastily and heavily abbreviated, rendering them unsuitable for traditional Natural Language Processing (NLP). The two main contributions of this work are the characterization of the textual error phenomena in Twitter and the proposal of a modular normalization system that improves the textual quality of tweets. Instead of focusing on a single technique, we propose an extensible normalization system that relies on the combination of several independent "expert modules", each one addressing a very specific error phenomenon in its own way, thus increasing module accuracy and lowering module building costs. Broadly speaking, the system resembles an "expert board": modules independently propose correction candidates for each Out-of-Vocabulary (OOV) word, the candidates are ranked, and the best one is selected. In order to evaluate our proposal, we perform several experiments using texts from Twitter written in Spanish about a specific topic. The flexibility of defining resources at different language levels (core language, domain, genre), combined with the modular architecture, leads to lower costs and good performance: it requires minimal effort for building the resources and achieves more than 82% accuracy, compared to the 31% yielded by the baseline. | es
dc.description.sponsorship | Ministerio de Economía y Competitividad TIN2012-38536-C03-02 | es
dc.description.sponsorship | Junta de Andalucía P11-TIC-7684 MO | es
dc.format | application/pdf | es
dc.format.extent | 12 | es
dc.language.iso | eng | es
dc.publisher | Elsevier | es
dc.relation.ispartof | Expert Systems with Applications, 42 (10), 4743-4754.
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | *
dc.subject | Twitter | es
dc.subject | Text normalization | es
dc.subject | Domain adaptation | es
dc.title | A modular approach for lexical normalization applied to Spanish tweets | es
dc.type | info:eu-repo/semantics/article | es
dcterms.identifier | https://ror.org/03yxnpp24
dc.type.version | info:eu-repo/semantics/submittedVersion | es
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos | es
dc.relation.projectID | TIN2012-38536-C03-02 | es
dc.relation.projectID | P11-TIC-7684 MO | es
dc.relation.publisherversion | https://www.sciencedirect.com/science/article/pii/S0957417415000962 | es
dc.identifier.doi | 10.1016/j.eswa.2015.02.003 | es
dc.journaltitle | Expert Systems with Applications | es
dc.publication.volumen | 42 | es
dc.publication.issue | 10 | es
dc.publication.initialPage | 4743 | es
dc.publication.endPage | 4754 | es
dc.identifier.sisius | 20947060 | es
dc.contributor.funder | Ministerio de Economía y Competitividad (MINECO). España | es
dc.contributor.funder | Junta de Andalucía | es
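
Illustrative note: the "expert board" idea summarized in the abstract (independent modules propose correction candidates for each OOV word, the candidates are ranked, and the best one is selected) might be sketched roughly as below. This is not the authors' implementation. The lexicon, the two modules, their names, and the confidence scores are all hypothetical placeholders chosen only to show the control flow.

```python
import re
from typing import Callable, List, Tuple

# Toy in-vocabulary lexicon (assumption; the paper builds its own resources).
LEXICON = {"que", "porque", "hola", "bueno", "gol", "también"}

Candidate = Tuple[str, float]            # (proposed word, module confidence)
Module = Callable[[str], List[Candidate]]


def abbreviation_module(word: str) -> List[Candidate]:
    """Hypothetical expert: expands a few hand-listed abbreviations."""
    table = {"q": "que", "xq": "porque", "tb": "también"}
    return [(table[word], 0.9)] if word in table else []


def repetition_module(word: str) -> List[Candidate]:
    """Hypothetical expert: collapses repeated letters ('holaaa' -> 'hola')."""
    collapsed = re.sub(r"(.)\1+", r"\1", word)
    if collapsed != word and collapsed in LEXICON:
        return [(collapsed, 0.8)]
    return []


MODULES: List[Module] = [abbreviation_module, repetition_module]


def normalize_token(word: str, modules: List[Module]) -> str:
    """Leave in-vocabulary words alone; otherwise pick the top-ranked candidate."""
    if word in LEXICON:
        return word
    candidates = [c for module in modules for c in module(word)]
    if not candidates:
        return word  # no module had a proposal; keep the OOV word as-is
    best_word, _score = max(candidates, key=lambda c: c[1])
    return best_word


def normalize(text: str, modules: List[Module] = MODULES) -> str:
    """Normalize a whitespace-tokenized, lowercased tweet."""
    return " ".join(normalize_token(t, modules) for t in text.lower().split())


print(normalize("holaaa q bueno"))
```

Because each module is just a function from a word to scored candidates, a new error phenomenon could be covered by appending another function to the hypothetical `MODULES` list without touching the existing ones.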

Files | Size | Format | View | Description
A modular approach for lexical ... | 1.435Mb | PDF | View/Open |


Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International