Sentimenduen analisia euskaraz: lexiko-mailatik erlaziozko diskurtso-egiturarako proposamena

  1. Jon Alkorta 1
  2. Koldo Gojenola 1
  3. Mikel Iruskieta 1
  1. 1 Universidad del País Vasco/Euskal Herriko Unibertsitatea, Lejona, Spain (ROR: https://ror.org/000xsnr85)

Journal: Gogoa: Euskal Herriko Unibertsitateko hizkuntza, ezagutza, komunikazio eta ekintzari buruzko aldizkaria

ISSN: 1577-9424

Year of publication: 2016

Issue title: Xabier Arrazola Gogoan (1962-2015)

Issue number: 14

Pages: 131-152

Type: Article

DOI: 10.1387/GOGOA.15634 (open access)

Abstract

Opinion texts now play an important role: people read opinions before doing an activity, buying a product, or making a decision. However, the volume of opinion text is growing rapidly, and reading every opinion on a subject is infeasible. Sentiment analysis is the area of Natural Language Processing that aims to process opinion texts. This work falls within sentiment analysis and presents a first approach to assigning polarity to texts written in Basque. Rhetorical Structure Theory is used to assign different weights to text spans, and the analysis is performed at the lexical level. The method has been compared with other approaches, and the results are promising.
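
The general idea described in the abstract, lexicon-based polarity combined with discourse-dependent span weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon entries, the nucleus/satellite weights, and the function names below are all hypothetical, and the example uses English tokens only for readability.

```python
from typing import Dict, List, Tuple

# Hypothetical toy polarity lexicon (word -> score); not the lexicon used in the article.
LEXICON: Dict[str, float] = {"good": 1.0, "excellent": 2.0, "bad": -1.0, "boring": -1.5}

# Hypothetical weights per RST span role; the article's actual weights are not stated here.
ROLE_WEIGHTS: Dict[str, float] = {"nucleus": 1.0, "satellite": 0.5}


def span_score(tokens: List[str]) -> float:
    """Sum the lexicon scores of the tokens in one text span (unknown words score 0)."""
    return sum(LEXICON.get(tok.lower(), 0.0) for tok in tokens)


def text_polarity(spans: List[Tuple[str, List[str]]]) -> float:
    """Weighted sum of span scores; each span is a (role, tokens) pair."""
    return sum(ROLE_WEIGHTS[role] * span_score(tokens) for role, tokens in spans)


def label(score: float) -> str:
    """Map a numeric score to a coarse polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A two-span toy review: the satellite is negative, the nucleus is positive.
review = [
    ("satellite", ["The", "plot", "is", "boring"]),
    ("nucleus", ["but", "the", "acting", "is", "excellent"]),
]
score = text_polarity(review)
print(score, label(score))
```

With these assumed weights, the negative satellite counts half as much as the positive nucleus, so the toy review comes out positive overall; an unweighted lexical sum would still be positive here but by a smaller margin.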

References

  • ALKORTA, J., GOJENOLA, K., IRUSKIETA, M., and PEREZ, A. (2015), «Using relational discourse structure information in Basque sentiment analysis». In 5th Workshop «RST and Discourse Studies», in Actas del XXXI Congreso de la Sociedad Española del Procesamiento del Lenguaje Natural (SEPLN 2015), Alicante, Spain.
  • ASHER, N., BENAMARA, F., and MATHIEU, Y. Y. (2009), «Appraisal of opinion expressions in discourse». Lingvisticæ Investigationes, 32(2):279-292.
  • BACCIANELLA, S., ESULI, A., and SEBASTIANI, F. (2010), «SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining». In LREC, volume 10, pp. 2200-2204.
  • CARLSON, L., OKUROWSKI, M. E., and MARCU, D. (2002), RST discourse treebank. Linguistic Data Consortium, University of Pennsylvania.
  • CHESLEY, P., VINCENT, B., XU, L., and SRIHARI, R. K. (2006), «Using verbs and adjectives to automatically classify blog sentiment». Training, 580(263):233.
  • DA CUNHA, I., TORRES-MORENO, J.-M., and SIERRA, G. (2011), «On the development of the RST Spanish Treebank». In Proceedings of the 5th Linguistic Annotation Workshop, pp. 1-10. Association for Computational Linguistics.
  • EGAÑA, I. (2013), Kritikarako hurbilketa literaturaren soziologiatik. Egunkari eta aldizkarietako euskal literatur kritiken analisia (1975-2005). PhD thesis. Euskal Herriko Unibertsitatea, UPV/EHU, Vitoria-Gasteiz.
  • EZEIZA, N., ADURIZ, I., ALEGRIA, I., ARRIOLA, J. M., and URIZAR, R. (1998), «Combining Stochastic and Rule-Based Methods for Disambiguation in Agglutinative Languages». In COLING-ACL’98, volume 1, pp. 380-384, Canada.
  • GÓMEZ, I. (1997), «La partición informacional en el discurso». PhD thesis. Euskal Herriko Unibertsitatea, UPV/EHU.
  • GÓMEZ, I. (2002), «Foco y tema. Una aproximación discursiva». EHUko Argitalpen Zerbitzuak/Servicio Editorial de la UPV.
  • HEERSCHOP, B., GOOSSEN, F., HOGENBOOM, A., FRASINCAR, F., KAYMAK, U., and de JONG, F. (2011), «Polarity analysis of texts using discourse structure». In Proceedings of the 20th ACM international conference on Information and knowledge management, pp. 1061-1070. ACM.
  • HORVATH, B. M. and EGGINS, S. (1995), «Opinion texts in conversation». Advances In Discourse Processes, 50:29-46.
  • IRUSKIETA, M. (2014), Pragmatikako erlaziozko diskurtso-egitura: deskribapena eta bere ebaluazioa hizkuntzalaritza konputazionalean. PhD thesis. Euskal Herriko Unibertsitatea, UPV/EHU, Donostia.
  • IRUSKIETA, M., ARANZABE, M. J., DE ILARRAZA, A. D., GONZALEZ, I., LERSUNDI, M., and DE LA CALLE, O. L. (2013), «The RST Basque treebank: an online search interface to check rhetorical relations». In 4th Workshop «RST and Discourse Studies», Brasil, October, pp. 21-23.
  • LIU, B. (2012), «Sentiment analysis and opinion mining». Synthesis Lectures on Human Language Technologies, 5(1):1-167.
  • MANN, W. C. and TABOADA, M. (2005), RST web site. Available at http://www.sfu.ca/rst/.
  • MANN, W. C. and THOMPSON, S. A. (1987), Rhetorical structure theory: A theory of text organization. University of Southern California, Information Sciences Institute.
  • MANN, W. C. and THOMPSON, S. A. (1988), «Rhetorical structure theory: Toward a functional theory of text organization». Text-Interdisciplinary Journal for the Study of Discourse, 8(3):243-281.
  • PANG, B. and LEE, L. (2004), «A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts». In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, pp. 271-278. Association for Computational Linguistics.
  • PANG, B. and LEE, L. (2005), «Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales». In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 115-124. Association for Computational Linguistics.
  • PANG, B. and LEE, L. (2008), «Opinion mining and sentiment analysis». Foundations and Trends in Information Retrieval, 2(1-2):1-135.
  • PARDO, T. A. S. and SENO, E. R. M. (2005), «Rhetalho: um corpus de referência anotado retoricamente». Anais do V Encontro de Corpora, pp. 24-25.
  • POLANYI, L. and ZAENEN, A. (2006), «Contextual valence shifters». In Computing attitude and affect in text: Theory and applications, pp. 1-10. Springer.
  • SAN VICENTE, I., AGERRI, R., and RIGAU, G. (2014), «Simple, Robust and (almost) Unsupervised Generation of Polarity Lexicons for Multiple Languages». In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014), Gothenburg, Sweden.
  • SAURÍ, R. (2008), A factuality profiler for eventualities in text. PhD thesis. Brandeis University, Waltham, Massachusetts.
  • STEDE, M. (2004), «The Potsdam Commentary Corpus». In Proceedings of the 2004 ACL Workshop on Discourse Annotation, pp. 96-102. Association for Computational Linguistics.
  • TABOADA, M., BROOKE, J., TOFILOSKI, M., VOLL, K., and STEDE, M. (2011), «Lexicon-based methods for sentiment analysis». Computational Linguistics, 37(2):267-307.
  • TABOADA, M., VOLL, K., and BROOKE, J. (2008), «Extracting sentiment as a function of discourse structure and topicality». Simon Fraser University School of Computing Science Technical Report.
  • TUMASJAN, A., SPRENGER, T. O., SANDNER, P. G., and WELPE, I. M. (2010), «Predicting elections with Twitter: What 140 characters reveal about political sentiment». ICWSM, 10:178-185.
  • XU, K., LIAO, S. S., LI, J., and SONG, Y. (2011), «Mining comparative opinions from customer reviews for competitive intelligence». Decision Support Systems, 50(4):743-754.