Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets
- Jeremy Barnes (University of Stuttgart)
- Roman Klinger (University of Stuttgart)
- Sabine Schulte im Walde (University of Stuttgart)
- Alexandra Balahur (ed.)
- Saif M. Mohammad (ed.)
- Erik van der Goot (ed.)
Publisher: The Association for Computational Linguistics
ISBN: 978-1-945626-95-1
Year of publication: 2017
Pages: 2-12
Conference: Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (8th edition, 2017, Copenhagen)
Type: Conference contribution
Abstract
There has been a good amount of progress in sentiment analysis over the past 10 years, including the proposal of new methods and the creation of benchmark datasets. In some papers, however, there is a tendency to compare models only on one or two datasets, either because of time constraints or because the model is tailored to a specific task. Accordingly, it is hard to understand how well a certain model generalizes across different tasks and datasets. In this paper, we address this situation by comparing several models on six different benchmarks, which belong to different domains and additionally have different levels of granularity (binary, 3-class, 4-class and 5-class). We show that Bi-LSTMs perform well across datasets and that both LSTMs and Bi-LSTMs are particularly good at fine-grained sentiment tasks (i.e., with more than two classes). Incorporating sentiment information into word embeddings during training gives good results for datasets that are lexically similar to the training data. With our experiments, we contribute to a better understanding of the performance of different model architectures on different datasets. Consequently, we report novel state-of-the-art results on the SenTube datasets.
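As a rough illustration of the kind of Bi-LSTM sentence classifier the abstract refers to, the minimal sketch below builds a bidirectional LSTM over word embeddings and classifies the concatenated final hidden states into sentiment classes. All choices here (embedding and hidden dimensions, final-state pooling, 5-class output) are assumptions for illustration, not the paper's actual configuration; the embedding layer is where pre-trained, sentiment-aware vectors could be plugged in.

```python
# Minimal Bi-LSTM sentiment classifier sketch (PyTorch).
# Hyperparameters and pooling are illustrative assumptions,
# not the configuration evaluated in the paper.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=100, num_classes=5):
        super().__init__()
        # Embeddings could be initialized from pre-trained
        # (e.g. sentiment-retrofitted) word vectors.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Concatenated forward/backward final states feed the output layer.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)        # (batch, seq, embed_dim)
        _, (hidden, _) = self.bilstm(embedded)      # hidden: (2, batch, hidden_dim)
        sentence_repr = torch.cat([hidden[0], hidden[1]], dim=-1)
        return self.classifier(sentence_repr)       # logits over sentiment classes

# Toy usage: score two 6-token sentences for 5-class sentiment.
model = BiLSTMClassifier(vocab_size=10000)
logits = model(torch.randint(1, 10000, (2, 6)))
print(logits.shape)  # torch.Size([2, 5])
```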