Tampere University of Technology

TUTCRIS Research Portal

Less Is More: Deep Learning Using Subjective Annotations For Sentiment Analysis From Social Media

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

Details

Original language: English
Title of host publication: 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP)
Publisher: IEEE
Number of pages: 6
ISBN (Electronic): 978-1-7281-0824-7
ISBN (Print): 978-1-7281-0825-4
DOIs
Publication status: Published - Oct 2019
Publication type: A4 Article in a conference publication
Event: IEEE International Workshop on Machine Learning for Signal Processing
Duration: 1 Jan 1900 → …

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing
ISSN (Print): 1551-2541

Conference

Conference: IEEE International Workshop on Machine Learning for Signal Processing
Period: 1/01/00 → …

Abstract

Acquiring reliable training annotations for the huge amounts of data collected in large-scale applications is often infeasible, especially for inherently subjective tasks such as sentiment analysis. In these cases, the data are usually annotated using semi-automated methods. Even when crowd-sourcing is used, ensuring the quality of the acquired annotations can be challenging. Therefore, a number of important questions arise when annotating such datasets: Does using more data always increase the accuracy of a model, regardless of the quality of the annotations? Is there a way to select which data samples to use when the annotations are unreliable? Is there a point at which using unreliable annotations actually harms the performance of deep models instead of helping? In this work, we provide an extensive study on training deep sentiment analysis models with unreliably annotated data, and we propose a simple yet effective semi-supervised learning method to overcome the aforementioned limitations.

Keywords

  • Training, Reliability, Data models, Sentiment analysis, Machine learning, Task analysis, Analytical models, Large-Scale Sentiment Analysis, Subjective Annotations, Deep Learning, Twitter Dataset
