IRISA at DeFT 2015: Supervised and Unsupervised Representations Methods in Sentiment Analysis

Vedran Vukotić, Vincent Claveau, Christian Raymond

DeFT 2015

Abstract

In this work, we present the participation of the IRISA Linkmedia team at DeFT 2015. The team participated in two tasks: i) valence classification of tweets and ii) fine-grained classification of tweets, which includes two sub-tasks: detection of the generic class of the information expressed in a tweet and detection of the specific class of the opinion / sentiment / emotion. For all three problems, we adopt a standard machine learning framework. More precisely, three main methods are proposed and their feasibility for the tasks is analyzed: i) decision trees with boosting (bonzaiboost), ii) Naive Bayes with Okapi and iii) Convolutional Neural Networks (CNNs). Our approaches are deliberately knowledge-free and text-based only: we do not exploit external resources (lexicons, corpora) or tweet metadata. This allows us to assess the merits of each method and to compare traditional bag-of-words representations with word embeddings.

Overview

As a light warm-up exercise, we decided to take part in DeFT's Twitter sentiment analysis challenge. We participated in three tasks, namely:

  • task 1: valence classification, which consisted of labeling tweets with one of three classes: positive, negative and neutral/mixed
  • task 2.1: generic class of information, which consisted of labeling tweets with one of four classes: information, opinion, sentiment, emotion
  • task 2.2: detection of specific opinion/sentiment/emotion, which consisted of labeling tweets with one of 18 classes (in French: surprise_negative, accord, tristesse, valorisation, insatisfaction, peur, apaisement, colere, satisfaction, desaccord, deplaisir, derangement, amour, mepris, plaisir, surprise_positive, ennui, devalorisation)

Each of us used a different method:

  • Christian used boosted decision trees with his well-polished tool bonzaiboost
  • Vincent used a naive Bayes approach with Okapi
  • I used a 1D convolutional neural network on top of word2vec embeddings
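
To give a rough idea of what a 1D convolution over word embeddings computes, here is a minimal sketch in plain Python. It is not the network used for the submission: the function name `conv1d_max_pool`, the ReLU activation, the max-over-time pooling and all numeric values are assumptions chosen for illustration. A real model would learn many such filters and feed the pooled features to a classifier.

```python
def conv1d_max_pool(embeddings, filt, bias=0.0):
    """Slide one filter over a sequence of word vectors and max-pool the result.

    embeddings: list of word vectors (one per token in the tweet)
    filt: list of weight vectors, one per position in the filter window
    ReLU and max-over-time pooling are illustrative assumptions.
    """
    width = len(filt)  # number of consecutive words covered by the filter
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias
        for word_vec, filt_row in zip(window, filt):
            total += sum(x * w for x, w in zip(word_vec, filt_row))
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # keep the strongest response over the tweet


# Toy 2-dimensional "embeddings" for a 4-word tweet (made-up values).
tweet = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter spanning two consecutive words (made-up weights).
bigram_filter = [[1.0, -1.0], [0.5, 0.5]]

feature = conv1d_max_pool(tweet, bigram_filter)
print(feature)
```

Each filter yields a single number per tweet regardless of the tweet's length, which is what lets a fixed-size classifier sit on top of variable-length input.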

The results were as follows:

Method               Macro-precision   Micro-precision

Task 1
Bonzaiboost          0.6723841209      0.5995856762
Bayesian learning    0.6985095279      0.6898490678
CNN + word2vec       0.6580527369      0.6531518201

Task 2.1
Bonzaiboost          0.4779050332      0.52352767091
Bayesian learning    0.5722246948      0.60165729506
CNN + word2vec       0.5020312287      0.57147084937

Task 2.2
Bonzaiboost          0.2577236629      0.5157972079
Bayesian learning    0.3248703305      0.5716385011
CNN + word2vec       0.3159632639      0.5525349008
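
For readers unfamiliar with the two measures, the sketch below shows how macro- and micro-precision differ for single-label predictions. It is an illustration and not the official DeFT scoring script: the function name `precision_scores`, the toy labels, and the choice to count a never-predicted class as precision 0 are all assumptions.

```python
from collections import Counter


def precision_scores(gold, predicted):
    """Return (macro_precision, micro_precision) for single-label predictions."""
    true_pos = Counter()    # correct predictions per class
    pred_count = Counter()  # total predictions per class
    for g, p in zip(gold, predicted):
        pred_count[p] += 1
        if g == p:
            true_pos[p] += 1

    classes = set(gold) | set(predicted)
    # Assumption: a class that is never predicted contributes a precision of 0.
    per_class = [
        true_pos[c] / pred_count[c] if pred_count[c] else 0.0
        for c in classes
    ]
    macro = sum(per_class) / len(per_class)  # every class weighs the same
    micro = sum(true_pos.values()) / sum(pred_count.values())  # every tweet weighs the same
    return macro, micro


# Made-up toy labels, only to show the computation.
gold = ["positive", "positive", "positive", "negative", "neutral", "neutral"]
predicted = ["positive", "positive", "negative", "negative", "positive", "neutral"]

macro, micro = precision_scores(gold, predicted)
print(f"macro={macro:.4f} micro={micro:.4f}")
```

Macro-precision averages the per-class scores, so rare classes count as much as frequent ones, whereas micro-precision pools all predictions. This is one reason the two columns can diverge sharply on task 2.2, which has 18 classes.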
