In this work, we present the participation of the IRISA Linkmedia team at DeFT 2015. The team participated in two tasks: i) valence classification of tweets and ii) fine-grained classification of tweets (which includes two sub-tasks: detection of the generic class of the information expressed in a tweet and detection of the specific class of the opinion / sentiment / emotion). For all three problems, we adopt a standard machine learning framework. More precisely, three main methods are proposed and their feasibility for the tasks is analyzed: i) decision trees with boosting (bonzaiboost), ii) Naive Bayes with Okapi and iii) Convolutional Neural Networks (CNNs). Our approaches are deliberately knowledge-free and text-based only: we do not exploit external resources (lexicons, corpora) or tweet metadata. This allows us to assess the merits of each method and of traditional bag-of-words representations vs. word embeddings.
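To make the "knowledge-free, text-based only" bag-of-words setting concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written with the Python standard library only. It is not the team's system: it omits the Okapi weighting mentioned above, and the function names (`train_naive_bayes`, `classify`) and the toy tweets are hypothetical, chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes model."""
    class_docs = Counter(labels)          # number of documents per class
    word_counts = defaultdict(Counter)    # token counts per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()     # whitespace tokens, no external resources
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def classify(model, text):
    """Return the class with the highest log-posterior for `text`."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):      # sorted for deterministic tie-breaking
        score = math.log(class_docs[label] / total_docs)   # log prior
        class_total = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue                  # ignore tokens never seen in training
            # Add-one (Laplace) smoothed log-likelihood of the token.
            likelihood = (word_counts[label][token] + 1) / (class_total + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data, not taken from the DeFT corpus.
toy_docs = ["j'adore ce film", "super bon film", "je deteste ce film", "vraiment nul"]
toy_labels = ["+", "+", "-", "-"]
toy_model = train_naive_bayes(toy_docs, toy_labels)
```

Calling `classify(toy_model, "super film")` selects the positive class on this toy data, because both tokens are more frequent in the positive training tweets.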
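As a light warm-up exercise, we decided to take part in DeFT's Twitter sentiment analysis challenge. We participated in three tasks, namely valence classification of tweets (Task 1), detection of the generic class of the information expressed in a tweet (Task 2.1), and detection of the specific class of the opinion, sentiment, or emotion (Task 2.2).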
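We each used a different method: boosted decision trees (bonzaiboost), Naive Bayes with Okapi (listed as "Bayesian learning" in the results), and a CNN over word2vec embeddings.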
The results were as follows:
Method | Macro-precision | Micro-precision |
---|---|---|
Task 1 | | |
Bonzaiboost | 0.6723841209 | 0.5995856762 |
Bayesian learning | 0.6985095279 | 0.6898490678 |
CNN + word2vec | 0.6580527369 | 0.6531518201 |
Task 2.1 | | |
Bonzaiboost | 0.4779050332 | 0.52352767091 |
Bayesian learning | 0.5722246948 | 0.60165729506 |
CNN + word2vec | 0.5020312287 | 0.57147084937 |
Task 2.2 | | |
Bonzaiboost | 0.2577236629 | 0.5157972079 |
Bayesian learning | 0.3248703305 | 0.5716385011 |
CNN + word2vec | 0.3159632639 | 0.5525349008 |
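
For readers unfamiliar with the two columns, the sketch below shows one common way to compute macro- and micro-precision for single-label predictions. It assumes the usual definitions (macro: unweighted mean of per-class precision, with a class that is never predicted counted as 0; micro: correct predictions over all predictions); the official DeFT scoring script may differ in details, and the function name `precision_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def precision_scores(gold, predicted):
    """Return (macro_precision, micro_precision) for single-label predictions."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    true_pos = Counter()     # correct predictions per predicted class
    pred_count = Counter()   # total predictions per predicted class
    for g, p in zip(gold, predicted):
        pred_count[p] += 1
        if g == p:
            true_pos[p] += 1

    # Macro: every class weighs the same, however rare it is.
    # Assumption: a class that is never predicted contributes a precision of 0.
    classes = set(gold) | set(predicted)
    per_class = [
        true_pos[c] / pred_count[c] if pred_count[c] else 0.0
        for c in classes
    ]
    macro = sum(per_class) / len(per_class)

    # Micro: pool all predictions; frequent classes dominate the score.
    micro = sum(true_pos.values()) / sum(pred_count.values())
    return macro, micro


# Invented toy example: a system that always predicts the majority class.
gold = ["+", "+", "+", "+", "-", "="]
predicted = ["+", "+", "+", "+", "+", "+"]
macro, micro = precision_scores(gold, predicted)
```

On this toy example the micro score is 4/6 while the macro score is only 2/9, since two of the three classes are never predicted. Under these definitions, a system that neglects rare classes is penalized much more by the macro score, which is one possible reading of the gap between the two columns for Task 2.2.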