Tasks

Four tasks are proposed to participants, covering different aspects of sentiment analysis and automatic text classification.

Groups may participate in one or several tasks.

Task 1: Sentiment Analysis at global level

This task consists of performing automatic sentiment analysis to determine the global polarity (using 5 levels) of each message in the test set of the General corpus.

Participants will be provided with the training set of the General corpus so that they may train and validate their models.

Task 2: Topic classification

The technological challenge of this task is to build a classifier to automatically identify the topic of each message in the test set of the General corpus.

Participants may use the training set of the General corpus to train and validate their models.

Task 3: Sentiment Analysis at entity level

This task consists of performing automatic sentiment analysis, similar to Task 1, but determining the polarity at entity level (using 3 polarity levels) for each message in the Politics corpus.

In this case, participants may use the polarity at entity level included in the training set of the General corpus to train and validate their models (converting from 5 polarity levels to 3).
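
The conversion from 5 polarity levels to 3 can be sketched as a simple mapping. The tag names follow the task description; how NONE is treated here is an assumption, not something the task specifies:

```python
# Sketch of collapsing the 5 polarity levels of the General corpus
# (plus the NONE tag) into 3 levels for Task 3. The handling of NONE
# is an assumption.
FIVE_TO_THREE = {
    "P+": "POSITIVE",
    "P": "POSITIVE",
    "NEU": "NEUTRAL",
    "N": "NEGATIVE",
    "N+": "NEGATIVE",
    "NONE": "NONE",  # assumption: the no-opinion tag is kept as-is
}

def to_three_levels(tag: str) -> str:
    """Map a 5-level polarity tag to its 3-level counterpart."""
    return FIVE_TO_THREE[tag]
```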

Task 4: Political tendency identification

This task goes one step further: the objective is to estimate the political tendency of each user in the test set of the General corpus, with four possible values: LEFT, RIGHT, CENTRE and UNDEFINED.

Participants may use whatever strategy they choose, but a first approach could be to aggregate the results of the previous tasks by author and topic.
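
One hypothetical shape for that aggregation is a per-user majority vote over per-tweet tendency labels. This is an illustrative sketch, not the method of any participant; it assumes some upstream step has already produced a (user, label) pair per tweet:

```python
from collections import Counter

def user_tendency(tweet_predictions):
    """Aggregate hypothetical per-tweet tendency labels into one label
    per user by majority vote; a tie yields UNDEFINED."""
    by_user = {}
    for user, label in tweet_predictions:
        by_user.setdefault(user, Counter())[label] += 1
    result = {}
    for user, counts in by_user.items():
        (top, n), *rest = counts.most_common()
        if rest and rest[0][1] == n:
            result[user] = "UNDEFINED"  # no single dominant tendency
        else:
            result[user] = top
    return result
```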

The metrics used to evaluate and compare the different systems are the usual measures of precision (1), recall (2) and F-measure (specifically F1) (3), calculated over the full test set:

precision = TP / (TP + FP)                                  (1)
recall    = TP / (TP + FN)                                  (2)
F1        = 2 · precision · recall / (precision + recall)   (3)
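
As a minimal sketch, equations (1)-(3) can be computed per label as follows (function and variable names are illustrative, not taken from the official evaluation script):

```python
def precision_recall_f1(gold, predicted, label):
    """Per-label precision, recall and F1 over paired gold/system
    outputs, following equations (1)-(3)."""
    tp = sum(1 for g, p in zip(gold, predicted) if p == label and g == label)
    fp = sum(1 for g, p in zip(gold, predicted) if p == label and g != label)
    fn = sum(1 for g, p in zip(gold, predicted) if p != label and g == label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```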

Participation

Experiments

Participants are expected to submit one or several results of different experiments for one or several of these tasks, in the appropriate format.

Results for all tasks must be submitted in a plain text file with the following format:

id \t output \t confidence

where:

- id is the message identifier (or the user identifier in Task 4),
- output is the system's answer for the task (a polarity tag, a topic or a political tendency), and
- confidence is the system's confidence in that answer.

Regarding the polarity values, there are 6 valid tags (P+, P, NEU, N, N+ and NONE). Although the polarity must be classified into those tags and results will be evaluated over the 5 polarity levels, the evaluation will also include metrics that consider just 3 levels (POSITIVE, NEUTRAL and NEGATIVE).

Regarding the topic classification, a given tweet ID can be repeated in different lines if it is assigned more than one topic.
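
A minimal helper for producing lines in the required format might look like this (the function name and the example values are illustrative, not part of the task specification):

```python
def format_line(item_id, output, confidence):
    """One line of the required submission format:
    id <tab> output <tab> confidence."""
    return f"{item_id}\t{output}\t{confidence}"

# For Task 2, the same id may appear on several lines,
# one per assigned topic.
multi_topic = [format_line("12345", topic, 1.0)
               for topic in ("politics", "economy")]
```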

Results

31 groups registered (15 groups last year) and 14 groups (9 last year) sent their submissions.

Group          Task 1   Task 2   Task 3   Task 4
CITIUS-USC        2        -        1        -
DLSI-UA           3        -        -        -
Elhuyar           2        -        -        -
ETH-Zurich        3        1        1        1
FHC25-IMDEA       -        2        -        -
ITA               1        -        -        -
JRC              18        -        -        -
LYS               2        2        -        2
SINAI-EMML        2        -        -        -
SINAI-CESA        2        2        2        4
Tecnalia-UNED     1        -        -        -
UNED-JRM          2        2        -        -
UNED-LSI         15        9        -        -
UPV               3        2        2        4
Total groups     13        7        4        4
Total runs       56       20        6       11

Results achieved by the best experiments in each of the four tasks, sorted by precision value, are shown in the next figures.


[Figure: best results for Task 1]

[Figure: best results for Task 2]

[Figure: best results for Task 3]

[Figure: best results for Task 4]


Detailed results can be downloaded from the links below using the provided user and password for the private area.

The Excel sheet contains a summary of the results of all experiments for each task. Specific results for each task are contained in the 5 compressed files, which in turn store the overall results for the task, the results per experiment, the confusion matrix to allow error analysis, and the gold standard (qrel) for the task itself.

A PHP script and gold standards used for the evaluation of each submission are also included for your convenience (gold standards are also in the compressed file for each task).

Reports

Along with the submission of experiments, participants were invited to submit a paper to the workshop describing their experiments and discussing the results with the audience in a regular workshop session.

Papers should follow the usual SEPLN template given on the author guidelines page. Reports may be written in Spanish or English. There is no length limit, as reports will be included in the electronic working notes of the conference.

Submitted papers were reviewed by the program committee.

All reports are included in the Proceedings of the TASS workshop at SEPLN 2013. Actas del XXIX Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural. IV Congreso Español de Informática. 17-20 September 2013, Madrid, Spain. Díaz Esteban, Alberto; Alegría, Iñaki; Villena Román, Julio (eds). ISBN: 978-84-695-8349-4. Online at http://www.congresocedi.es/images/site/actas/ActasSEPLN.pdf.

Registration

Participants were required to register for the task(s). Registration had to be done by mail to .

The registration is now closed.


