
TASS 2015

Welcome to the 4th evaluation workshop for sentiment analysis focused on Spanish. TASS 2015 will be held as part of the 31st SEPLN Conference in Alicante, Spain, on September 15th, 2015. You are invited to attend the workshop, taking part in the proposed tasks and visiting this beautiful city!

Workshop Program

15:30 - 16:00	Opening and overview
16:00 - 17:00	Participant reports (I)
17:00 - 17:30	Coffee break
17:30 - 18:30	Participant reports (II)
18:30 - 19:00	Discussion and closing

TASS2015 Proceedings - CEUR Vol 1397


Welcome to TASS 2015!

TASS is an experimental evaluation workshop for sentiment analysis and online reputation analysis focused on the Spanish language, organized as a satellite event of the annual conference of the Spanish Society for Natural Language Processing (SEPLN). After three successful previous editions, TASS 2015 will take place on September 15th, 2015 at the University of Alicante, Spain.

The aim of TASS is to provide a forum for discussion and communication where the latest research work and developments in the field of sentiment analysis in social media, specifically focused on the Spanish language, can be shown and discussed by the scientific and business communities. The main objective is to promote the application of state-of-the-art algorithms and techniques for sentiment analysis to short opinion texts extracted from social media messages (specifically Twitter).

Several challenge tasks are proposed, intended to provide a benchmark forum for comparing the latest approaches in these fields. In addition, with the creation and release of the fully tagged corpus, we aim to provide a benchmark dataset that enables researchers to compare their algorithms and systems.

Tasks

First of all, we are interested in evaluating how the different approaches to sentiment analysis and text classification in Spanish have evolved over these years. The traditional task of sentiment analysis at global level will therefore be repeated, reusing the same corpus so that results can be compared. Moreover, we want to foster research on fine-grained polarity analysis at aspect level (aspect-based sentiment analysis), one of the new market requirements for natural language processing in these areas.

Thus the following two tasks are proposed this year.

Participants are expected to submit up to 3 results of different experiments for one or both of these tasks, in the appropriate format described below.

Along with the submission of experiments, participants will be invited to submit a paper to the workshop in order to describe their experiments and discuss the results with the audience in a regular workshop session. More information about format and requirements will be provided soon.

Information for submissions

Submissions must be made through the following page, using the provided user name and password:

http://www.sngularmeaning.team/TASS2015/private/evaluate.php

There you must select the task and fill in the name of your group, the run ID and the run file, and the system will automatically check and evaluate your submission according to the defined metrics and keep a history of everything.

If you want to resubmit your experiment, just use the same group name and run ID.

Please note that the list of submissions is public and open to all participants.

You may submit experiments at any time, but only runs submitted up to and including July 2nd count as valid official runs.

Call for Papers

All participants are invited to submit a paper describing the key features of their systems and discussing their results. The papers will be reviewed by a scientific committee, and only the accepted papers will be published at CEUR.

Depending on the final number of participants and the time slot allocated for the workshop, all papers or a selected group of them will be presented and discussed in the workshop session.

Manuscripts must satisfy the following rules:

Contributions may be up to 6 DIN A4 pages long, including references and figures.
Articles can be written in English or Spanish. The title, abstract and keywords must be written in both languages.
The document must be prepared in Word or LaTeX, but the submission must be in PDF format. The allowed template is at the SEPLN webpage.

Instead of describing the task and/or the corpus, focus on the description of your experiments and the analysis of your results, and include a citation to the Overview paper (more information will be provided soon).

Submissions can be sent by email to tass AT sngularmeaning.team. The deadline for submissions is July 20th. Notification of acceptance is expected by July 25th, and publication will follow by the end of July.

Task 1: Sentiment Analysis at global level

This task consists of performing automatic sentiment analysis to determine the global polarity of each message in the provided test sets (complete set and 1k set) of the General corpus (see below). This task is a rerun of the task from previous years. Participants will be provided with the training set of the General corpus so that they may train and validate their models.

There will be two different evaluations: one based on 6 different polarity labels (P+, P, NEU, N, N+, NONE) and another based on just 4 labels (P, N, NEU, NONE).

Participants are expected to submit up to 3 experiments for the 6-label evaluation, and may also submit up to 3 specific experiments for the 4-label scenario.
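As an illustration of how the two label sets relate, the sketch below collapses 6-label output into the 4-label set. It assumes that the strong labels P+ and N+ are merged into P and N respectively; the text above does not state this mapping explicitly, so treat it as an assumption. The names `TO_4_LABELS` and `collapse` are hypothetical.

```python
# Assumed mapping from the 6-label set to the 4-label set:
# strong polarities are merged into their plain counterparts.
TO_4_LABELS = {
    "P+": "P",
    "P": "P",
    "NEU": "NEU",
    "N": "N",
    "N+": "N",
    "NONE": "NONE",
}


def collapse(label):
    """Return the 4-label equivalent of a 6-label polarity tag."""
    return TO_4_LABELS[label]


if __name__ == "__main__":
    print([collapse(label) for label in ["P+", "N+", "NEU", "NONE"]])
```

A system trained specifically on 4 labels may behave differently from one whose 6-label output is collapsed afterwards, which is one reason separate 4-label runs are allowed.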

Accuracy (correct tweet polarity according to the gold standard) will be used for ranking the systems. The confusion matrix will be generated and then used to evaluate the precision, recall and F1-measure for each individual category (polarity). Macroaveraged precision, recall and F1-measure will also be calculated for the whole run.
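The metrics described above can be sketched as follows. This is not the official evaluation script: the function name `evaluate` and the dictionary-based input are hypothetical, and macroaveraged F1 is computed here as the mean of the per-label F1 values, which is an assumption (the official script may derive it differently).

```python
from collections import Counter


def evaluate(gold, predicted, labels):
    """gold and predicted map tweetid -> label.

    Returns (accuracy, per_label, macro), where per_label maps each label
    to (precision, recall, f1) and macro is the unweighted mean of those.
    """
    # confusion[(g, p)] counts tweets with gold label g predicted as p.
    # Tweets missing from the run get the prediction None (always wrong).
    confusion = Counter((gold[t], predicted.get(t)) for t in gold)
    accuracy = sum(confusion[(l, l)] for l in labels) / len(gold)

    per_label = {}
    for l in labels:
        tp = confusion[(l, l)]
        predicted_l = sum(c for (g, p), c in confusion.items() if p == l)
        gold_l = sum(c for (g, p), c in confusion.items() if g == l)
        precision = tp / predicted_l if predicted_l else 0.0
        recall = tp / gold_l if gold_l else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_label[l] = (precision, recall, f1)

    macro = tuple(
        sum(scores[i] for scores in per_label.values()) / len(labels)
        for i in range(3)
    )
    return accuracy, per_label, macro


if __name__ == "__main__":
    gold = {"1": "P", "2": "N", "3": "NEU", "4": "NONE"}
    run = {"1": "P", "2": "P", "3": "NEU", "4": "NONE"}
    print(evaluate(gold, run, ["P", "NEU", "N", "NONE"]))
```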

Results must be submitted in a plain text file with the following format:

tweetid \t polarity

where polarity can be:

P+, P, NEU, N, N+ and NONE for the 6-label case
P, NEU, N and NONE for the 4-label case.
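A minimal sketch of writing a run file in this format is shown below. It assumes that the two fields are separated by a single tab character with no surrounding spaces and that each tweet goes on its own line; the function name `write_run` and the `n_labels` parameter are hypothetical.

```python
import io

ALLOWED = {
    6: {"P+", "P", "NEU", "N", "N+", "NONE"},
    4: {"P", "NEU", "N", "NONE"},
}


def write_run(rows, out, n_labels=6):
    """Write (tweetid, polarity) pairs to out, one tab-separated pair per line.

    Raises ValueError if a polarity is not allowed for the chosen label set.
    """
    allowed = ALLOWED[n_labels]
    for tweetid, polarity in rows:
        if polarity not in allowed:
            raise ValueError(f"invalid polarity {polarity!r} for tweet {tweetid}")
        out.write(f"{tweetid}\t{polarity}\n")


if __name__ == "__main__":
    buffer = io.StringIO()
    write_run([("0000000000", "P+"), ("0000000001", "P")], buffer)
    print(buffer.getvalue(), end="")
```

Validating labels before writing avoids submitting a 6-label run to the 4-label evaluation by mistake.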

The same test corpus of previous years will be used for the evaluation, to allow for comparison among systems. Obviously, participants are not allowed to use any test data to train their systems.

Note that there are two test sets: the complete set and the 1k set, a subset of the first one. To deal with the imbalanced distribution of labels between the training and test sets, a test subset of 1000 tweets with a distribution similar to that of the training corpus was extracted for an alternate evaluation of system performance.

Task 2: Aspect-based sentiment analysis

Participants will be provided with a corpus tagged with a series of aspects, and systems must identify the polarity at aspect level. Two corpora will be provided: the Social-TV corpus, used last year, and the new STOMPOL corpus, collected this year (both described later). Both corpora have been split into a training set, for building and validating the systems, and a test set, for evaluation.

Participants are expected to submit up to 3 experiments for each corpus, each in a plain text file with the following format:

tweetid \t aspect \t polarity

[for the Social-TV corpus]

tweetid \t aspect-entity \t polarity

[for the STOMPOL corpus]

Allowed polarity values are P, NEU and N.

For evaluation, a single label combining "aspect-polarity" will be considered. As in the first task, accuracy will be used for ranking the systems; precision, recall and F1-measure will be used to evaluate each individual category ("aspect-polarity" label); and macroaveraged precision, recall and F1-measure will also be calculated for the global result.
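The combined label can be illustrated with the sketch below. How the official script aligns predictions with gold annotations is not described here, so the multiset matching per tweet is an assumption, as are the names `to_label` and `aspect_accuracy`.

```python
from collections import Counter


def to_label(aspect, polarity):
    """Join aspect and polarity into one 'aspect-polarity' label."""
    return f"{aspect}-{polarity}"


def aspect_accuracy(gold_rows, predicted_rows):
    """Rows are (tweetid, aspect, polarity) tuples.

    Assumed matching: a gold annotation counts as correct when the run
    contains the same combined label for the same tweet.
    """
    gold = Counter((t, to_label(a, p)) for t, a, p in gold_rows)
    pred = Counter((t, to_label(a, p)) for t, a, p in predicted_rows)
    matched = sum((gold & pred).values())  # multiset intersection
    return matched / sum(gold.values())


if __name__ == "__main__":
    gold = [("1", "Partido", "P"), ("1", "Equipo-Real_Madrid", "P")]
    run = [("1", "Partido", "P"), ("1", "Equipo-Real_Madrid", "N")]
    print(aspect_accuracy(gold, run))
```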

Corpus

General Corpus

The general corpus contains over 68 000 Twitter messages, written in Spanish by about 150 well-known personalities and celebrities of the world of politics, economy, communication, mass media and culture, between November 2011 and March 2012. Although the context of extraction has a Spain-focused bias, the diverse nationality of the authors, including people from Spain, Mexico, Colombia, Puerto Rico, USA and many other countries, makes the corpus reach a global coverage in the Spanish-speaking world.

The general corpus has been divided into two sets: training (about 10%) and test (90%). The training set will be released so that participants may train and validate their models. The test corpus will be provided without any tagging and will be used to evaluate the results provided by the different systems. Participants may not use the test data from previous years to train their systems.

Each message in both the training and test set is tagged with its global polarity, indicating whether the text expresses a positive, negative or neutral sentiment, or no sentiment at all. A set of 6 labels has been defined: strong positive (P+), positive (P), neutral (NEU), negative (N), strong negative (N+) and one additional no sentiment tag (NONE).

In addition, there is also an indication of the level of agreement or disagreement of the expressed sentiment within the content, with two possible values: AGREEMENT and DISAGREEMENT. This is especially useful to make out whether a neutral sentiment comes from neutral keywords or else the text contains positive and negative sentiments at the same time.

Moreover, the polarity at entity level, i.e., the polarity values related to the entities that are mentioned in the text, is also included for those cases when applicable. These values are similarly tagged with 6 possible values and include the level of agreement as related to each entity.

In addition, a set of topics has been selected based on the thematic areas covered by the corpus, such as "política" ("politics"), "fútbol" ("soccer"), "literatura" ("literature") or "entretenimiento" ("entertainment"). Each message in both the training and test set has been assigned to one or several of these topics (most messages are associated with just one topic, due to the short length of the text).

All tagging has been done semiautomatically: a baseline machine learning model is first run and then all tags are manually checked by human experts. In the case of the polarity at entity level, due to the high volume of data to check, this tagging has just been done for the training set.

The following figure shows the information for two sample tweets. The first tweet is tagged only with the global polarity, as the text contains no mention of any entity, but the second one is tagged with both the global polarity of the message and the polarity associated with each of the entities that appear in the text (UPyD and Foro Asturias).

        <tweet>
          <tweetid>0000000000</tweetid>
          <user>usuario0</user>
          <content><![CDATA['Conozco a alguien q es adicto al drama! Ja ja ja te suena d algo!]]></content>
          <date>2011-12-02T02:59:03</date>
          <lang>es</lang>
          <sentiments>
            <polarity><value>P+</value><type>AGREEMENT</type></polarity>
          </sentiments>
          <topics>
            <topic>entretenimiento</topic>
          </topics>
        </tweet>
        <tweet>
          <tweetid>0000000001</tweetid>
          <user>usuario1</user>
          <content><![CDATA['UPyD contará casi seguro con grupo gracias al Foro Asturias.]]></content>
          <date>2011-12-02T00:21:01</date>
          <lang>es</lang>
          <sentiments>
            <polarity><value>P</value><type>AGREEMENT</type></polarity>
            <polarity><entity>UPyD</entity><value>P</value><type>AGREEMENT</type></polarity>
            <polarity><entity>Foro_Asturias</entity><value>P</value><type>AGREEMENT</type></polarity>
          </sentiments>
          <topics>
            <topic>política</topic>
          </topics>
        </tweet>
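A minimal sketch of reading this structure with Python's standard library follows. The enclosing `<tweets>` root element and the function name `parse_general` are assumptions made so that the sample parses on its own; the real file may use a different root.

```python
import xml.etree.ElementTree as ET

# Abridged sample wrapped in a hypothetical <tweets> root element.
SAMPLE = """<tweets>
  <tweet>
    <tweetid>0000000001</tweetid>
    <content><![CDATA[UPyD contará casi seguro con grupo gracias al Foro Asturias.]]></content>
    <sentiments>
      <polarity><value>P</value><type>AGREEMENT</type></polarity>
      <polarity><entity>UPyD</entity><value>P</value><type>AGREEMENT</type></polarity>
    </sentiments>
    <topics><topic>política</topic></topics>
  </tweet>
</tweets>"""


def parse_general(xml_text):
    """Return one dict per tweet with global polarity, entity polarities and topics."""
    records = []
    for tweet in ET.fromstring(xml_text).iter("tweet"):
        global_polarity, entities = None, {}
        for polarity in tweet.findall("sentiments/polarity"):
            entity = polarity.findtext("entity")
            value = polarity.findtext("value")
            if entity is None:
                global_polarity = value  # no <entity> child: message-level tag
            else:
                entities[entity] = value
        records.append({
            "tweetid": tweet.findtext("tweetid"),
            "content": tweet.findtext("content"),
            "polarity": global_polarity,
            "entities": entities,
            "topics": [t.text for t in tweet.findall("topics/topic")],
        })
    return records


if __name__ == "__main__":
    print(parse_general(SAMPLE))
```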

Social-TV Corpus

This corpus was collected during the 2014 Copa del Rey final in Spain between Real Madrid and F.C. Barcelona, played on 16 April 2014 at Mestalla Stadium in Valencia. Over 1 million tweets were collected from 15 minutes before to 15 minutes after the match. After filtering out irrelevant information and tweets in languages other than Spanish, a subset of 2 773 tweets was selected.

All tweets were manually tagged with the aspects mentioned in the message and their sentiment polarity. Tweets may cover more than one aspect. The list of aspects is:

Afición
Arbitro
Autoridades
Entrenador
Equipos: Equipo-Atlético_de_Madrid, Equipo-Barcelona, Equipo-Real_Madrid, Equipo (any other team)
Jugadores: Jugador-Alexis_Sánchez, Jugador-Alvaro_Arbeloa, Jugador-Andrés_Iniesta, Jugador-Angel_Di_María, Jugador-Asier_Ilarramendi, Jugador-Carles_Puyol, Jugador-Cesc_Fábregas, Jugador-Cristiano_Ronaldo, Jugador-Dani_Alves, Jugador-Dani_Carvajal, Jugador-Fábio_Coentrão, Jugador-Gareth_Bale, Jugador-Iker_Casillas, Jugador-Isco, Jugador-Javier_Mascherano, Jugador-Jesé_Rodríguez, Jugador-José_Manuel_Pinto, Jugador-Karim_Benzema, Jugador-Lionel_Messi, Jugador-Luka_Modric, Jugador-Marc_Bartra, Jugador-Neymar_Jr., Jugador-Pedro_Rodríguez, Jugador-Pepe, Jugador-Sergio_Busquets, Jugador-Sergio_Ramos, Jugador-Xabi_Alonso, Jugador-Xavi_Hernández, Jugador (any other player)
Partido
Retransmisión

Sentiment polarity has been tagged from the point of view of the person who writes the tweet, using 3 levels: P, NEU and N. No distinction is made between cases where the author does not express any sentiment and cases where he/she expresses a sentiment that is neither positive nor negative.

The Social-TV corpus was randomly divided into two sets: training (1 773 tweets) and test (1 000 tweets), with a similar distribution of both aspects and sentiments. The training set will be released so that participants may train and validate their models. The test corpus will be provided without any tagging and will be used to evaluate the results provided by the different systems.

The following figure shows the information for three sample tweets in the training set.

<tweet id="456544898791907328"><sentiment aspect="Equipo-Real_Madrid" polarity="P">#HalaMadrid</sentiment> ganamos sin <sentiment aspect="Jugador-Cristiano_Ronaldo" polarity="NEU">Cristiano</sentiment>. .perdéis con <sentiment aspect="Jugador-Lionel_Messi" polarity="N">Messi</sentiment>. Hala <sentiment aspect="Equipo-Real_Madrid" polarity="P">Madrid</sentiment>! !!!!!</tweet>

<tweet id="456544898942906369">@nevermind2192 <sentiment aspect="Equipo-Barcelona" polarity="P">Barça</sentiment> por siempre!!</tweet>

<tweet id="456544898951282688"><sentiment aspect="Partido" polarity="NEU">#FinalCopa</sentiment> Hala <sentiment aspect="Equipo-Real_Madrid" polarity="P">Madrid</sentiment>, hala <sentiment aspect="Equipo-Real_Madrid" polarity="P">Madrid</sentiment>, campeón de la <sentiment aspect="Partido" polarity="P">copa del rey</sentiment></tweet>
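Because the annotations are embedded inline in the tweet text, a parser has to handle mixed content. The sketch below shows one way to do so with the standard library; it assumes each tweet is well-formed XML on its own, and the function name `parse_socialtv` is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<tweet id="456544898942906369">@nevermind2192 '
    '<sentiment aspect="Equipo-Barcelona" polarity="P">Barça</sentiment>'
    ' por siempre!!</tweet>'
)


def parse_socialtv(tweet_xml):
    """Return (tweetid, plain_text, annotations) for one <tweet> element.

    annotations is a list of (aspect, polarity, tagged_text) tuples.
    """
    tweet = ET.fromstring(tweet_xml)
    # itertext() walks text, child text and tails in document order,
    # which reconstructs the tweet without the markup.
    plain_text = "".join(tweet.itertext())
    annotations = [
        (s.get("aspect"), s.get("polarity"), s.text)
        for s in tweet.iter("sentiment")
    ]
    return tweet.get("id"), plain_text, annotations


if __name__ == "__main__":
    print(parse_socialtv(SAMPLE))
```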

STOMPOL Corpus

STOMPOL (corpus of Spanish Tweets for Opinion Mining at aspect level about POLitics) is a corpus of Spanish tweets prepared for research on the challenging task of opinion mining at aspect level. The tweets were gathered from 23rd to 24th of April, and are related to one of the following political aspects that appear in political campaigns:

Economia (Economics): taxes, infrastructure, markets, labor policy...
Sanidad (Health System): hospitals, public/private health system, drugs, doctors...
Educacion (Education): state school, private school, scholarships...
Propio_partido (Political party): anything good (speeches, electoral programme...) or bad (corruption, criticism) related to the entity
Otros_aspectos (Other aspects): electoral system, environmental policy...

Each aspect is related to one or several entities (separated by a pipe, |), each corresponding to one of the main political parties in Spain, which are:

Partido_Popular (PP)
Partido_Socialista_Obrero_Español (PSOE)
Izquierda_Unida (IU)
Podemos
Ciudadanos (Cs)
Unión_Progreso_y_Democracia (UPyD)

Each tweet in the corpus has been manually tagged by two different annotators, and a third one in case of disagreement, with the sentiment polarity at aspect level. Sentiment polarity has been tagged from the point of view of the person who writes the tweet, using 3 levels: P, NEU and N. Again, no difference is made between no sentiment and a neutral sentiment (neither positive nor negative).

Each political aspect is linked to its corresponding political party and its polarity.

Some examples are shown in the following figure:

<tweet id="591267548311769088">@ahorapodemos @Pablo_Iglesias_ @SextaNocheTV Que alguien pregunte si habrá cambios en las <sentiment aspect="Educacion" entity="Podemos" polarity="NEU">becas</sentiment> MEC para universitarios, por favor.</tweet>

<tweet id="591192167944736769">#Arroyomolinos lo que le interesa al ciudadano son Políticos cercanos que se interesen y preocupen por sus problemas <sentiment aspect="Propio_partido" entity="Union_Progreso_y_Democracia" polarity="P">@UPyD</sentiment> VECINOS COMO TU</tweet>
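To connect this annotation with the Task 2 run format for STOMPOL (tweetid, aspect-entity, polarity), the sketch below turns one tagged tweet into run lines. It assumes that aspect and entity are joined with a hyphen and that fields are separated by a single tab; the `entity` attribute is copied verbatim, including any pipe-separated values. The function name `stompol_run_lines` is hypothetical, and a real system would supply its own predicted polarity instead of reading the gold one.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<tweet id="591192167944736769">#Arroyomolinos lo que le interesa al '
    'ciudadano son Políticos cercanos que se interesen y preocupen por sus '
    'problemas <sentiment aspect="Propio_partido" '
    'entity="Union_Progreso_y_Democracia" polarity="P">@UPyD</sentiment> '
    'VECINOS COMO TU</tweet>'
)


def stompol_run_lines(tweet_xml):
    """Return one 'tweetid<TAB>aspect-entity<TAB>polarity' line per annotation."""
    tweet = ET.fromstring(tweet_xml)
    tweet_id = tweet.get("id")
    return [
        f'{tweet_id}\t{s.get("aspect")}-{s.get("entity")}\t{s.get("polarity")}'
        for s in tweet.iter("sentiment")
    ]


if __name__ == "__main__":
    for line in stompol_run_lines(SAMPLE):
        print(line)
```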

The corpus is composed of 1284 tweets and has been split into a training set (784 tweets), which is provided for building and validating the systems, and a test set (500 tweets) that will be used for evaluation.

Important Dates

~~April 6th, 2015~~	Release of tasks.
~~Beginning of May, 2015~~	Release of training and test corpora (General and Social-TV).
~~Mid May, 2015~~	Release of training STOMPOL corpus.
~~June 1st, 2015~~	Release of test STOMPOL corpus.
~~July 1st, 2015~~	Experiment submissions by participants.
~~July 20th, 2015~~	Submission of papers.
September 15th, 2015	Workshop.

Registration

Please send an email to tass AT sngularmeaning.team with the completed TASS Corpus License agreement, including your email and affiliation (institution, company or any kind of organization). You will be given a password to download the files from the password-protected area.

All corpora will be made freely available to the community after the workshop.

If you use the corpus in your research (papers, articles, presentations for conferences or educational purposes), please include a citation to one of the following publications:

Villena-Román, J., Martínez-Cámara, E., García-Morera, J. & Jiménez-Zafra, S. (2015). TASS 2014 - The Challenge of Aspect-based Sentiment Analysis. Procesamiento del Lenguaje Natural, 54. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/5095.
Villena-Román, J., García-Morera, J., Lana-Serrano, S., & González-Cristóbal, J.C. (2014). TASS 2013 - A Second Step in Reputation Analysis in Spanish. Procesamiento del Lenguaje Natural, 52. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/4901.
Villena-Román, J., Lana-Serrano, S., Martínez-Cámara, E., & González-Cristóbal, J.C. (2013). TASS - Workshop on Sentiment Analysis at SEPLN. Procesamiento del Lenguaje Natural, 50. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/4657.
TASS (Taller de Análisis de Sentimientos en la SEPLN) website. http://www.sngularmeaning.team/TASS.

Downloads

TASS 2015

general-tweets-train-tagged.xml : General corpus training set

[3.5MB]

general-tweets-test.xml : General corpus test set (for task 1)

[16.3MB]

general-tweets-test1k.xml : General corpus 1k test set (for task 1)

[274.1KB]

socialtv-tweets-train-tagged.xml : Social-TV corpus training set

[429KB]

socialtv-tweets-test.xml : Social-TV corpus test set (for task 2)

[213.6KB]

stompol-tweets-train-tagged.xml : STOMPOL corpus training set

[213.3KB]

stompol-tweets-test.xml : STOMPOL corpus test set (for task 2)

[126KB]

TASS 2015 (tagged)

general-tweets-test-tagged.xml : General corpus test set (for task 1), tagged

[23.7MB]

general-tweets-test1k-tagged.xml : General corpus 1k test set (for task 1), tagged

[397.9KB]

eval-task1.php.gz : Evaluation script for task 1

[1.2KB]

general-sentiment-5l.qrel : QREL file for General corpus test set, sentiment analysis, 5 levels

[1.3MB]

general-sentiment-3l.qrel : QREL file for General corpus test set, sentiment analysis, 3 levels

[1.3MB]

general-sentiment-5l-1k.qrel : QREL file for General corpus 1k test set, sentiment analysis, 5 levels

[21.4KB]

general-sentiment-3l-1k.qrel : QREL file for General corpus 1k test set, sentiment analysis, 3 levels

[21KB]

socialtv-tweets-test-tagged.xml : Social-TV corpus test set (for task 2), tagged

[239.5KB]

stompol-tweets-test-tagged.xml : STOMPOL corpus test set (for task 2), tagged

[134.5KB]

eval-task2.php.gz : Evaluation script for task 2

[1.3KB]

socialtv-sentiment.qrel : QREL file for Social-TV corpus test set, sentiment analysis

[73.1KB]

stompol-sentiment.qrel : QREL file for STOMPOL corpus test set, sentiment analysis

[35.2KB]

Past editions

general-users-tagged.xml : General corpus user information, manually tagged with political orientation (TASS 2013 task 3)

[102.2KB]

general-topics.qrel : QREL file for General corpus topic classification (TASS 2012-2014 task 2)

[1.8MB]

politics2013-tweets-test-tagged.xml : Politics 2013 corpus, manually tagged (TASS 2013 task 4)

[1.4MB]

politics2013.qrel : QREL file for Politics 2013 corpus (TASS 2013 task 4)

[67KB]

Organization

Organizing Committee

Julio Villena-Román - Singular Meaning, Spain
Janine García-Morera - Singular Meaning, Spain
Miguel Ángel García-Cumbreras - University of Jaen, Spain (SINAI-UJAEN)
Eugenio Martínez-Cámara - University of Jaen, Spain (SINAI-UJAEN)
L. Alfonso Ureña-López - University of Jaen, Spain (SINAI-UJAEN)
María-Teresa Martín-Valdivia - University of Jaen, Spain (SINAI-UJAEN)

Contributors

David Vilares Calvo - University of Coruña, Spain
Ferran Pla Santamaria - Universitat Politècnica de València, Spain
Lluís F. Hurtado - Universitat Politècnica de València, Spain
David Tomás - University of Alicante, Spain
Yoan Gutiérrez Vázquez - University of Alicante, Spain
Manuel Montes - National Institute For Astrophysics, Optics and Electronics (INAOE), Mexico
Luis Villaseñor - National Institute For Astrophysics, Optics and Electronics (INAOE), Mexico

Programme Committee

Alexandra Balahur - EC-Joint Research Centre, Italy
José Carlos González-Cristóbal - Technical University of Madrid, Spain (GSI-UPM)
José Carlos Cortizo - European University of Madrid, Spain
Ana García-Serrano - UNED, Spain
José María Gómez-Hidalgo - Optenet, Spain
Carlos A. Iglesias-Fernández - Technical University of Madrid, Spain
Zornitsa Kozareva - Information Sciences Institute, USA
Sara Lana-Serrano - Technical University of Madrid, Spain
Paloma Martínez-Fernandez - Carlos III University of Madrid, Spain
Ruslan Mitkov - University of Wolverhampton, U.K.
Andrés Montoyo - University of Alicante, Spain
Rafael Muñoz - University of Alicante, Spain
Constantin Orasan - University of Wolverhampton, U.K.
José Manuel Perea - University of Extremadura, Spain
Mike Thelwall - University of Wolverhampton, U.K.
José Antonio Troyano - University of Seville, Spain

Supporting Projects

ATTOS: Análisis de Tendencias y Temáticas a través de Opiniones y Sentimientos (TIN2012-38536-C03-0)

AORESCU: Análisis de Opinión en Redes Sociales y Contenidos Generados por Usuarios (P11-TIC-7684 MO)

Ciudad 2020: Hacia un nuevo modelo de ciudad inteligente sostenible (INNPRONTA IPT-20111006)

SAM (Socialising Around Media): Dynamic Social and Media Content Syndication for 2nd Screen (FP7-611312)