About
This task focuses on the emotional categorization of news articles. A corpus is currently being built from the RSS feeds of online newspapers written in different varieties of Spanish, namely those of Argentina, Chile, Colombia, Cuba, Spain, USA, Mexico, Peru and Venezuela. The plan is then to classify the information provided by each feed (at least a headline and, in many cases, a brief summary of the article) as SAFE or UNSAFE, from the point of view of the general public of the corresponding country.
This task can be considered a kind of stance classification: it concerns the position that the editor or the readers of an article take on a piece of news. The task is challenging because a system has to deal with the polarity of feeling (safe vs. unsafe) in combination with a (pseudo) thematic classification in order to determine the meaning of the news. For example, a headline about a reduction in traffic accidents carries a negative feeling because of the accidents, but the context of the numbers going down makes it, in the end, good news.
Tasks
Two subtasks are proposed. The first evaluates the performance of the systems without taking the varieties of Spanish into account. The second is a local multilingual challenge: the training set is composed of texts written in the Spanish spoken in Spain, while the test sets are composed of texts written in the Spanish spoken in different countries of America.
Subtask-1: Monolingual classification
The aim of the task is to classify the headline of a news item as SAFE or UNSAFE for incorporating an ad. If a news item arouses a positive or neutral emotion, it is safe for incorporating ads; if it arouses a negative emotion, it is unsafe for adding ads. The submitted systems will have to face the following challenges:
- Lack of context. The participants will work with news headlines, which are usually very short and lack contextual information.
- Generalization. The topics of the headlines are very diverse, which makes the classification task more difficult.
- Lexical diversity. The training and test data include utterances written in the Spanish spoken in Spain and in different countries of America.
The participants will be provided with the training and development sets of the SANSE corpus (see the Datasets section), and with the test set for the evaluation. In this task, the three sets are composed of news headlines written in different varieties of Spanish, but the country of origin of the text is not relevant. The three sets are annotated with two levels of safety, SAFE and UNSAFE, so the task is a binary classification task.
The evaluation of Subtask-1 is organized in two levels:
- L1: The test data will be composed of the headlines of the test subset of the SANSE corpus.
- L2: The test set will be larger than in L1, about 13,000 headlines, and the headlines are written in all the varieties of Spanish.
Subtask-2: Multilingual classification
The aim is similar to that of Subtask-1, but here the goal is to evaluate the generalization capacity of the submitted systems. The participants will be provided with training and development sets of SANSE containing only news headlines written in the Spanish spoken in Spain. Several test sets will be provided, composed of news headlines written in the Spanish spoken in different countries of America.
Evaluation
The submitted systems will be evaluated using Macro-Precision, Macro-Recall, Macro-F1 and Accuracy.
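For participants who want to check their development-set scores locally, the four measures can be sketched as follows. This is a minimal sketch and not the official scorer: the function name `macro_scores` is hypothetical, and it assumes that Macro-F1 is the unweighted mean of the per-class F1 values (the official evaluation may instead combine Macro-Precision and Macro-Recall).

```python
from typing import Dict, Sequence

# The two classes used in the task.
LABELS = ("SAFE", "UNSAFE")


def macro_scores(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """Return macro-averaged precision, recall, F1 and accuracy.

    Assumption: Macro-F1 is the unweighted mean of the per-class F1 values.
    """
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and of equal length")

    precisions, recalls, f1s = [], [], []
    for label in LABELS:
        # Per-class counts, treating `label` as the positive class.
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return {
        "macro_precision": sum(precisions) / len(LABELS),
        "macro_recall": sum(recalls) / len(LABELS),
        "macro_f1": sum(f1s) / len(LABELS),
        "accuracy": accuracy,
    }
```

Macro-averaging gives both classes the same weight regardless of how many headlines each one has, so a system that always answers with the majority class is penalized even if its accuracy looks acceptable.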
Datasets
The participants will use the Spanish brANd Safe Emotion corpus (SANSE).
SANSE corpus
The SANSE corpus is composed of 2,000 news headlines written in the Spanish spoken in Spain and in several American countries, specifically Mexico, Cuba, Chile, Colombia, Argentina, Venezuela, Peru and the U.S.A. SANSE is therefore a representative corpus of news headlines written in Spanish across the Spanish-speaking world.
The annotation was carried out by two human annotators, namely the two organizers of the task. A safe headline was defined as an utterance that arouses a positive or neutral emotion in the reader, or that is not related to a controversial topic: religion, extreme-wing politics, or topics that may arouse strong positive emotions in some readers but strong negative emotions in others. An unsafe headline was defined as an utterance that arouses negative emotions in the reader. Some examples:
Así será el nuevo pan integral en España, según una nueva ley en marcha.
This is what the new wholemeal bread in Spain will be like, according to a new law in progress.
SAFE
Casi 300 municipios de Colombia en riesgo electoral.
Almost 300 Colombian municipalities are at electoral risk.
UNSAFE
The inter-annotator agreement was 0.58 according to both Scott's π [1] and Cohen's κ [2], which can be considered moderate on the scale of Landis and Koch [3]. Although the agreement is moderate, it is close to the threshold for substantial agreement (0.61), and it should also be taken into account that this is a new classification task dealing with highly subjective content. In all cases where the two annotators disagreed, a third annotator broke the tie. We will work on making the annotation guidelines more precise in order to improve the agreement between annotators. We also hope that the participants will give us insights that help improve the annotation of the data.
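As an illustration of the two agreement coefficients cited above, the following sketch computes Cohen's κ and Scott's π for two annotators over the same items. The function names and the toy labels in the usage note are hypothetical and are not taken from SANSE. Both coefficients share the form (Po - Pe) / (1 - Pe); they differ only in how the chance agreement Pe is estimated.

```python
from collections import Counter
from typing import Sequence


def _chance_corrected(observed: float, expected: float) -> float:
    """Apply (Po - Pe) / (1 - Pe).

    Assumption: when Pe equals 1 (both annotators used one single label for
    every item) the coefficient is undefined, and 1.0 is returned by convention.
    """
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


def _observed_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("annotations must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa: chance agreement from each annotator's own label distribution."""
    po = _observed_agreement(a, b)
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    pe = sum((count_a[k] / n) * (count_b[k] / n) for k in labels)
    return _chance_corrected(po, pe)


def scott_pi(a: Sequence[str], b: Sequence[str]) -> float:
    """Scott's pi: chance agreement from the pooled label distribution."""
    po = _observed_agreement(a, b)
    n = len(a)
    pooled = Counter(a) + Counter(b)
    pe = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return _chance_corrected(po, pe)
```

For example, with the toy annotations `["SAFE", "SAFE", "UNSAFE", "UNSAFE"]` and `["SAFE", "UNSAFE", "UNSAFE", "UNSAFE"]`, the observed agreement is 0.75 in both cases, but the chance agreement differs (0.5 for κ, 0.53125 for π). The two coefficients coincide only when both annotators use the labels in the same proportions.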
The SANSE corpus is divided into three subsets for Subtask-1: training, development and test. The statistics of the three subsets are in the following table.
The statistics of the SANSE corpus for Subtask-2 are the following:
Shared Task
Evaluation
The evaluation web page is available at: http://www.sepln.org/workshops/tass/2018/task-4/private/evaluation/evaluate.php
Results must be submitted in a plain text file with the following format:
ID_Headline\tLABEL
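A results file in this format could be produced along the following lines. This is a sketch under assumptions: the helper names `format_submission` and `write_submission` are hypothetical, and the choices of one prediction per line, no header row, UTF-8 encoding and a trailing newline are not confirmed by the task description.

```python
from typing import Iterable, Tuple

VALID_LABELS = {"SAFE", "UNSAFE"}


def format_submission(predictions: Iterable[Tuple[str, str]]) -> str:
    """Render (headline ID, label) pairs as tab-separated lines.

    Assumptions: one prediction per line, no header row, and a newline
    after every line, including the last one.
    """
    lines = []
    for headline_id, label in predictions:
        if label not in VALID_LABELS:
            raise ValueError(f"unknown label {label!r} for headline {headline_id!r}")
        lines.append(f"{headline_id}\t{label}\n")
    return "".join(lines)


def write_submission(path: str, predictions: Iterable[Tuple[str, str]]) -> None:
    """Write the predictions to a plain text file (UTF-8 is an assumption)."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(format_submission(predictions))
```

Validating the labels before writing catches typos such as a lower-case `safe` before the file reaches the evaluation page.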
Official results
The official results web page will be released after the evaluation time.
Award
The best system of Subtask-1 will receive the best system award, a cash prize of 100€ sponsored by MeaningCloud.
Datasets downloads
Use of the SANSE corpus requires agreeing to the terms of use of the data by signing the TASS Data License.
Subtask-1
- Training & Development: You must sign the license of terms of use in order to download it. The license is at: http://www.sepln.org/workshops/tass/tass_data/download.php
- Test: It will be released in the download section of TASS. You must use the same URL that you used for downloading the Training and Development datasets.
Subtask-2
- Training & Development: You must sign the license of terms of use in order to download it. The license is at: http://www.sepln.org/workshops/tass/tass_data/download.php
- Test: It will be released in the download section of TASS. You must use the same URL that you used for downloading the Training and Development datasets.
Proceedings
The same as for Task-1 and Task-2. See the main webpage of TASS-2018.
Please note that the paper must be 6 pages long plus references. Focus on the description of your system, and do not spend space on the details of the task or the corpora. Those details will be published in an "Overview" paper, which we recommend you cite in your paper. The provisional BibTeX entry for the Overview paper is:
@inproceedings{overview_tass2018,
  author = "Mart\'{i}nez-C\'{a}mara, Eugenio and Almeida-Cruz, Yudivi\'{a}n and D\'{i}az-Galiano, Manuel C. and Est\'{e}vez-Velarde, Suilan and Garc\'{i}a-Cumbreras, Miguel \'{A}. and Garc\'{i}a-Vega, Manuel and Guti\'{e}rrez, Yoan and Montejo R\'{a}ez, Arturo and Montoyo, Andr\'{e}s and Mu\~{n}oz, Rafael and Piad-Morffis, Alejandro and Villena-Rom\'{a}n, Julio",
  title = "Overview of TASS 2018: Opinions, Health and Emotions",
  booktitle = "Proceedings of TASS 2018: Workshop on Semantic Analysis at SEPLN (TASS 2018)",
  editor = "Mart\'{i}nez-C\'{a}mara, Eugenio and Almeida Cruz, Yudivi\'{a}n and D\'{i}az-Galiano, Manuel C. and Est\'{e}vez Velarde, Suilan and Garc\'{i}a-Cumbreras, Miguel \'{A}. and Garc\'{i}a-Vega, Manuel and Guti\'{e}rrez V\'{a}zquez, Yoan and Montejo R\'{a}ez, Arturo and Montoyo Guijarro, Andr\'{e}s and Mu\~{n}oz Guillena, Rafael and Piad Morffis, Alejandro and Villena-Rom\'{a}n, Julio",
  volume = "",
  series = "CEUR Workshop Proceedings",
  pages = "1-X",
  address = "Sevilla, Spain",
  publisher = "CEUR-WS",
  year = "2018",
  month = "September",
}
Because of the CEUR rules for assigning volume numbers, we kindly ask you to update the reference to the "Overview" paper before submitting the camera-ready version.
Send your paper to the email address tass-sepln@googlegroups.com. If you have any problems, please let us know (emcamara@decsai.ugr.es).
Program
To be announced.
Presentation instructions
To be announced.
Important dates
Release of training and development corpora
May 2, 2018
Release of test corpora
June 25, 2018
Deadline for evaluation
June 27, 2018
Deadline for evaluation (extended)
July 3, 2018
Paper submission
July 16, 2018
Paper submission (extended)
July 24, 2018
Review notification
August 7, 2018
Camera ready submission
September 5, 2018
Publication
September 17, 2018
Workshop
September 18, 2018
Organization
References
[1] Scott, William A. 1955. Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly, 19(3):321–325.
[2] Cohen, Jacob. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1):37–46.
[3] Landis, J. Richard and Gary G. Koch. 1977. The measurement of observer agreement for categorical data. Biometrics, 33(1):159–174.