TASS 2013 @ SEPLN

Welcome to TASS 2013!

TASS is an experimental evaluation workshop for sentiment analysis and online reputation analysis focused on the Spanish language, organized as a satellite event of the annual SEPLN Conference. After a successful first edition in 2012, TASS 2013 will be held on September 20th, 2013 at the Universidad Complutense de Madrid, Madrid, Spain.

According to the Merriam-Webster dictionary, reputation is the overall quality or character of a given person or organization as seen or judged by people in general, or, in other words, the general recognition by other people of some characteristics or abilities of a given entity. Specifically, in business, reputation comprises the actions of a company and its internal stakeholders along with the perception that consumers have of the business. Reputation affects attitudes such as satisfaction, commitment and trust, and drives behavior such as loyalty and support. In turn, reputation analysis is the process of tracking, investigating and reporting an entity's actions and other entities' opinions about those actions. It takes many factors into account to estimate the market value of reputation. Reputation analysis has come into wide use as a major factor of competitiveness in the increasingly complex marketplace of personal and business relationships among people and companies.

Currently, market research is typically carried out through user surveys. However, the rise of social media such as blogs and social networks, and the increasing amount of user-generated content in the form of reviews, recommendations, ratings and any other form of opinion, have led to an emerging trend towards online reputation analysis. This analysis has two technological aspects: sentiment analysis and text classification (or categorization).

First, sentiment analysis is the application of natural language processing and text analytics to identify and extract subjective information from texts. It is the first step towards online reputation analysis and is becoming a promising topic in the field of marketing and customer relationship management, as social media and their associated word-of-mouth effect are turning out to be the most important source of information for companies about their customers' sentiments towards their brands and products.

Then, automatic text classification is used to determine the topic of the text from among a predefined set of categories or classes, so that the reputation level of the company can be assigned to different facets, axes or points of view of analysis.

Sentiment analysis is a major technological challenge. The task is so hard that even humans often disagree on the sentiment of a given text. The fact that issues that one individual finds acceptable or relevant may not be so for others, along with multilingual aspects, cultural factors and different contexts, makes it very hard to classify a text written in a natural language as expressing a positive or negative sentiment. And the shorter the text is, for example when analyzing Twitter messages or short comments on Facebook, the harder the task becomes.

On the other hand, text classification techniques, although they have been studied for longer, still need more research effort so that complex models with many categories can be built with less effort and with higher precision and recall. In addition, these models should work well with short texts and deal with specific text features that are present in social media messages (such as spelling mistakes, abbreviations, SMS language, etc.).

Within this context, the aim of TASS is to provide a forum for discussion and communication where the latest research work and developments in the field of sentiment analysis in social media, specifically focused on the Spanish language, can be shown and discussed by the scientific and business communities. The main objective is to promote the application of existing state-of-the-art algorithms and techniques, and the design of new ones, for the implementation of complex systems able to perform sentiment analysis and text classification on short text opinions extracted from social media messages (specifically Twitter) published by a series of representative personalities.

The challenge task is intended to provide a benchmark forum for comparing the latest approaches in these fields. In addition, with the creation and release of the fully tagged corpus, we aim to provide a benchmark dataset that enables researchers to compare their algorithms and systems.