Alignment in language models (LLMs)

Date and time: 30/05/24, 5pm CET


Álvaro Barbero Jiménez, Instituto de Ingeniería del Conocimiento (IIC), Universidad Autónoma de Madrid


Large language models (LLMs) have been the state of the art in natural language processing for several years, although it was not until the arrival of ChatGPT that their use became massively popular. At the same time, ChatGPT has shown that aligning these models to the end user is key to their successful adoption: recent studies suggest that language models do not reason, in the strict sense of the word, but rather behave like "generative search engines", producing answers that interpolate between the training texts that most closely resemble the user's question. The success of products like ChatGPT, Bard or Claude owes much to the fine-tuning of these language models, which aligns their responses and style with what the end user expects.

In this talk we will review some of the evidence that LLMs behave as generative search engines, as well as the techniques used to align these models and improve their use in practice: instruction tuning, supervised fine-tuning, reinforcement learning from human feedback (RLHF), direct preference optimization (DPO) and red teaming.


Álvaro Barbero is the director of the Artificial Intelligence area at the Instituto de Ingeniería del Conocimiento (IIC). He holds an engineering degree (2006), a master's degree (2008) and a doctorate (2011) in Computer Engineering from the Universidad Autónoma de Madrid (UAM), specializing in machine learning. He has twice been a finalist in the Texata Big Data Analytics World Championships, and was also a finalist in the Spain AI NLP hackathon in 2020. From his position at the IIC he has participated in numerous artificial intelligence projects, ranging from fraud detection strategies and opinion analysis in social networks to demand prediction systems and stock management optimization. In the academic field, he collaborates with the Machine Learning Group of the UAM, is the author of more than 40 international publications and teaches several data science and machine learning courses. During his career he has collaborated with prestigious research centers such as the Max Planck Institute for Intelligent Systems, IBM Research Watson and the University of Tokyo.

Registration via Zoom is required:
