EP-IT Data Science Seminars

Entropy is all you need? The quest for best tokens and the new physics of LLMs

by Dr Pierre-Carl Langlais (GRIPIC* - CELSA SORBONNE UNIVERSITY)

40/S2-D01 - Salle Dirac (CERN)

Description
LLMs currently generate text one token at a time with fixed hyperparameters. In September 2024, OpenAI released o1, a largely undocumented model that relies on expanded pre-generation strategies to enhance the quality of even small models ("inference scaling"). Since then, two anonymous researchers have publicly released a new method of token selection: Entropix leverages entropy as a metric of token uncertainty to constrain the model to explore different generation strategies. This seminar will present an emerging field of post-training research that holds the promise of significantly enhancing the quality of very small models (including a 300-million-parameter model we adapted to Entropix). We will also discuss the even more speculative impact of this research direction on pretraining, by letting the model adapt different forms of training or optimizer strategies on the fly.
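
The sketch below (Python with NumPy) illustrates the general idea of entropy-gated token selection mentioned in the abstract: compute the entropy of the next-token distribution and switch sampling behaviour according to how uncertain the model is. It is not the released Entropix code; the thresholds, the temperature value, and the branching logic are illustrative assumptions only.

import numpy as np

def softmax(logits):
    # Convert raw logits to a probability distribution.
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

def entropy(probs):
    # Shannon entropy in nats; higher values mean the model is less certain.
    return -np.sum(probs * np.log(probs + 1e-12))

def select_token(logits, low=0.5, high=3.0, rng=None):
    # Pick the next token based on the uncertainty of the distribution.
    # "low" and "high" are illustrative entropy thresholds, not Entropix values.
    rng = rng or np.random.default_rng()
    probs = softmax(logits)
    h = entropy(probs)
    if h < low:
        # Confident prediction: take the argmax token.
        return int(np.argmax(probs))
    if h > high:
        # Very uncertain: reshape the distribution with a higher temperature to
        # explore (Entropix-style pipelines may instead branch or insert
        # extra "thinking" tokens here).
        probs = softmax(logits / 1.5)
    # Moderate uncertainty: sample from the (possibly reshaped) distribution.
    return int(rng.choice(len(probs), p=probs))

# Example: a toy 5-token vocabulary with one dominant logit gives low entropy,
# so the confident branch returns the argmax token (index 0).
print(select_token(np.array([5.0, 0.1, 0.1, 0.1, 0.1])))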
 
SPEAKER'S BIO:

Dr. Pierre-Carl Langlais is a researcher in artificial intelligence and co-founder of Pleias, a French lab specialised in training Small Language Models for document processing and other intensive corporate use cases. A long-time advocate for open science, Pierre-Carl is an administrator on Wikipedia and has co-written an influential report for the European Commission on non-commercial open access publishing. In 2024, he coordinated the release of Common Corpus, the largest open dataset available for the training of large language models.

(*) Groupe de Recherches Interdisciplinaires sur les Processus d’Information et de Communication
 

Coffee will be served at 10:30.

Organized by

M. Girone, M. Elsing, L. Moneta, M. Pierini

Webcast
There is a live webcast for this event