EP-IT Data Science Seminars

Entropy is all you need? The quest for best tokens and the new physics of LLMs

by Dr Pierre-Carl Langlais (GRIPIC* - CELSA SORBONNE UNIVERSITY)

40/S2-D01 - Salle Dirac (CERN)

Description
LLMs currently generate text one token at a time with fixed hyperparameters. In September 2024, OpenAI released o1, a largely undocumented model that relies on expanded pre-generation strategies to enhance the quality of even small models ("inference scaling"). Since then, two anonymous researchers have publicly released a new method of token selection: Entropix leverages entropy as a metric of token uncertainty to constrain the model to explore different generation strategies. This seminar will present an emerging field of post-training research that holds the promise of significantly enhancing the quality of very small models (including a 300-million-parameter model we adapted to Entropix). We will also discuss the even more speculative impact of this research direction on pretraining, by letting the model adapt different forms of training or optimizer strategies on the fly.
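
The sketch below (Python with NumPy) illustrates the general idea of entropy-gated token selection mentioned in the abstract: compute the entropy of the next-token distribution and switch sampling behaviour according to how uncertain the model is. It is not the released Entropix code; the thresholds, the temperature value, and the branching logic are illustrative assumptions only.

import numpy as np

def softmax(logits):
    # Convert raw logits to a probability distribution.
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

def entropy(probs):
    # Shannon entropy in nats; higher values mean the model is less certain.
    return -np.sum(probs * np.log(probs + 1e-12))

def select_token(logits, low=0.5, high=3.0, rng=None):
    # Pick the next token based on the uncertainty of the distribution.
    # "low" and "high" are illustrative entropy thresholds, not Entropix values.
    rng = rng or np.random.default_rng()
    probs = softmax(logits)
    h = entropy(probs)
    if h < low:
        # Confident prediction: take the argmax token.
        return int(np.argmax(probs))
    if h > high:
        # Very uncertain: reshape the distribution with a higher temperature to
        # explore (Entropix-style pipelines may instead branch or insert
        # extra "thinking" tokens here).
        probs = softmax(logits / 1.5)
    # Moderate uncertainty: sample from the (possibly reshaped) distribution.
    return int(rng.choice(len(probs), p=probs))

# Example: a toy 5-token vocabulary with one dominant logit gives low entropy,
# so the confident branch returns the argmax token (index 0).
print(select_token(np.array([5.0, 0.1, 0.1, 0.1, 0.1])))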
 
SPEAKER'S BIO:

Dr. Pierre-Carl Langlais is a researcher in artificial intelligence and co-founder of Pleias, a French lab specialised in training Small Language Models for document processing and other intensive corporate use cases. A long-time advocate for open science, Pierre-Carl is an administrator on Wikipedia and has co-written an influential report for the European Commission on non-commercial open access publishing. In 2024, he coordinated the release of Common Corpus, the largest open dataset available for the training of large language models.

(*) Groupe de Recherches Interdisciplinaires sur les Processus d’Information et de Communication
 

Coffee will be served at 10:30.

Organized by

M. Girone, M. Elsing, L. Moneta, M. Pierini

Webcast
There is a live webcast for this event