EP-IT Data Science Seminars

STEAM Academy Seminar: The Secrets to Training World-Class LLMs

by Dr Lewis Tunstall (Hugging Face)

Europe/Zurich
500/1-001 - Main Auditorium (CERN)

Description

Abstract: What does it actually take to train a strong language model? Published research makes it look clean: strategic architecture choices, carefully curated data, enough compute, and a tidy set of ablations where every decision seems obvious in hindsight. But those reports are written with a fair bit of rosy retrospection. They don't capture the 2 a.m. debugging sessions, the loss spikes that appear from nowhere, or the subtle tensor-parallelism bug that quietly sabotages a run for days. The reality is messier and more iterative, and most of the decisions that actually mattered never make it into the final paper. In this lecture I'll walk through the nuts and bolts of training an LLM end to end, from pretraining over trillions of tokens to post-training with reinforcement learning. 

Bio: Lewis is a Machine Learning Engineer at Hugging Face, where he works on applying Transformers to automate business processes and solve MLOps challenges. Lewis has built ML applications for startups and enterprises in the domains of NLP, topological data analysis, and time series. In a previous life, Lewis was a theoretical physicist and a contributor to open-source projects.

This seminar is part of the CERN STEAM Academy Seminar Series. 

A networking cocktail will follow the seminar.

With the support of CERN's Next Generation Triggers Project.

 

Organised by

F. Pantaleo, A. Kravchenko, M. Girone, M. Elsing, L. Moneta, M. Pierini

Webcast
There is a live webcast for this event
Zoom Meeting ID
69075966602
Host
Felice Pantaleo