24–27 Mar 2025
CERN
Europe/Zurich timezone
There is a live webcast for this event.

LLMs in Production: RAG pipelines and beyond

24 Mar 2025, 11:40
1h
31/3-004 - IT Amphitheatre (CERN)

Data Science, Machine Learning, and AI

Speaker

Jack Charlie Munday (CERN)

Description

Running Large Language Models (LLMs) in production presents many complexities that extend far beyond the choice of model. Key challenges include:

  • How do you address knowledge staleness (i.e. the model having been trained on outdated or irrelevant information)?
  • How do you balance cost optimisation with model latency?
  • How do you reduce bias and factual hallucinations?

A widely adopted approach to addressing these challenges is Retrieval-Augmented Generation (RAG).

RAG pipelines implement a two-tiered approach (Retrieval & Generation), in which models are supplied with domain-specific information before generating their response to a question. Through these techniques, off-the-shelf LLMs can be applied to a much wider range of domains than they were originally trained for.
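
As a flavour of the two tiers, the Python sketch below ranks candidate documents against a query and injects the best matches into the prompt. The embed() function is a toy bag-of-characters embedding standing in for a real embedding model, and the assembled prompt would then be sent to whichever LLM the deployment uses; it is an illustrative assumption, not a specific implementation covered in the lecture.

    import numpy as np

    def embed(text: str) -> np.ndarray:
        """Toy bag-of-characters embedding; a real pipeline would call an embedding model."""
        vec = np.zeros(256)
        for ch in text.lower():
            vec[ord(ch) % 256] += 1.0
        return vec / (np.linalg.norm(vec) + 1e-9)

    def retrieve(query: str, documents: list[str], k: int = 3) -> list[str]:
        """Retrieval tier: rank candidate documents by cosine similarity to the query."""
        q = embed(query)
        ranked = sorted(documents, key=lambda d: float(embed(d) @ q), reverse=True)
        return ranked[:k]

    def build_prompt(query: str, documents: list[str]) -> str:
        """Generation tier input: retrieved domain-specific context is placed before the question."""
        context = "\n\n".join(retrieve(query, documents))
        return (
            "Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {query}"
        )

    # The assembled prompt is then sent to an off-the-shelf LLM for generation.

In production the toy pieces are typically replaced by a dedicated vector database for the retrieval tier and an embedding model served alongside the LLM.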

In this lecture, we will explore how to improve the adaptability of LLMs without fine-tuning, covering RAG and related architectures, entropy-based approaches such as entropix that enable self-reasoning and context-aware sampling, and the challenges of applying these techniques in a production context.
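
To make the sampling idea concrete, the heavily simplified sketch below uses the entropy of the next-token distribution to decide whether to decode greedily or to sample at a higher temperature. The 0.5 nat threshold and the temperature rule are illustrative assumptions and do not reproduce the actual entropix algorithm.

    import numpy as np

    def entropy(probs: np.ndarray) -> float:
        """Shannon entropy (in nats) of the next-token distribution."""
        p = probs[probs > 0]
        return float(-(p * np.log(p)).sum())

    def adaptive_sample(logits: np.ndarray, rng: np.random.Generator) -> int:
        """Entropy-aware sampling, heavily simplified: decode greedily when the
        model is confident (low entropy), sample at a higher temperature when it
        is not. Thresholds are illustrative, not those used by entropix."""
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        h = entropy(probs)
        if h < 0.5:                       # confident: take the argmax token
            return int(np.argmax(probs))
        temp = 1.0 + min(h, 3.0)          # uncertain: flatten the distribution
        scaled = np.exp((logits - logits.max()) / temp)
        scaled /= scaled.sum()
        return int(rng.choice(len(scaled), p=scaled))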

Number of lecture hours: 1
Number of exercise hours: 0 (no exercises)
Attended school: tCSC 2023 (Split)

Author

Jack Charlie Munday (CERN)
