24–27 Mar 2025
CERN
Europe/Zurich timezone

Exercise: LLMs in Production: RAG pipelines and beyond

25 Mar 2025, 16:45
1h
513-1-024

Speaker

Jack Charlie Munday (CERN)

Description

Running Large Language Models (LLMs) in production presents many complexities that extend far beyond the choice of model. Key challenges include:

  • How do you address knowledge staleness (i.e. the model being trained on out-of-date or irrelevant information)?
  • How do you balance cost optimisation with model latency?
  • How do you reduce bias and factual hallucinations?

A widely adopted approach to addressing these challenges is Retrieval-Augmented Generation (RAG).

RAG pipelines implement a two-stage approach (retrieval and generation), supplying models with domain-specific information before they generate a response to a question. Through these techniques, "off-the-shelf" LLMs can be applied to a much wider domain context than the one they were originally trained for.
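
As a rough illustration (a minimal sketch, not part of the lecture materials), the two stages can be expressed as a single retrieve-then-generate function; the embed, vector_store and llm objects below are hypothetical stand-ins for an embedding model, a vector database and an LLM client:

    # Minimal RAG sketch: retrieve domain-specific context, then generate.
    # `embed`, `vector_store` and `llm` are hypothetical placeholders for an
    # embedding model, a vector database and an LLM API client.
    def rag_answer(question: str, embed, vector_store, llm, top_k: int = 3) -> str:
        # Retrieval: embed the question and fetch the most similar documents.
        query_vector = embed(question)
        documents = vector_store.search(query_vector, top_k=top_k)

        # Generation: prepend the retrieved context to the prompt so the model
        # answers from current, domain-specific information rather than only
        # from whatever was in its training data.
        context = "\n\n".join(doc.text for doc in documents)
        prompt = (
            "Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}"
        )
        return llm.generate(prompt)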

In this lecture, we will explore how to improve the adaptability of LLMs without fine-tuning, covering RAG and related architectures, entropy-based approaches such as entropix that enable self-reasoning and context-aware sampling, and the challenges of applying these techniques in a production context.
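
As a loose sketch of the sampling idea (an illustration under assumed, arbitrary thresholds, not the entropix implementation), the entropy of the next-token distribution can be used to switch between greedy decoding, plain sampling and a higher-temperature exploration mode:

    import numpy as np

    def entropy_aware_sample(logits: np.ndarray, low: float = 0.5, high: float = 3.0) -> int:
        # Convert logits to a probability distribution (softmax).
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()

        # Shannon entropy (in nats) of the next-token distribution.
        entropy = -np.sum(probs * np.log(probs + 1e-12))

        if entropy < low:
            # Confident prediction: take the most likely token (greedy).
            return int(np.argmax(probs))
        if entropy > high:
            # Highly uncertain: sample with a higher temperature; a fuller
            # pipeline might instead insert a reasoning step or retrieve more context.
            hot = np.exp((logits - logits.max()) / 1.5)
            hot /= hot.sum()
            return int(np.random.choice(len(hot), p=hot))
        # Moderate uncertainty: sample directly from the distribution.
        return int(np.random.choice(len(probs), p=probs))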

Number of lecture hours: 1
Number of exercise hours: 0 (no exercises)
Attended school: tCSC 2023 (Split)

Author

Jack Charlie Munday (CERN)

Presentation materials