5th ICFA Beam Dynamics Mini-Workshop on Machine Learning for Particle Accelerators

Name: 5th ICFA Beam Dynamics Mini-Workshop on Machine Learning for Particle Accelerators
Start: 2025-04-08T08:00:00+02:00
End: 2025-04-11T19:30:00+02:00
Location: CERN


Europe/Zurich timezone

Workshop Support

ICFA-ML-2025@cern.ch

The Journey of Developing Specialized Text Embedding Models - 15'+5'

11 Apr 2025, 10:20

20m

80/1-001 - Globe of Science and Innovation - 1st Floor (CERN)


Esplanade des Particules 1, 1211 Meyrin, Switzerland


Invited talks: LLMs and AI Assistants

Speaker: Thorsten Hellert

The specialized terminology and complex concepts of physics pose significant challenges for Natural Language Processing (NLP), particularly for general-purpose models. In this talk, I will discuss the development of physics-specific text embedding models designed to overcome these obstacles, beginning with PhysBERT, the first model pre-trained exclusively on a curated corpus of 1.2 million arXiv physics papers. Building on this foundation, we turn to accelerator physics, a subfield with even more intricate language and concepts. To capture the nuances of this domain, we developed AccPhysBERT, a sentence embedding model fine-tuned specifically for accelerator physics literature. A key aspect of this development was the extensive use of Large Language Models (LLMs) to generate annotated training data, which enables AccPhysBERT to support advanced NLP applications such as semantic paper-reviewer matching and integration into Retrieval-Augmented Generation (RAG) systems.
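
The semantic paper-reviewer matching mentioned in the abstract can be illustrated with a minimal sketch: embed the submission and each reviewer profile, then rank reviewers by cosine similarity. This is not the AccPhysBERT pipeline. The function `toy_embed` is a hypothetical bag-of-words stand-in for a real sentence embedding model, and the reviewer names and profiles are invented for illustration.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a sentence embedding model.

    Returns a sparse bag-of-words count vector; a real system would
    return a dense vector from a trained model instead.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_reviewers(abstract: str, reviewers: dict[str, str]) -> list[tuple[str, float]]:
    """Rank reviewers by similarity of their profile text to the abstract."""
    query = toy_embed(abstract)
    scores = {name: cosine(query, toy_embed(profile))
              for name, profile in reviewers.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented example data, for illustration only.
reviewers = {
    "reviewer_a": "beam dynamics and lattice design for storage rings",
    "reviewer_b": "superconducting radio frequency cavity fabrication",
}
ranking = rank_reviewers(
    "lattice correction and beam dynamics in a storage ring", reviewers
)
```

Swapping `toy_embed` for a domain-specific embedding model would leave the ranking logic unchanged; only the vector representation and the similarity computation over dense vectors would differ.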

Author: Thorsten Hellert

Co-authors: Andrea Pollastro, João Montenegro (LBNL), Marco Venturini (LBNL)

Presentation materials: 250411_MALAPA.pdf
