28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)

Name: 28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)
Start: 2026-05-25T08:00:00+07:00
End: 2026-05-29T14:00:00+07:00
Location: Chulalongkorn University

Asia/Bangkok timezone

Towards Autonomous Computing Operations with AI Assistance for Belle II Experiment.

26 May 2026, 17:45

18m

MHMK 202

Oral Presentation, Track 4 - Distributed computing

Speaker: Mr Dhiraj Kalita (KEK (High Energy Accelerator Research Organization))

The Belle II experiment at KEK, Japan, manages a data volume exceeding 30 petabytes, with datasets distributed and processed worldwide using DIRAC and Rucio. With a globally distributed computing infrastructure and an expected order-of-magnitude increase in data volume, we face operational challenges for both computing experts and end users. End users frequently struggle with multiple issues (e.g. problems with job submission or locating relevant documentation), generating load on the experts who provide support.
This contribution reports on ongoing research and development of an intelligent, automated assistance system. The proposed system is designed to optimize experiment workflows, diagnose common failures, and provide continuous 24/7 monitoring to reduce service downtime and accelerate incident response. Our work leverages recent advances in open-source Large Language Models (LLMs) combined with Retrieval-Augmented Generation (RAG) to incorporate experiment-specific documentation, such as software guides, troubleshooting resources, and FAQs, for authoritative, context-aware assistance. In parallel, we explore AI agents for automated analysis of grid job logs, failure classification, and root-cause suggestion.
This research proposes a local LLM infrastructure that enhances privacy, security, and sustainability by keeping sensitive data internal. The self-contained deployment allows for task-specific fine-tuning, integration with Model Context Protocol (MCP) tools, and long-term cost control. The contribution details the prototype architecture, a preliminary evaluation, and a roadmap to improve Belle II experiment operations and user experience.
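
The retrieval step of a RAG pipeline, as mentioned in the abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the retriever, so the bag-of-words cosine similarity below stands in for whatever embedding model a real system would use, and the document snippets, the names `DOCS`, `retrieve`, and `build_prompt`, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical documentation snippets (illustrative only, not real Belle II docs).
DOCS = {
    "job-submission": "Submit grid jobs with the job submission client and check the job status page.",
    "data-location": "Use the data management client to list replicas and locate a dataset on the grid.",
    "proxy-renewal": "Renew an expired grid proxy certificate before you submit jobs again.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return the names of up to k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), name) for name, text in docs.items()]
    # Highest score first; ties broken by name so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


def build_prompt(query, docs, k=2):
    """Assemble an LLM prompt that places the retrieved snippets before the question."""
    context = "\n".join(f"[{name}] {docs[name]}" for name in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("my grid proxy certificate expired", DOCS, k=1))
```

The prompt returned by `build_prompt` would then be passed to a locally hosted LLM. A production retriever would more plausibly use dense embeddings and a vector index, but the flow of scoring documents, selecting the top matches, and grounding the question in them is the same.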

Authors: Cedric Serfon (Brookhaven National Laboratory (US)), Mr Dhiraj Kalita (KEK (High Energy Accelerator Research Organization)), I Ueda (KEK IPNS), Michel Hernandez Villanueva (Brookhaven National Laboratory (US))

Co-authors: Mr Paul Gebeline (University of Mississippi | Ole Miss), Quinn Campagna

Presentation materials: CHEP-2026-belle2-AI:LLM.pdf
