Speaker
Description
Since 2015, LHCb’s central onboarding resource for new collaborators has been the Starterkit, a set of self-study lessons that also form the basis of an annual in-person workshop in Geneva. Ahead of Run 3 (2022–2026), a new version of the Starterkit was developed to accompany the Upgrade I software stack, with improved testing and updated exercises now used in the workshop.
However, participation in the Geneva workshop is difficult for many Chinese collaborators due to distance, cost, and administrative constraints. To address this, the Starterkit content has been translated into Mandarin Chinese to support a corresponding workshop hosted within China. This is particularly important because China represents the largest group of LHCb collaborators outside Europe.
Initial translations were produced using LLaMA-derived LLMs trained on LHCb documentation, then refined by students and reviewed by academics before being published on the website. Since then, updates to either language version have been synchronised by a dedicated liaison, who uses the same LLM tools to keep the English and Chinese versions consistent.
To measure the popularity of the translation, basic analytics have been integrated into the lessons; these show that approximately 25% of all Starterkit users access the Chinese version. This presentation discusses the experience of producing the translation, as well as the long-term maintenance needed to keep the two versions synchronised.