28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)

Name: 28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)
Start: 2026-05-25T08:00:00+07:00
End: 2026-05-29T14:00:00+07:00
Location: Chulalongkorn University

Asia/Bangkok timezone

Enabling Accessibility to CERN audiovisual content via Automated Speech Recognition

Not scheduled

18m

Poster Presentation, Track 8 - Analysis infrastructure, outreach and education

Speaker: Pablo Saiz (CERN)

Diversity awareness requires that we provide subtitles for all CERN-made multimedia content and make that content fully searchable, addressing in particular the needs of persons with impairments and speakers of foreign languages. The goal of the “Transcription and Translation as a Service” (TTaaS) software [1] is to deliver a performant, privacy-preserving and cost-efficient Automated Speech Recognition and translation system for existing and newly created audiovisual content such as videos and webcasts.

We will start by listing the requirements specific to the CERN context and describing the corpus of audiovisual material targeted by this project. We will explain the metrics for accuracy and performance that were used during an initial product survey. We will then describe the identified solution and its core components, powered by technology developed by the MLLP group [2] of the Universitat Politècnica de València. We will furthermore detail how the MLLP language models are trained with CERN content containing a large variety of accents and technical terms, and how this allows them to outperform other systems. We will also summarise the feedback we received from different stakeholders in the community. Finally, we will outline how the language models will evolve in the future, and describe the current and next steps for integrating this new system into CERN’s ecosystem of collaborative tools.

[1] https://ttaas.docs.cern.ch/
[2] https://www.mllp.upv.es/

Author: Ruben Domingo Gaspar Aparicio (CERN)
