8–12 Sept 2025
Hamburg, Germany
Europe/Berlin timezone

CLUEstering: a novel high-performance clustering library for scientific computing

8 Sept 2025, 17:00
20m
ESA M

Oral
Track 1: Computing Technology for Physics Research

Speaker

Simone Balducci (Universita e INFN, Bologna (IT))

Description

CLUEstering is a versatile clustering library based on CLUE, a density-based weighted clustering algorithm optimized for high-performance computing that supports clustering in an arbitrary number of dimensions. The library offers a user-friendly Python interface and a C++ backend to maximize performance. CLUE's parallel design is tailored to exploit modern hardware accelerators, enabling it to process large-scale datasets with strong scalability and speed.
To ensure performance portability across diverse architectures, the backend is implemented using alpaka, a C++ performance portability library that enables near-native performance on a wide range of accelerators with minimal code duplication. CLUEstering's combination of density-based and weighted clustering sets it apart from popular clustering algorithms, many of which lack built-in support for this combination.
This work will show comprehensive clustering results and performance benchmarks against other state-of-the-art algorithms.
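
To make the idea of density-based weighted clustering concrete, the following is a minimal brute-force sketch in Python with NumPy. It is not the CLUEstering API and not the library's parallel implementation. The function name `clue_sketch` and the parameters `dc`, `rhoc` and `dm` are hypothetical names chosen for this illustration, and the single threshold `dm` is a simplification of the separate seed and outlier thresholds described in the reference below. Each point's local density is the sum of the weights of its neighbours within `dc`. Dense, well-separated points become seeds, and every other point follows its nearest higher-density neighbour.

```python
import numpy as np

def clue_sketch(points, weights, dc, rhoc, dm):
    """Simplified O(n^2) illustration of density-based weighted clustering.

    dc   : cutoff distance used to accumulate the local density
    rhoc : minimum weighted density for a point to become a seed
    dm   : minimum separation for a seed, and maximum distance at which
           a point may follow its nearest higher-density neighbour
    Returns one integer label per point; -1 marks an outlier.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = len(points)
    idx = np.arange(n)

    # Pairwise Euclidean distances (works in any number of dimensions).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # Weighted local density: sum of the weights within dc (self included).
    rho = (weights[None, :] * (dist <= dc)).sum(axis=1)

    # Nearest point with higher density; ties are broken by index.
    nearest_higher = np.full(n, -1)
    delta = np.full(n, np.inf)
    for i in range(n):
        higher = (rho > rho[i]) | ((rho == rho[i]) & (idx < i))
        if higher.any():
            candidates = np.flatnonzero(higher)
            j = candidates[np.argmin(dist[i, candidates])]
            nearest_higher[i] = j
            delta[i] = dist[i, j]

    # Seeds are dense points far from any denser point.
    is_seed = (rho >= rhoc) & (delta > dm)
    labels = np.full(n, -1)
    labels[is_seed] = np.arange(is_seed.sum())

    # Visit points by decreasing density so that each follower's
    # nearest higher-density neighbour has already been labelled.
    for i in sorted(range(n), key=lambda k: (-rho[k], k)):
        if not is_seed[i] and delta[i] <= dm:
            labels[i] = labels[nearest_higher[i]]
    return labels

# Two compact groups and one isolated point, all with unit weight.
points = [[0, 0], [0.1, 0], [0, 0.1],
          [5, 5], [5.1, 5], [5, 5.1],
          [20, 20]]
weights = [1, 1, 1, 1, 1, 1, 1]
labels = clue_sketch(points, weights, dc=0.5, rhoc=2.0, dm=1.0)
print(labels.tolist())
```

In this toy input the two compact groups each form a cluster and the isolated point is left unassigned. Changing the weights changes the densities, and with them which points qualify as seeds, which is where a weighted algorithm differs from a purely geometric one.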

References

https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2020.591315/full

Significance

This work presents a new clustering library that combines density-based and weighted clustering, opening up new possibilities for clustering applications. The library is based on a highly parallel algorithm that supports clustering in an arbitrary number of dimensions, and it is implemented using a performance portability library that makes it possible to leverage new types of accelerators with minimal code duplication.

Experiment context, if any

The work is related to the CMS experiment

Author

Simone Balducci (Universita e INFN, Bologna (IT))

Co-authors

Aurora Perego (Universita & INFN, Milano-Bicocca (IT))
Felice Pantaleo (CERN)
Francesco Giacomini (INFN CNAF)
Marco Rovere (CERN)
Wahid Redjeb (Rheinisch Westfaelische Tech. Hoch. (DE))
