17–24 Jul 2024
Prague
Europe/Prague timezone

Machine Learning-based Data Compression

20 Jul 2024, 15:55
17m
Club A

Parallel session talk
14. Computing, AI and Data Handling
Computing and Data handling

Speaker

Axel Gallen (Uppsala University (SE))

Description

"Data deluge" refers to the situation where the sheer volume of new data generated overwhelms the capacity of institutions to manage it and researchers to use it. This is becoming a common problem in industry and big science facilities like the MAX IV laboratory and the LHC.

As a solution to this problem, a small collaboration of researchers has developed a machine learning-based data compression tool called "Baler". Baler allows researchers to design lossy compression algorithms tailored to their data sets via an easy-to-use pip package. This compression method yields substantial data reduction and can compress scientific data to 1% of its original size.
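
As a rough illustration of what learned, lossy compression to 1% of the original size means, the sketch below fits a small linear encoder/decoder to a data set and reports the size ratio and reconstruction error. It is not Baler's code or API: the linear model (PCA via NumPy) is a stand-in for a trained machine learning model, and the function names, the synthetic data, and the dimensions are all hypothetical.

```python
import numpy as np

def fit_linear_codec(data, latent_dim):
    """Fit a linear encoder/decoder (PCA) to the rows of `data`."""
    mean = data.mean(axis=0)
    # Right singular vectors are the directions of largest variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    basis = vt[:latent_dim]  # shape: (latent_dim, n_features)
    return mean, basis

def compress(data, mean, basis):
    """Project each record onto the learned basis (lossy step)."""
    return (data - mean) @ basis.T

def decompress(latent, mean, basis):
    """Map the latent values back to the original feature space."""
    return latent @ basis + mean

# Hypothetical data: 1000 records, 100 strongly correlated features.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(1000, 1))
mixing = rng.normal(size=(1, 100))
data = hidden @ mixing + 0.01 * rng.normal(size=(1000, 100))

mean, basis = fit_linear_codec(data, latent_dim=1)
latent = compress(data, mean, basis)
restored = decompress(latent, mean, basis)

# Per-record size ratio; storing the model (mean, basis) costs extra.
ratio = latent.size / data.size
rel_error = np.linalg.norm(data - restored) / np.linalg.norm(data)
print(f"compressed/original size: {ratio:.2%}")
print(f"relative reconstruction error: {rel_error:.4f}")
```

The ratio here is set entirely by the choice of 1 latent value per 100 features, and the small error only reflects how redundant this synthetic data is. Real data sets would need a model and latent size tuned to them, which is the tailoring step described above.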

Baler has recently been used to compress and decompress data on FPGAs, which extends Baler's reach into the field of bandwidth compression. This contribution will present an overview of the Baler software tool and results from particle physics, X-ray ptychography, computational fluid dynamics, and telecommunications.

Alternate track 14. Computing, AI and Data Handling

Primary authors

Axel Gallen (Uppsala University (SE))
Caterina Doglioni (University of Manchester (GB))
Per Alexander Ekman (Lund University (SE))
Pratik Jawahar (University of Manchester (UK - ATLAS))
