CHEP 2018 Conference, Sofia, Bulgaria

Name: CHEP 2018 Conference, Sofia, Bulgaria
Start: 2018-07-09T08:00:00+03:00
End: 2018-07-13T13:00:00+03:00
Location: Sofia, Bulgaria

9–13 Jul 2018

Europe/Sofia timezone

Advancing throughput of HEP analysis work-flows using caching concepts

9 Jul 2018, 14:30

15m

Hall 8 (National Palace of Culture)

Presentation | Track 4 - Data Handling | T4 - Data handling

Christoph Heidecker (KIT - Karlsruhe Institute of Technology (DE))

High throughput and short turnaround cycles are core requirements for the efficient processing of I/O-intensive end-user analyses. Combined with the rapidly growing amount of data to be processed, this poses enormous challenges for HEP storage systems, networks and data distribution to end-users. The situation is compounded when opportunistic resources without dedicated storage systems are considered as a possible extension of traditional HEP computing facilities for end-user analyses.

Enabling data locality via local caches on the processing units is a very promising approach to overcoming throughput limitations and ensuring short turnaround cycles for end-user analyses. Therefore, two different caching concepts have been studied at the Karlsruhe Institute of Technology. Both are transparently integrated into the HTCondor batch system, so that end-users do not need to make job-specific adaptations.

The first concept relies on coordinated caches on SSDs in the worker nodes. Data locality is taken into account by custom-developed components around the HTCondor batch system, which ensure that jobs are assigned to nodes holding their input data.
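
As an illustration of the locality-aware assignment described above, the following minimal Python sketch ranks worker nodes by the fraction of a job's input files they already hold in their cache. It is not the authors' implementation and does not use any HTCondor API; the `Node` and `Job` classes, the `locality_score` and `pick_node` functions, and the file and node names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical worker node with a set of locally cached file names."""
    name: str
    cached_files: set[str] = field(default_factory=set)


@dataclass
class Job:
    """Hypothetical job described only by the input files it reads."""
    name: str
    input_files: set[str]


def locality_score(job: Job, node: Node) -> float:
    """Fraction of the job's input files already cached on the node."""
    if not job.input_files:
        return 0.0
    return len(job.input_files & node.cached_files) / len(job.input_files)


def pick_node(job: Job, nodes: list[Node]) -> Node:
    """Return the node with the highest locality score (first node wins ties)."""
    return max(nodes, key=lambda node: locality_score(job, node))
```

For example, given `Node("wn1", {"a.root"})` and `Node("wn2", {"a.root", "b.root"})`, a job reading both `a.root` and `b.root` would be matched to `wn2`. A real scheduler would also have to weigh free slots, cache eviction and file sizes, which this sketch ignores.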

The second concept uses Ceph as a distributed file system acting as a system-wide cache. In this case, no data-locality-specific adjustments need to be applied to the HTCondor batch system. In combination with the XRootD caching and data locality plug-ins developed in this project, this approach is also very well suited to tackling bandwidth limitations on opportunistic resources such as HPC centers offering parallel file systems.
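
The idea of a system-wide cache can be pictured as a read-through cache on a directory that every worker node sees. The sketch below is a generic illustration under that assumption and is unrelated to the actual Ceph or XRootD plug-in code. The function `cached_path`, its `fetch` callback and the `.part` suffix are hypothetical, and `remote_name` is assumed to be a relative path.

```python
from pathlib import Path
from typing import Callable


def cached_path(remote_name: str, cache_dir: Path,
                fetch: Callable[[str, Path], None]) -> Path:
    """Return a local path for remote_name, fetching it only on a cache miss.

    cache_dir stands in for a directory on a shared file system, so a file
    fetched by one node is a cache hit for every other node.
    """
    local = cache_dir / remote_name  # remote_name must be a relative path
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        tmp = local.with_name(local.name + ".part")
        fetch(remote_name, tmp)  # caller-supplied transfer, e.g. a remote copy
        tmp.replace(local)       # rename so readers never see a partial file
    return local
```

Because the cache directory is shared, a job can run on any node and still hit the cache, which is why no locality-aware scheduling is needed in this picture. Concurrent fetches of the same file and cache eviction are deliberately left out.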

This talk presents an overview of the technologies used, the data locality concepts and the current status of the project.

Christoph Heidecker (KIT - Karlsruhe Institute of Technology (DE))

Gunter Quast (KIT - Karlsruhe Institute of Technology (DE)); Manuel Giffels (KIT - Karlsruhe Institute of Technology (DE)); Max Fischer (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE)); Matthias Jochen Schnepf (KIT - Karlsruhe Institute of Technology (DE)); Eileen Kuhn (KIT - Karlsruhe Institute of Technology (DE))

CHEP-2018-07-09-cheidecker.pdf
