12-18 May 2019
Split, Croatia
Europe/Zagreb timezone

Academic programme

High Throughput Distributed Processing of Future HEP Data


  • The challenges of HEP data processing in the post-upgrade scenarios.
  • Scientific software as the key to achieving the deliverables of the (HL-)LHC Physics Programme.
  • Parallelism, performance and programming models for exploitation of resources on a single box or on a cluster.
  • The central role of data management, input and output.
  • Evolution of hardware and platforms and their requirements on data analysis and tools.

Track 1: Technologies and Platforms

(4h lectures + 4h exercises)

"Introduction to efficient computing" by Andrzej Nowak

  • The evolution of computing hardware and what it means in practice
  • The seven dimensions of performance
  • Controlling and benchmarking your computer and software
  • Software that scales with the hardware
  • Advanced performance tuning in hardware

"Intermediate concepts in efficient computing" by Andrzej Nowak

  • Memory architectures, hardware caching and NUMA
  • Scaling out: Big Data – Big Hardware
  • The role of compilers and VMs
  • A brief look at accelerators and heterogeneity

"Data-oriented design" by Andrzej Nowak

  • Hardware vectorization in detail – theory vs. practice
  • Software design for vectorization and smooth data flow
  • How can compilers and other tools help?

"Summary and future technologies overview" by Andrzej Nowak

  • Teaching programme summary and wrap-up
  • Next-generation memory technologies and interconnect
  • Rack-sized data centres and future computing evolution
  • Software technologies – forecasts

Track 2: Parallel and Optimised Scientific Software Development

(6h lectures + 6h exercises)

"Computational challenges of run III and HL-LHC" by Danilo Piparo

  • HEP data processing: from acquisition to analysis
  • The upgrades of the LHC detectors and of the accelerators
  • Upgrades: challenges of the new dataset and implications for scientific software
  • Commonalities and differences with other disciplines such as genomics, plasma physics, astronomy

"Scientific programming: a modern approach" by Danilo Piparo

  • Introduction: Amdahl's law, performance and correctness of codebases
  • Modern C++: new constructs, their advantages
  • Exploit modern architectures using Python
  • Near the hardware: the role of compilers
  • Understanding the differences and commonalities of data structures, metrics for their classification, concrete examples

"Expressing parallelism pragmatically" by Danilo Piparo

  • Trivial asynchronous execution
  • Task and data decomposition
  • Threads and the thread pool model
  • In-depth comparison of threads and processes; guidelines for choosing the best option

"Protection of resources and thread safety" by Danilo Piparo

  • The problem of synchronization
  • Useful design principles
  • Replication, atomics, transactions and locks
  • Lock-free programming techniques
  • Functional programming style and elements of map-reduce
  • Third-party libraries and high-level solutions

"Optimizing existing large codebase" by Sebastien Ponce

  • How to measure performance: key indicators, tools and their pros and cons
  • The nightmare of thread safety
  • Data structures for performant computation in modern C++

"Practical vectorization" by Sebastien Ponce

  • Measuring vectorization level
  • What to expect from vectorization
  • Preparing code for vectorization
  • Vectorizing techniques in C++: intrinsics, libraries, autovectorization

Track 3: Effective I/O for Scientific Applications

(2h lectures + 2h exercises)

"Data storage and preservation" by Sebastien Ponce

  • Storage devices and their specificities
  • Risks of data loss and corruption
  • Data safety (redundancy, parity, erasure coding)

"Key ingredients to achieve effective I/O" by Sebastien Ponce

  • Asynchronous I/O
  • I/O optimizations
  • Caching
  • Influence of data structures on I/O efficiency