May 22 – 28, 2016
MEDILS in Split, Croatia

Scientific Program

  • The challenge of scientific data processing: commonalities, analogies and main differences among different sciences.
  • Size of scientific software projects.
  • Parallelism and asynchronism: computation and I/O.
Technologies & Platforms
  • Introduction to Efficient Computing
    • The evolution of computing hardware and what it means in practice
    • The seven dimensions of performance
    • Controlling and benchmarking your computer and software
    • Software that scales with the hardware
    • Advanced performance tuning in hardware
  • Intermediate Concepts in Efficient Computing
    • Memory architectures, hardware caching and NUMA
    • Scaling out: Big Data – Big Hardware
    • The role of compilers and VMs
    • A brief look at accelerators and heterogeneity
  • Data Oriented Design
    • Hardware vectorization in detail – theory vs. practice
    • Software design for vectorization and smooth data flow
    • How can compilers and other tools help?
  • Summary and Future Technologies Overview
    • Teaching program summary and wrap-up
    • Next-generation memory technologies and interconnect
    • Rack-sized datacenters and future computing evolution
    • Software technologies – forecasts
Programming for concurrency and correctness
  • Scientific programming in C++: a modern approach
    • Introduction: Amdahl's law, performance and correctness of codebases
    • Modern C++: new constructs and their advantages
    • Gaining runtime performance by relying on modern compilers
    • Data structures: differences and commonalities, metrics for their classification, concrete examples
  • Expressing Parallelism Pragmatically
    • Trivial asynchronous execution
    • Task and data decomposition
    • Threads and the thread pool model
    • In-depth comparison of threads and processes; guidelines for choosing the best option
    • Message passing approach and analogies with object orientation
  • Protection of Resources and Thread Safety
    • The problem of synchronization
    • Useful design principles
    • Replication, atomics, transactions and locks
    • Lock-free programming techniques
    • Functional programming style and elements of map-reduce
    • Third-party libraries and high-level solutions
  • Ensuring Correctness of a Parallel Scientific Application
    • Correctness and reproducibility of a scientific result
    • Stability of results and testing: regression, physics performance, tradeoffs
    • Enforcing the avoidance of thread-unsafe constructs: focus on static analysis
    • Algorithms for detecting synchronisation pathologies: focus on the DRD and Helgrind tools
    • Elements of the GNU debugger: introduction and specific usage in the multithreaded case
Effective I/O for Scientific Applications
  • Structuring data for efficient I/O
    • Pros and cons of row-wise, column-wise and mixed formats
    • Compression: how its efficiency depends on variable types; impact of the data format
    • Data addressing: limitations of the hierarchical approach, use of flat namespaces
    • Stateful vs stateless interfaces for namespaces and I/O
  • Many ways to store data
    • Storage devices and their specificities
    • Data federation
    • Parallelizing file storage
    • Map/Reduce
  • Preserving Data
    • Risks of data loss and corruption
    • Data consistency (checksumming)
    • Data safety (redundancy, parity, erasure coding)
  • Key Ingredients for Effective I/O
    • Asynchronous I/O
    • I/O optimizations
    • Caching
    • Influence of data structures on I/O efficiency