4th Thematic CERN School of Computing

Name: 4th Thematic CERN School of Computing
Start: 2016-05-22T15:00:00+02:00
End: 2016-05-28T14:00:00+02:00
Location: MEDILS in Split, Croatia

22–28 May 2016

Europe/Zurich timezone

Support

computing.school@cern.ch

Scientific Program

The challenge of scientific data processing: commonalities, analogies and main differences across the sciences.
Size of scientific software projects.
Parallelism and asynchronism: computation and I/O.

Technologies & Platforms
  Introduction to Efficient Computing
    The evolution of computing hardware and what it means in practice
    The seven dimensions of performance
    Controlling and benchmarking your computer and software
    Software that scales with the hardware
    Advanced performance tuning in hardware
  Intermediate Concepts in Efficient Computing
    Memory architectures, hardware caching and NUMA
    Scaling out: Big Data – Big Hardware
    The role of compilers and VMs
    A brief look at accelerators and heterogeneity
  Data Oriented Design
    Hardware vectorization in detail – theory vs. practice
    Software design for vectorization and smooth data flow
    How can compilers and other tools help?
  Summary and Future Technologies Overview
    Teaching program summary and wrap-up
    Next-generation memory technologies and interconnect
    Rack-sized datacenters and future computing evolution
    Software technologies – forecasts

Programming for concurrency and correctness
  Scientific programming in C++: a modern approach
    Introduction: Amdahl's law, performance and correctness of codebases
    A modern C++: new constructs and their advantages
    Gaining runtime performance by leaning on modern compilers
    Understanding the differences and commonalities of data structures, metrics for their classification, concrete examples
  Expressing Parallelism Pragmatically
    Trivial asynchronous execution
    Task and data decomposition
    Threads and the thread pool model
    In-depth comparison of threads and processes, guidelines for choosing the best option
    Message passing approach and analogies with object orientation
  Protection of Resources and Thread Safety
    The problem of synchronization
    Useful design principles
    Replication, atomics, transactions and locks
    Lock-free programming techniques
    Functional programming style and elements of map-reduce
    Third-party libraries and high-level solutions
  Ensure Correctness of a Parallel Scientific Application
    Correctness and reproducibility of a scientific result
    Stability of results and testing: regression, physics performance, tradeoffs
    Enforcing the avoidance of thread-unsafe constructs: focus on static analysis
    Algorithms for detecting synchronisation pathologies: focus on the DRD and Helgrind tools
    Elements of the GNU debugger: introduction and specific usage in the multithreaded case

Effective I/O for Scientific Applications
  Structuring data for efficient I/O
    Pros and cons of row, column and mixed formats
    Compression and how its efficiency depends on variable types, impact of data format
    Data addressing: limitations of the hierarchical approach, usage of flat namespaces
    Stateful vs. stateless interfaces for namespaces and I/O
  Many ways to store data
    Storage devices and their specificities
    Data federation
    Parallelizing file storage
    Map/Reduce
  Preserving Data
    Risks of data loss and corruption
    Data consistency (checksumming)
    Data safety (redundancy, parity, erasure coding)
  Key Ingredients to achieve effective I/O
    Asynchronous I/O
    I/O optimizations
    Caching
    Influence of data structures on I/O efficiency