Abstract
I/O bottlenecks are often the hidden performance killers in high-energy physics (HEP) data analysis, yet many scientists lack a deep understanding of how storage systems actually work. This lecture demystifies the complete I/O stack, from application code to physical storage, and introduces both classical and modern techniques for optimal performance. We’ll explore fundamental concepts (buffered I/O, direct I/O, mmap), diagnose real bottlenecks using Linux tools, and cover network I/O patterns critical for Grid jobs. The session culminates with modern asynchronous I/O APIs such as io_uring that are revolutionizing high-performance computing. Participants will leave with practical skills to optimize their analysis workflows and a solid foundation for future learning.
Topics
Fundamentals and Diagnosis
- I/O stack and core concepts
- Diagnosing I/O performance problems
- Optimization strategies for local storage
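As a rough illustration of two of the core concepts named in the abstract, the sketch below reads the same file once through ordinary buffered `read()` calls and once through a memory mapping. It is not taken from the lecture material: the function names, the 64 KiB chunk size, and the sample file are assumptions made for this example. Direct I/O is left out because `O_DIRECT` is Linux-specific and imposes buffer alignment requirements that would obscure the comparison.

```python
import mmap
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # assumed chunk size, chosen only for illustration


def make_sample_file():
    """Create a 256 KiB temporary file with a known byte pattern (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(256)) * 1024)
    return path


def sum_buffered(path, chunk_size=CHUNK_SIZE):
    """Sum every byte using buffered read() calls that copy data into user space."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += sum(chunk)
    return total


def sum_mmap(path, chunk_size=CHUNK_SIZE):
    """Sum every byte through a read-only memory mapping of the file."""
    total = 0
    with open(path, "rb") as f:
        # Length 0 maps the whole file; this fails on an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm), chunk_size):
                # Slicing the mapping returns a bytes object for that range.
                total += sum(mm[offset:offset + chunk_size])
    return total


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        print("buffered:", sum_buffered(sample))
        print("mmap:    ", sum_mmap(sample))
    finally:
        os.remove(sample)
```

Both functions return the same result. Which one is faster depends on file size, access pattern, and the state of the page cache, so any comparison should be measured rather than assumed.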
Network I/O, Frameworks, and Modern APIs
- Network I/O for Grid Computing
- Modern Asynchronous I/O APIs
- Quick reference for common HEP scenarios
Hands-On Session (examples)
- Profile and diagnose I/O bottlenecks in analysis code
- Optimize ROOT analysis workflow
- Build simple async storage system with io_uring
| Number of lecture hours | 2 |
|---|---|
| Number of exercise hours | 1 |
| Attended school | tCSC 2024 Heterogeneous Architectures (Belgrade) |