Kacper Surdy (CERN)
Public and private clouds based on virtual machines (VMs) are a modern approach to deploying computing resources. Virtualisation of computer hardware allows additional optimisations in the utilisation of computing resources compared with the traditional bare-metal deployment model. The price to pay for running virtual machines on physical hypervisors is an additional overhead. This is an area of concern in the context of high-throughput computing and big-data analytics, where distributed data processing frameworks typically push hardware capabilities to their limits. This presentation reports on our tests of, and experience with, Hadoop components running on fully virtualised hardware using the CERN OpenStack infrastructure. The pros and cons of running Hadoop on VMs versus physical machines will be discussed, as well as performance aspects of running CERN data analytics workloads on a virtual stack.
Length of presentation (minutes, max. 20): 15