Apr 18 – 22, 2016
DESY Zeuthen
Europe/Berlin timezone

Running virtualized Hadoop, does it make sense?

Apr 18, 2016, 4:10 PM
Seminar room 3 (DESY Zeuthen)

Platanenallee 6, 15738 Zeuthen (near Berlin), Germany
Storage & Filesystems


Kacper Surdy (CERN)


Public and private clouds based on virtual machines (VMs) are a modern approach to deploying computing resources. Virtualisation of computer hardware allows additional optimizations in the utilisation of computing resources compared to the traditional hardware deployment model. The price to pay when running virtual machines on physical hypervisors is additional overhead. This is a concern in high-throughput computing and big data analytics, where distributed data processing frameworks typically push hardware capabilities to their limit. This presentation reports on our tests and experience with Hadoop components running on fully virtualized hardware using the CERN OpenStack infrastructure. The pros and cons of running Hadoop on VMs versus physical machines will be discussed, as well as performance aspects of running CERN data analytics workloads on a virtual stack.
Length of presentation (minutes, max. 20): 15