Hops Hadoop and Q&A with Visiting Guest Speaker Jim Dowling

513/1-024 (CERN)



Videoconference Rooms
Analytics WG meeting
Luca Canali
    • 15:00 16:00
      Hops Hadoop, Hopsworks and Q&A with Guest Speaker 1h

      This session follows up on the morning Computing seminar; see https://indico.cern.ch/event/716743/

      Hops is a drop-in replacement for Hadoop that can scale the Hadoop Distributed File System (HDFS) to over 1 million ops/s by migrating the NameNode metadata to an external scale-out in-memory database. This talk will introduce recent improvements in HopsFS: storing small files in the database (in both in-memory and SSD-backed on-disk tables), a new scalable block-reporting protocol, support for erasure coding with data locality, and work on multi-data-center replication. For small files (under 64-128 KB), HopsFS can reduce read latency to under 10 ms, while also improving read throughput by 3-4X and write throughput by more than 15X. Our new block-reporting protocol reduces block-reporting traffic by up to 99% for large clusters, at the cost of a small increase in metadata. Our solution for erasure coding is implemented at the block level, which preserves data locality. Finally, our ongoing work on geographic replication points a way forward for HDFS in the cloud, providing data-center-level high availability without any performance hit.
      One novel aspect of Hops we will discuss is its use of TLS certificates as an alternative authentication/authorization mechanism to Kerberos. Apart from the improved scalability of certificate managers compared to the Kerberos KDC, certificates offer the ability to support multi-tenancy and easier integration with devices/clients in external administrative domains. Finally, we will discuss operational support for Hops, and how it supports new features such as Anaconda, Spark, Hive, and TensorFlow.

      Speaker: Dr Jim Dowling (KTH Royal Institute of Technology in Stockholm)
    • 16:00 16:30
      Q&A 30m

      Q&A on the topics presented in this session and in the morning Computing seminar
