Analytics WG meeting

Europe/Zurich
31/S-023 (CERN)

Participants:

Dirk, Christian, Raul, Luca, Luca, Vag, Rainer, Ulrich, Domenico, Sebastien, Eric, Marek

 

News:

- Hardware upgrade for the Hadoop cluster is planned

- We might also want to review which software and configuration are needed. Dirk will ask the experiments, since the requirements for IT are already reasonably well understood.

- We have a high-performance server from Techlab (32 cores, 512 GB memory) that can be used for analysis. We will also test a trial version of RStudio on this machine.

- Update on the work items:

- Streaming of EOS data to HDFS is underway; Dirk will recheck the current status.

- The Experiment-Dashboard data is now transferred to HDFS.

Examples in MapReduce and Spark:

- It might be interesting to have implementations of the same task in several languages, to get a rough idea of the performance impact.
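
As a rough illustration of what such a reference task could look like, here is a minimal single-process sketch in plain Python that follows the map/reduce pattern: filter records, aggregate per key, and build a histogram. It is not taken from the slides. The record layout (site, duration, status), the sample values and the names mapper, reducer and run are all hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample records: (site, job duration in seconds, exit status).
RECORDS = [
    ("site_a", 120.0, "ok"),
    ("site_b", 340.0, "ok"),
    ("site_a", 95.0, "failed"),
    ("site_b", 410.0, "ok"),
    ("site_a", 180.0, "ok"),
]


def mapper(record):
    """Filter and map: keep successful jobs and emit (site, duration) pairs."""
    site, duration, status = record
    if status == "ok":
        yield site, duration


def reducer(acc, pair):
    """Aggregate: accumulate (count, total duration) per site."""
    site, duration = pair
    count, total = acc.get(site, (0, 0.0))
    acc[site] = (count + 1, total + duration)
    return acc


def run(records, bin_width=100.0):
    """Run the filter, aggregate and histogram steps over all records."""
    pairs = [pair for record in records for pair in mapper(record)]
    totals = reduce(reducer, pairs, {})
    means = {site: total / count for site, (count, total) in totals.items()}
    # Histogram: count durations per fixed-width bin, keyed by the bin's lower edge.
    histogram = Counter(int(d // bin_width) * bin_width for _, d in pairs)
    return means, dict(histogram)


means, histogram = run(RECORDS)
```

Wrapping the call to run in time.perf_counter() measurements, and doing the same for an equivalent implementation in another language or framework, would give the kind of rough performance comparison mentioned above.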

Parallel R analysis:

- For Python, IPython with pandas would be a very close alternative to R.
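
To make the comparison concrete, the following is a small sketch of how typical R data-frame operations might look in pandas. It is not taken from the slides. The data, the column names and the bin edges are hypothetical, and the R equivalents in the comments are approximate.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame(
    {
        "site": ["site_a", "site_b", "site_a", "site_b", "site_a"],
        "duration": [120.0, 340.0, 95.0, 410.0, 180.0],
        "status": ["ok", "ok", "failed", "ok", "ok"],
    }
)

# Filter rows, roughly like subset(df, status == "ok") in R.
ok = df[df["status"] == "ok"]

# Aggregate per group, roughly like
# aggregate(duration ~ site, data = ok, FUN = mean) in R.
mean_duration = ok.groupby("site")["duration"].mean()

# Histogram counts over fixed, right-closed bins, roughly like
# hist(ok$duration, breaks = bins, plot = FALSE)$counts in R.
bins = [0, 100, 200, 300, 400, 500]
counts = pd.cut(ok["duration"], bins=bins).value_counts(sort=False)
```

In an IPython session, evaluating mean_duration or counts displays the result directly, much like inspecting an object at the R prompt.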

 

    • 14:00 14:05
      Minutes & News 5m
      Speaker: Dirk Duellmann (CERN)
    • 14:15 14:45
      Filtering, aggregating and histograms - a few complete examples with MR, Spark 30m
      Speakers: Luca Menichetti (CERN), Vag Motesnitsalis (Imperial College Sci., Tech. & Med. (GB))
      Slides
    • 14:45 15:05
      Parallel R analysis - some simple examples 20m
      Speaker: Dirk Duellmann (CERN)
      Slides