Analytics WG meeting
Participants:
Dirk, Christian, Raul, Luca, Luca, Vag, Rainer, Ulrich, Domenico, Sebastien, Eric, Marek
News:
- A hardware upgrade for the Hadoop cluster is planned.
- We might also want to review what software and configuration are needed. Dirk will ask the experiments for their requirements, since the requirements for IT are already reasonably well understood.
- We have a high-performance server from Techlab (32 cores, 512 GB memory) that can be used for analysis. We will also test a trial version of RStudio on this machine.
- Update on the work items:
  - Streaming of EOS data to HDFS is underway; Dirk will recheck the current status.
  - The Experiment-Dashboard data is now transferred to HDFS.
Examples in MapReduce and Spark:
- It might be interesting to have implementations of the same task in several languages, to get a rough idea of the performance impact of the language choice.
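
As a possible starting point for such a comparison, here is a minimal sketch of a word count written in MapReduce style in plain Python, with a small timing helper. The task (word count), the function names and the sample input are assumptions chosen for illustration and were not discussed in the meeting; the same three phases could be reimplemented in another language or framework and timed on the same input.

```python
import time
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence on an iterable of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Hypothetical sample input; a real comparison would use a much larger data set.
sample = ["hadoop spark hadoop", "spark r python", "hadoop"]
counts, seconds = timed(word_count, sample)
print(counts)
print(f"elapsed: {seconds:.6f} s")
```

A single timing on a tiny input says little; for a rough comparison the same input should be large enough that the run time dominates start-up overhead, and each implementation should be run several times.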
Parallel R analysis:
- For Python, IPython with pandas would be a very close alternative to R.
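
To illustrate how close the two are, here is a small sketch of a grouped summary in pandas, roughly what one would write in R with aggregate() on a data frame. The column names and values are made up for illustration and do not come from any real data set.

```python
import pandas as pd

# Hypothetical job records; the columns and values are illustrative only.
jobs = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B", "B"],
        "cpu_hours": [1.0, 3.0, 2.0, 4.0, 6.0],
    }
)

# Mean and total CPU hours per site. A rough R counterpart for the mean
# would be: aggregate(cpu_hours ~ site, data = jobs, FUN = mean)
summary = jobs.groupby("site")["cpu_hours"].agg(["mean", "sum"])
print(summary)
```

The same lines can be typed interactively in an IPython session, which gives a workflow similar to the R console.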