PPD/ Round Table-WH11SE - Wilson Hall 11th fl South East
CERN room:
Instructions to create a light-weight CERN account to join the meeting via Vidyo:
If that is not possible, people can join the meeting by phone; call-in numbers are here:
The meeting ID is hidden below the Videoconference Rooms link, but here it is again:
## 21st Big Data Meeting
* SuperComputing
* Poster session (2 hours)
* 10 people stopped by
* One from Cray, eBay, particle physicists, a few students
* people were interested in seeing the performance difference between reading from Lustre and from HDFS
* comment from Alexey: we have Lustre at Princeton, available for comparisons
* eBay was interested in configuration, hitting some scaling issues
* questions/comments
* Why did we convert from ROOT to HDF5?
* H5Spark from NERSC did not work for us
* spoke with a person from LBNL who is working on a future version of HDF5 and is interested in implementing our queries; Saba could just give him our headers, not the data itself
* LBNL has its own system that does what Spark does
* interested in working with us
* careful: we work on CMS data, LBNL is ATLAS
* company: Two Sigma
* Spark extension for time series data
* Saba was supposed to look into it
* Wes McKinney works at Two Sigma: Arrow, pandas
* Alexey is participating in Spark Summit East, Boston
* talking about Histogrammer and Princeton efforts in research track
* February 7-9, deadline passed
* Saba: we should consider presenting at the next Spark Summit
* San Francisco
* call will open in January
* Future data reading:
* reading ROOT files from Java
* right now, you can read simple types, fixed-size arrays of arbitrary dimension, arrays where one dimension is variable-length and the others are fixed, and structs specified by a list of leaves
* BaconProd files are not mapped correctly yet
* working on this
* root4j is in Maven Central
* spark-root is in git and will move to Maven Central with the next release, which includes STL types
* new data
* 2016 data will be used
* re-reco is ready to be used
* MC is being produced
* by the end of the year, all should be available
* next meeting in January
* focus on CHEP proceedings
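
The branch layouts listed under "reading ROOT files from Java" can be illustrated with a small sketch. The Python below is not root4j or spark-root code: the `Event` record, its field names, the `ROW_WIDTH` constant, and the `flatten_jagged` / `unflatten_jagged` helpers are all hypothetical. The offsets-plus-flat-content layout is only one common way to store an array whose first dimension varies per event while the inner dimension stays fixed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
    """Hypothetical struct with one leaf per kind of branch named in the minutes."""
    run: int                 # simple type
    vertex: List[float]      # fixed-size array: always 3 values
    hits: List[List[float]]  # variable number of rows, each row has ROW_WIDTH values


ROW_WIDTH = 2  # fixed inner dimension of `hits` (an assumption for this sketch)


def flatten_jagged(events: List[Event]) -> Tuple[List[int], List[float]]:
    """Store all `hits` rows contiguously, plus per-event row offsets."""
    offsets = [0]
    content: List[float] = []
    for ev in events:
        for row in ev.hits:
            if len(row) != ROW_WIDTH:
                raise ValueError("inner dimension must be fixed")
            content.extend(row)
        # offsets[i]..offsets[i+1] is the row range belonging to event i
        offsets.append(offsets[-1] + len(ev.hits))
    return offsets, content


def unflatten_jagged(offsets: List[int], content: List[float]) -> List[List[List[float]]]:
    """Rebuild the per-event list of rows from offsets and flat content."""
    result = []
    for start, stop in zip(offsets[:-1], offsets[1:]):
        rows = [content[i * ROW_WIDTH:(i + 1) * ROW_WIDTH] for i in range(start, stop)]
        result.append(rows)
    return result


events = [
    Event(run=1, vertex=[0.0, 0.0, 0.1], hits=[[1.0, 2.0], [3.0, 4.0]]),
    Event(run=1, vertex=[0.0, 0.1, 0.0], hits=[]),
    Event(run=2, vertex=[0.1, 0.0, 0.0], hits=[[5.0, 6.0]]),
]
offsets, content = flatten_jagged(events)
```

In this sketch an event with no hits simply repeats the previous offset, so empty entries need no special marker.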