- attendance:
- Matteo Cremonesi (FNAL), Cristina Mantilla (FNAL/Johns Hopkins), Saba Sehrish (FNAL), Jim Kowalkowski (FNAL), Jim Pivarski (Princeton), Alexey Svyatkovskiy (Princeton), Bo Jayatilaka (FNAL), Maria Girone (CERN/openlab), Ian Fisk (Simons Foundation), Volker Tresp (Siemens), Tobias Enrich (LMU), Jin Chang (FNAL), Ruth Pordes (FNAL)
Many thanks to JimK for writing notes!
- Overall goal and time schedule
- Realize CMS analysis use case in industry big data technology
- Document the comparison with traditional analysis using the HEP-specific ROOT framework in a write-up by Fall 2016
- Start with hadoop+spark; once that is complete, possibly branch out to other technologies
- use case discussion
- Physics: Matteo, Cristina
- currently publishing results of analysis use case with 2015 data
- plan is to further develop the analysis and integrate 2016 data, publish update in Fall 2016
- technical team: Matteo, Cristina, JimP, Alexey, Saba
- start with BACONprod ntuples and not MINIAOD
- Matteo and Cristina will publish to slack
- code via github
- files to download (one small with a few events, one ~GB size)
- JimP will publish interface code to slack via github (already done)
- testing platforms
- Alexey has a 10-node testing cluster at Princeton and will give access
- Ian is planning to have a test setup at Simons in New York and will also provide access
- Matteo and Cristina to figure out, together with Alexey and Ian, how to transfer larger quantities of BACON ntuple files to Princeton and New York
- Milestones and meeting time schedule
- Meeting every two weeks in this time slot, Wednesdays at 10 AM CST / 5 PM CET
- first milestone in 4 weeks: load BACON ntuples in hadoop+spark and produce a plot
- everyone: think about further milestones and parts of the project that need to be accomplished by Fall 2016
- technical discussion
- discussion about content of BACON ntuples (flat or flat/flat) ➜ answer is flat (simple structure of classes)
- discussion of loading data from ROOT files or from pre-converted data in HDFS
- discussion about analysis in python or scala ➜ will start with python for the interactive part, scala will be used for slimming/skimming
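
As a rough illustration of the first milestone (flat ntuple content in, plot out), here is a minimal plain-python sketch of the fill-and-merge pattern that a distributed histogram relies on. It does not use Spark itself; the two functions play the roles of the `seqOp` and `combOp` arguments of Spark's `RDD.aggregate`. The branch name `pt`, the event values, the binning, and the split into two partitions are all assumptions made for the example, not the actual BACON ntuple content.

```python
from functools import reduce

# Hypothetical flat events: one dict per event, one scalar per "branch".
events = [{"pt": 12.5}, {"pt": 47.0}, {"pt": 33.1},
          {"pt": 8.2}, {"pt": 51.9}, {"pt": 29.4}]

# Assumed binning: 5 equal-width bins covering [0, 100).
N_BINS, LO, HI = 5, 0.0, 100.0

def empty_hist():
    """Return a histogram with all bin counts set to zero."""
    return [0] * N_BINS

def fill(hist, event):
    """Add one event to a partial histogram (role of seqOp)."""
    x = event["pt"]
    if LO <= x < HI:  # values outside the range are dropped
        hist[int((x - LO) / (HI - LO) * N_BINS)] += 1
    return hist

def merge(h1, h2):
    """Combine two partial histograms bin by bin (role of combOp)."""
    return [a + b for a, b in zip(h1, h2)]

def histogram(partitions):
    """Fill one histogram per partition, then merge the partial results."""
    partials = [reduce(fill, part, empty_hist()) for part in partitions]
    return reduce(merge, partials, empty_hist())

# Pretend the events are stored in two partitions (e.g. two HDFS blocks).
partitions = [events[:3], events[3:]]
hist = histogram(partitions)
print(hist)  # [2, 2, 2, 0, 0]
```

Because merging is a bin-by-bin sum, the result does not depend on how the events are split across partitions, which is what makes this pattern suitable for a cluster. The resulting bin counts could then be handed to any plotting tool.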