Minutes
Welcome
Discussion
- Zach about Open Infrastructure meetings: HEPData, SWAN, Binder, Colab etc. (cloud infrastructures that support Open Data), EOSC infrastructure. How broad should this be?
- Could be split: inward-looking vs. outward-looking, or documents (INSPIRE, CDS, Indico) vs. data (HEPData etc.)
- Giacomo endorses the idea to have a joint event with OSPO
Community Engagement
- A new Open Science website is under development and will be published soon: beta testers needed
- Community Ambassador network being launched: Get in touch if you want to share your excitement for Open Science
- In-person OSPF should become a regular event: Feedback needed as to who would like to join
- Indico survey for feedback
Discussion
- ATLAS can share a lot of stories, but it should be as clear as possible what the ask is
- For training it needs to be known who the target audience is
ATLAS Open Data
- Full chain of tools for Open Data from no-code solutions to OD for research
- Collecting projects on how OD is used
- Regional usage patterns monitored via monit-grafana: https://monit-grafana.cern.ch/d/da06d76c-24f0-4d23-b51e-da08d36c4ece/welcome?orgId=93
- In 2026, usage from the US went down and usage from Brazil went up; Africa also saw some uptake
- OD Tutorial (https://indico.cern.ch/event/1564767/): tried to cover all audiences; it won't be repeated, with online tutorials for specific audiences planned in 2026 instead
- Big takeaway: We need to understand the audience
- Upcoming white paper on event generation data
- Proof of concept: agentic workflow to easily generate output from the Open Data using natural-language requests
Discussion
- How do you collect the materials people use?
- Reaching out to individuals, word of mouth, googling
- Can be tricky to track, as often data is only downloaded once and then used regularly in education
- What about the Masterclasses?
- Masterclasses are very well known and widely used, so they are hard to replace.
- Needs more discussions with IPPOG, maybe will be partially replaced
- Why use atlasopenmagic and not something from the Open Data Portal?
- Long term vision would be to have something universal for everyone
CMS Open Data
- It was clear the data needs to be preserved, but who would use open data?
- First use case by theorists: Had significant challenges with the file format
- Eventually had the first OD workshop for theorists in 2020
- So far in total 6 workshops: https://cms-opendata-guide.web.cern.ch/cmsOpenData/workshops/
- Ratio of participants to registrants is relatively low
- Workshops so far have received good feedback (>90% recommendation rate)
- Data format has changed from AOD in the beginning to NANOAOD, which does not require CMS-specific software
- Next workshop: 28-30 July 2026 at University of Notre Dame, USA, focus on educational use-cases targeted to teachers
Discussion
- In ATLAS also issues with no-shows, maybe worth considering a small participation fee so people are committed
- What about the drop-out rate over the week as the material gets harder?
- Decline was not super high
- Had large registration numbers in the beginning, but then many didn't show up. However, actual participation seemed to be quite constant
- Some wanted to get a certificate for university. For this, completing the exercises should be required
- Which datasets do you use in the CMS workshops? And do participants actually understand an actual analysis?
- Hopefully the students should understand the whole analysis, at least that's what the material is presenting
- Support is there via the OD Forum
LHCb
- Ntupling Service has been available since February; users no longer need to know the LHCb tupling software
- Ntuples are created on Grid and can be downloaded by user
- Access to 4 PB of Run 1 & 2 data
- Needs some guidance as first-time user: https://lhcb-opendata-guide.web.cern.ch/ntupling-service/
- Currently working on implementing more examples, to be published by May
- Only 2 people actively working on Open Data in LHCb
- Several presentations at conferences etc.
- Answering questions from users
- So far 14 requests, various use cases (CP asymmetry searches, fitting algorithms, educational/learning ROOT)
- So far 4 TB of OD has been produced
- Will organize future LHCb OD events
Discussion
- What's the background of the people placing the requests?
- Mostly either theorists or high school students
- Are there restrictions for LHCb members to use the OD? It might be interesting to create educational datasets
- LHCb members also use OD for educational datasets; they are not allowed to use OD for physics publications
- How is this advertised?
- On the outreach website and conferences; not via Social Media
- Could be worth putting in physics talks
ALICE
- Data format was changed to AO2D after Run 2, so the Run 1 and Run 2 data have to be converted into this format to be shared as OD
- Currently, old-format data (7.6 TB) is still in the OD Portal, but probably cannot be used anymore
- 2015 Pb-Pb data (62 TB) has been released as educational data in March
- In the coming years much more data will be released to hit the release targets; the converted Run 1 data is expected to be uploaded by the end of the year
- O2OpenAccess is the software repository for ALICE Software: https://github.com/AliceO2Group/O2OpenAccess
- The same software is used internally and for open data