Skip to main content

EDM4hep Discussion

Europe/Zurich
Zoom

Zoom

EDM4hep Live Notes
==================

Date: Apr 29, 2025
Indico: https://indico.cern.ch/event/1542864/

Connected: Mateusz, Brieuc, Sanghyun, Juraj, Tao, Leonhard, Thomas, Juan, Jacopo, Pere, Andre, Aurora, Frank

Apologies:


### Upcoming workshops / conferences
https://github.com/orgs/key4hep/projects/4/views/1
    
## Progress and discussion

### dynamic loading / path issue in k4FWCore / podio / ROOT / spack?
* https://github.com/key4hep/key4hep-spack/issues/736
* [x] Check differences in build output for podio before and after break
    * No real differences observed between broken & non-broken builds
* https://github.com/spack/spack/pull/42844 seems to fix things
    * Sets `PODIO_SET_RPATH=ON` option for cmake

## Podio
* https://github.com/AIDASoft/podio/pulls
* https://github.com/AIDASoft/podio/issues
* https://github.com/orgs/AIDASoft/projects/2/views/1
 
### Merged PRs
* Use ccache to speed up CI workflows [#764](https://github.com/AIDASoft/podio/pull/764) [#769](https://github.com/AIDASoft/podio/pull/769)
* Kepp RNTuple API compatibility down to ROOT v6.34 [#757](https://github.com/AIDASoft/podio/pull/757)
* Re-enable RNTuple support for ROOT 6.32 [#768](https://github.com/AIDASoft/podio/pull/768)
* Remove double locking in `prepareForWrite` [#771](https://github.com/AIDASoft/podio/pull/771)
* Pass by const reference in the object constructor and member setters [#767](https://github.com/AIDASoft/podio/pull/767)
* Remove unnecessary checks of cpp standard [#774](https://github.com/AIDASoft/podio/pull/774)


### Store the collection information in a struct instead of a tuple
* https://github.com/AIDASoft/podio/pull/711
* Makes `TTree` version store information as `struct` instead of `std::tuple`
* RNTuple version already stores info without `std::tuple`, **but in different format**
* [x] Harmonize formats
* Now stored as a `vector` of structs. Previosly was AoS (effectively) for RNTuple
* Now also store the data format that hits the disk (i.e. `vector<XYZData>`)
* [ ] Let Pere double check if this is as desired

### Restore compatibility with gcc11
* https://github.com/AIDASoft/podio/pull/773 
* https://github.com/AIDASoft/podio/issues/705
* Essentially done
    * [x] CI workflow that builds on top of gcc 11
* Merging later today

### Add `RVec` to root dictionaries
* https://github.com/AIDASoft/podio/pull/750
* Triggered by https://github.com/key4hep/EDM4hep/issues/416
* `RVec` dictionaries are currently (partially) defined via FCCAnalyses
* [ ] Where should the go (formally)?
    * FCCAnalyses is effectively "polluting" the global environment. If we provide this, it should be done centrally
    * In principle our files are not part of the public API (no schema evolution, no link resolution)
    * No easy way to figure out where dictionary definitions come from via ROOT.
        * E.g. bisecting `LD_LIBRARY_PATH`
* Features are used nevertheless. Users don't care about all the features always
    * Other use cases: ML related studies partially use uproot
    * Reality: Users do what they want, regardless of what we officially support
    * Better to support centrally than to have it in random places
    * RDataFrame is the more or less official way for processing files in ROOT
* We could improve usability with plain ROOT
    * E.g. more schema evolution support (that we partially need to do in any case)
* [x] Discuss again in slightly larger round
* Main responsible repository (e.g. receiving issues)
    * Could be FCCAnalyses (as main customer)
* Push problem to ROOT?
    * Default storage in snapshot is `RVec` but can be switched to `std::vector` from ROOT 6.34 onwards (need to set corresponding option)
    * Might be possible to read these files back via podio, but should not spend a lot of effort to enable this to not give false incentive to mix Gaudi and FCCAnalyses / RDataFrame filtering / analysis
    * [ ] Issue in ROOT to not change to `RVec` if input is `std::vector`
* **TL;DR: We keep things as they are at the moment and document future usage, i.e.**
    * **No central dictionaries for `RVec` in EDM4hep** (but they can stay in FCCAnalyses)
    * **Document how to switch to storing `std::vector`s and mention caveats that go along with it**
    * **Mention clearly that there is no support for coming back from an RDataFrame based analysis**

### Formalize common interface collections with a concept
* https://github.com/AIDASoft/podio/pull/758
* Better compile time checks for collection interace parts that are commonly generated but go beyond `CollectionBase`
* Make it public?
    * Currently keep it as a test to make sure that all collections provide the same interface. We have had some issues with inconsistencies in the past 
* [ ] Clarifying (documentation) comment missing
    
### Reading every library in `LD_LIBRARY_PATH` when using Python
* github.com/AIDASoft/podio/issue/770
* Opens every library it finds on `LD_LIBRARY_PATH`
* Could be mitigated but is a ROOT issue
* ROOT issue: https://github.com/root-project/root/issues/18489
* CVMFS caching doesn't apply here because you have to open all libraries not just locate them


## EDM4hep
* https://github.com/key4hep/EDM4hep/pulls
* https://github.com/key4hep/EDM4hep/issues
* https://github.com/orgs/key4hep/projects/5 


### Merged PRs
* Fix building with c++23 [#427](https://github.com/key4hep/EDM4hep/pull/427)
* Remove intermediate variables and make loading consistent in `__init__.py` [#426](https://github.com/key4hep/EDM4hep/pull/426)


### Clarification of trackerHitPlane members
* https://github.com/key4hep/EDM4hep/issues/421
* TrackerHitPlane in LCIO originally meant for tracking without geometry(?)
* Directions that span the measurement plan are a property of the sensor, not the hit?
* Is TrackerHitPlane incomplete or overlapping with information from sensor to do proper tracking
* [x] Ping Frank and clarify documentation
* Was `TrackerHitPlane` ever used in tracking without accessing the geometry?
* `u` and `v` are mainly required for interpreting errors
* Problem from genfit: Requires (true) local position on the surface. Need origin of surface for that
* `TrackerHitPlane` stores global Position and information that spans the plane (via `u` and `v`), but no local position on the sensor surface
    * Need geometry to go from global to local
    * Or a origin of the sensor in global coordinats for self-consistency (and convertability without geometry)
* DD4hep geometry also gives access to `u` and `v`
    * `TrackerHitPlane` pre-dates this well defined geometry system
* "Proper" tracking will always need geometry (material, etc.)
* [ ] PR with documentation clarifications
* Need to revisit tracking classes and hits once we are closer to ACTS tracking

### Pointer persistency in EDM4hep
* https://github.com/key4hep/EDM4hep/issues/418
* Interplay, e.g. with Pandora that expects a pointer
    * Pandora is not touching this, just using this to transport information along
    * E.g. could be `std::any` and Handle is kept alive because `std::any` is constructed by value (via new member)
* Need to have a way for EDM4hep
    * Current workarounds would require filling a temporary vector to keep pointers to handles alive
    * No really user friendly way at the moment 
* Do we need a nice way? Should only appear in few central places
    * [ ] Document workaround with vector
    * [ ] Point uses to issue and make them indicate they use this
    * [ ] Decide on number of actual users whether it's worth the effort to do a "proper solution"

### Limit the RNTuple backwards compatibility to podio 1.2 
* https://github.com/key4hep/EDM4hep/pull/413
* Necessary after https://github.com/AIDASoft/podio/pull/711

### [WiP] Replace the spin by a helicity property for the MCParticle
* https://github.com/key4hep/EDM4hep/pull/404
* Debugging special schema evolution that ideally stores `spin.z` into `helicty`
    * [x] Do we want / need this? -> Yes, at least would be worthwile to see if schema evolution can deal with this
    * Quite a few existing LCIO files with `spin`, but could be absorved in the LCIO to EDM4hep converter
    * [ ] Are there existing EDM4hep files with this?
* [ ] Need to check where this breaks downstream
* `9` as default value for "unset", why?
    * Historically what some generators have been using

### add python bindings for PIDHandler
* https://github.com/key4hep/EDM4hep/pull/397
* Some details still to be clarified(?)
* Lifetimes with `std::optional` (and others) are not behaving intuitively on the python side (with cppyy)
    * Could add throwing overloads for wrappers because there is better support for that. But naming could become confusing because we have `get_optional` in other cases and would need a different name here for the throwing version.
* Many use cases dictated by podio usage of `optional` (e.g. parametes). 
* [ ] Potentially need to discuss / decide on how our interfaces should look like in general

### Use pre-built hooks in pre-commit
* https://github.com/key4hep/EDM4hep/pull/402
* Streamlines setup
    * "Looks like other repositories"
    * clang-tidy not easily packaged (no issue for EDM4hep, but other packages use it)
* [x] Can pre-commit use a different config file than `.pre-commit-config.yaml`
    * Yes, can use the `--config` flag
    
## Converter & MarlinWrapper

### Add utilities to facilitate the handling of the different I/O formats and related conversions
* https://github.com/key4hep/k4MarlinWrapper/pull/234
* See application in https://github.com/iLCSoft/ILDConfig/pull/161
* Based on `IOSvc`
* Mid-term future this should replace the helper functionality in `inputReader` (`create_reader`, `attach_edm4hep2lcio_conversion`) as that is still based on `PodioInput`, etc.

## AoB

## Next meeting:
* May 13, 09:00 CET (IMCC meeting @DESY)

There are minutes attached to this event. Show them.
    • 09:00 09:05
      Introduction 5m
      Speakers: Andre Sailer (CERN), Frank-Dieter Gaede (Deutsches Elektronen-Synchrotron (DE)), Thomas Madlener (Deutsches Elektronen-Synchrotron (DESY))
    • 09:50 09:51
      Discussion 1m
      Speaker: Dr All