EDM4hep Live Notes
==================

Date: April 20, 2021
Indico: https://indico.cern.ch/event/1030566/
This is a document for taking notes during EDM4hep meetings.

Connected: Benedikt, Teng, Tao, Andre, Thomas, Frank, Wenxing, Sang Hyun,
Birgit, Graeme (until 9:30), Placido, Clement, Jiaheng, Gerri

Apologies: Valentin

## Introduction and General Points

* podio vCHEP submission accepted for plenary presentation

## Progress and Discussion

### Schema Evolution discussion

- Version for object descriptions, etc.
- issue: https://github.com/AIDASoft/podio/issues/86
- See slides

* Question 1: which use cases?
    * Dropping Members?
        * LCIO: was not allowed, only adding

    * Reading old files with new code, or also new files with old code?
        * Only compatibility with old files, not the other way

    * LCIO Experience: Sometimes changed numerical precision (e.g., int to long).

    * ROOT I/O can do some "schema evolution"
        * Hook function to do anything one wants
            * Details?
    * Only support changes up to a change in the meaning of parameters (e.g., Cartesian to polar coordinates; point Q1.6)
    
    * No merging or splitting of datatypes

    * Changing from one-to-one to one-to-many relations?
        * Involves multiple datatypes (logically)

* Question 2: Where to put schema evolution
    * Evolve POD buffer?
        * TM: Not all IO libraries can do this
        * BH: can put logic in the POD buffer
    * FG: Example: a datatype with energy gets a new member energyError. Then add code so that getEnergyError returns 0 for an old file and the stored member if it is present
        * BH: create fixed POD that has energyError=0
        * How to know if member is there?
            * Check against version of edm in the file
            * Would have a "pointer" against the file in the data object(?)
            * Better to hide this in the de-serializer
    * TM: Updating the POD also lets the writer be agnostic of different versions
    * All agreed on the POD buffer evolution
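
The energyError example from Question 2 can be sketched as follows. This is only a minimal illustration of evolving the POD buffer in the de-serializer, not podio code: the names `CaloHitDataV1`, `CaloHitDataV2`, `CaloHit`, `evolve` and `evolveBuffer` are made up for this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical layout written by the old schema (version 1).
struct CaloHitDataV1 {
  float energy;
};

// Hypothetical current layout (version 2) with the added member.
struct CaloHitDataV2 {
  float energy;
  float energyError;
};

// Evolve one old POD to the current layout. Members that did not exist
// in the old schema get a fixed default, here energyError = 0.
inline CaloHitDataV2 evolve(const CaloHitDataV1& old) {
  return CaloHitDataV2{old.energy, 0.0f};
}

// Evolve a whole buffer, as a de-serializer could do right after reading.
inline std::vector<CaloHitDataV2> evolveBuffer(const std::vector<CaloHitDataV1>& oldBuffer) {
  std::vector<CaloHitDataV2> evolved;
  evolved.reserve(oldBuffer.size());
  for (const auto& pod : oldBuffer) {
    evolved.push_back(evolve(pod));
  }
  return evolved;
}

// User-facing class: it only ever sees the current POD, so the getter
// needs no knowledge of the schema version of the file.
class CaloHit {
public:
  explicit CaloHit(const CaloHitDataV2& data) : m_data(data) {}
  float getEnergy() const { return m_data.energy; }
  float getEnergyError() const { return m_data.energyError; }

private:
  CaloHitDataV2 m_data;
};

// Usage: an "old file" buffer is evolved once, then read like new data.
inline void podEvolutionExample() {
  const std::vector<CaloHitDataV1> oldBuffer = {CaloHitDataV1{1.5f}, CaloHitDataV1{2.0f}};
  const std::vector<CaloHitDataV2> evolved = evolveBuffer(oldBuffer);
  const CaloHit hit(evolved[0]);
  assert(hit.getEnergy() == 1.5f);
  assert(hit.getEnergyError() == 0.0f);
}
```

Because the conversion happens before any user-facing object is built, neither the getters nor a writer need a "pointer" back to the file version.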

* Question 3: Granularity of version tracking
    * Global versioning
    * Bookkeeping of versions: need to keep old YAML files around, how?
        a) edm4hep.yaml (newest) <- edm4hep_v1.yaml
        b) edm4hep_v1.yaml, edm4hep_v2.yaml (newest), ...

    * Need to generate the transformation code.
        * Generate transformation to current version instead of chaining transformations
        * Need to allow user code to provide the transformation
        * IO library function that creates new version objects from old version objects

    * Version the EDM independent of the library
        * adding methods doesn't change the datamodel; adding members (etc.) does
        * Use something human "sortable" for identifying versions
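
A rough sketch of the Question 3 points: a version that sorts the way a human would expect, transformations that go straight to the current layout instead of being chained, and a hook for user-provided transformations. All names here (`SchemaVersion`, `ExampleV1` to `ExampleV3`, `ExampleEvolution`) are assumptions for illustration, not an existing podio interface.

```cpp
#include <cassert>
#include <functional>
#include <tuple>

// Hypothetical EDM schema version, independent of the library version.
struct SchemaVersion {
  int majorNum;
  int minorNum;
  int patchNum;
};

// Numeric ordering, so that 1.2.0 sorts before 1.10.0.
inline bool operator<(const SchemaVersion& a, const SchemaVersion& b) {
  return std::tie(a.majorNum, a.minorNum, a.patchNum) <
         std::tie(b.majorNum, b.minorNum, b.patchNum);
}

// Hypothetical layouts of one datatype in three schema versions.
struct ExampleV1 { float energy; };
struct ExampleV2 { float energy; float energyError; };
struct ExampleV3 { double energy; double energyError; float time; };  // current

// Stand-ins for generated transformations: each old version is converted
// directly to the current layout (V1 -> V3), not via V2.
inline ExampleV3 generatedFromV1(const ExampleV1& old) {
  return ExampleV3{old.energy, 0.0, 0.0f};
}
inline ExampleV3 generatedFromV2(const ExampleV2& old) {
  return ExampleV3{old.energy, old.energyError, 0.0f};
}

// Transformation table; user code may replace the generated defaults.
struct ExampleEvolution {
  std::function<ExampleV3(const ExampleV1&)> fromV1 = generatedFromV1;
  std::function<ExampleV3(const ExampleV2&)> fromV2 = generatedFromV2;
};

inline void versionExample() {
  const SchemaVersion fileVersion{1, 2, 0};
  const SchemaVersion currentVersion{1, 10, 0};
  assert(fileVersion < currentVersion);  // the file is older, so evolve it

  ExampleEvolution evolution;
  // User-provided override: mark the time of V1 data as unknown (-1).
  evolution.fromV1 = [](const ExampleV1& old) {
    return ExampleV3{old.energy, 0.0, -1.0f};
  };
  const ExampleV3 evolved = evolution.fromV1(ExampleV1{2.0f});
  assert(evolved.energy == 2.0);
  assert(evolved.time == -1.0f);
}
```

Comparing the version as a tuple of integers avoids the pitfall of string comparison, where "1.10.0" would sort before "1.2.0".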

 

## Podio

### Benchmarking

* Discuss how to integrate with validation in Key4hep

### Issues/PRs

* Fixed width integer types
    * https://github.com/key4hep/EDM4hep/issues/112
    * Needs a bit of adaptation for code generation

* ROOT performance fixes
    * https://github.com/AIDASoft/podio/pull/180
    * Ready for Review

* Number of fixes [WIP]:
    * https://github.com/AIDASoft/podio/pull/177
    * Not working with ROOT 6.20/04 and older
    * New (stable) PyROOT in ROOT 6.22 (to be confirmed)
    * Drop support for older ROOT?
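
For the fixed-width integer item above, a small sketch of what such members look like in a POD and of the kind of type-name mapping a code generator might need. The struct and the helper `cppTypeFor` are hypothetical; the actual adaptation in the generator is not described in these notes.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <set>
#include <string>

// Illustrative POD with fixed-width members instead of int / long.
struct TrackerHitData {
  std::uint64_t cellID;  // exactly 64 bits on every platform
  std::int32_t quality;  // exactly 32 bits on every platform
  float eDep;
};

// The widths are guaranteed by the types and can be checked at compile time.
static_assert(sizeof(TrackerHitData::cellID) * CHAR_BIT == 64, "cellID must be 64 bits wide");
static_assert(sizeof(TrackerHitData::quality) * CHAR_BIT == 32, "quality must be 32 bits wide");

// Hypothetical generator helper: map a type name from the YAML file to the
// C++ type to emit. Fixed-width names get the std:: prefix (and the
// generated header would then need <cstdint>); other names pass through.
inline std::string cppTypeFor(const std::string& yamlType) {
  static const std::set<std::string> fixedWidth = {
      "int8_t",  "int16_t",  "int32_t",  "int64_t",
      "uint8_t", "uint16_t", "uint32_t", "uint64_t"};
  return fixedWidth.count(yamlType) != 0 ? "std::" + yamlType : yamlType;
}

inline void fixedWidthExample() {
  assert(cppTypeFor("uint64_t") == "std::uint64_t");
  assert(cppTypeFor("float") == "float");
}
```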

#### Make a tag after all the warnings are fixed

#### Heap-use-after-free

* https://github.com/AIDASoft/podio/issues/174
* Not a problem in frameworks, but if collections used outside of them
* Deep inside the memory management of podio, so not easy to fix
* Happens more often with clang than with gcc, but this could be due to compiler options.
    * Flagged by address-sanitizer
* May require deep changes, such as changing the reference counting of the object classes.
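
To illustrate the kind of lifetime problem discussed here, the sketch below lets an object handle outlive the collection that created it. It uses `std::shared_ptr` as a generic form of reference counting so that the handle stays valid. This is an assumption-laden illustration and does not reflect how podio manages memory; `HitData`, `Hit` and `HitCollection` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct HitData {
  float energy;
};

// Hypothetical handle: shares ownership of the underlying data.
class Hit {
public:
  explicit Hit(std::shared_ptr<HitData> data) : m_data(std::move(data)) {}
  float getEnergy() const { return m_data->energy; }
  long useCount() const { return m_data.use_count(); }

private:
  std::shared_ptr<HitData> m_data;
};

// Hypothetical collection: holds one reference per entry.
class HitCollection {
public:
  Hit create(float energy) {
    m_entries.push_back(std::make_shared<HitData>(HitData{energy}));
    return Hit(m_entries.back());
  }
  std::size_t size() const { return m_entries.size(); }

private:
  std::vector<std::shared_ptr<HitData>> m_entries;
};

// The collection is destroyed when this function returns, but the
// returned handle still holds a reference to its data.
inline Hit escapeCollection() {
  HitCollection hits;
  return hits.create(42.0f);
}

inline void referenceCountingExample() {
  const Hit hit = escapeCollection();
  // With a raw pointer into the collection this read would be a
  // heap-use-after-free; with shared ownership the data is still alive.
  assert(hit.getEnergy() == 42.0f);
  assert(hit.useCount() == 1);  // the handle is now the only owner
}
```

The price of this approach is one allocation and one reference count per object, which is one reason such a change would reach deep into the memory management.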

#### C++ concepts
* BH: add compile time checks for class behaviours: e.g., movable
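
A minimal sketch of such a compile-time check for "movable". To stay compilable before C++20 it uses `<type_traits>` with `static_assert` as a stand-in for concepts. With C++20, the standard `std::movable` concept from `<concepts>` expresses a similar requirement. `ExampleCollection` and `isMovable` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: a type is "movable" here if it can be both
// move-constructed and move-assigned.
template <typename T>
struct isMovable
    : std::integral_constant<bool, std::is_move_constructible<T>::value &&
                                       std::is_move_assignable<T>::value> {};

// Hypothetical move-only collection class.
class ExampleCollection {
public:
  ExampleCollection() = default;
  ExampleCollection(const ExampleCollection&) = delete;
  ExampleCollection& operator=(const ExampleCollection&) = delete;
  ExampleCollection(ExampleCollection&&) = default;
  ExampleCollection& operator=(ExampleCollection&&) = default;

  void push_back(float value) { m_values.push_back(value); }
  std::size_t size() const { return m_values.size(); }

private:
  std::vector<float> m_values;
};

// The build fails here if someone changes the class behaviour by accident.
static_assert(isMovable<ExampleCollection>::value, "collections must be movable");
static_assert(!std::is_copy_constructible<ExampleCollection>::value,
              "collections must not be copyable");

inline void movableExample() {
  ExampleCollection a;
  a.push_back(1.0f);
  ExampleCollection b = std::move(a);  // fine: move construction
  // ExampleCollection c = b;          // would not compile: copy is deleted
  assert(b.size() == 1);
}
```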

#### Issue with ROOT and (vectors of) non-copyable collections
* happens in ROOT 6.22
* PM: there is a patch available in LCG repository
    * ROOT team is working on a general solution

#### What are the different branches in the ROOT file?
* https://github.com/AIDASoft/podio/issues/169
* Encode more information in the _relation_ branch names?
* Related to use in RDataFrame/RNTuple, directly looking at ROOT file content
* Are branch names an implementation detail?
* backward compatibility "Impossible" (?)

#### Schema Evolution
- https://github.com/AIDASoft/podio/issues/86
- Discussion: https://indico.cern.ch/event/1030566/

#### Multi-Threading

#### "event class" in podio

* Currently being conceived

### PRs
* https://github.com/AIDASoft/podio/pulls

### Meta Data

#### Usage of "metadata" for user defined data
* need to check if current implementation addresses all use cases
* need test use-cases

### EventStore

### Features

* Subset collections?

## LCIOConverters
* https://github.com/key4hep/k4LCIOReader

## EDM4hep
https://github.com/key4hep/EDM4hep/pulls

### EDM4hep tools
https://github.com/key4hep/EDM4hep-utils

### Issues

## AOB

### Next meeting:

* May 4, 2021

### TODO

### New tags for k4SimDelphes
* Updated podio
* Updated Delphes version
    * Update spack fork and main spack