Ddev
Live meeting notes: https://codimd.web.cern.ch/QMVdoWxiShS9Iv9Mp9ifKw?view
# DIRAC Development Meeting (Ddev)
**At CERN**: Juraj, Yan, Alexandre, Christophe, Benedikt, Ryan, Andrea, Alan
**On Zoom**: Loris, Natthan, Simon, Yan, Andrei, Stella, Mazen
**Apologies**: Federico, Chris
## Product Goals & Roadmaps
- Transition to DiracX:
```mermaid
flowchart LR
subgraph CWL["CWL"]
CWL1("CWL submission endpoint"):::inprogress
CWL2("CWL production system")
CWL3("Transformation system machinery"):::blocked
CWL4("Use CWL natively in new matcher"):::blocked
end
subgraph Core["Core"]
CoreTasks("Tasks"):::inprogress
Core2("RSS"):::inprogress
Core3("DMS")
end
subgraph WMS["WMS"]
WMS1("Matcher"):::inprogress
WMS2("Pilot authentication"):::inprogress
WMS3("Pilot submission"):::blocked
end
CWL3 --> CWL4
CoreTasks --> Core2 --> Core3
CoreTasks --> WMS1
CoreTasks --> CWL3
WMS1 --> CWL4
CoreTasks --> WMS3
click CoreTasks "https://www.github.com" "This is a tooltip for a link"
classDef done fill:#B2DFDB,stroke:#00897B,color:black,stroke-width:2px;
classDef inprogress fill:#FFF9C4,stroke:#F9A825,color:black,stroke-width:2px;
classDef blocked fill:#BBBBBB,stroke:#222222,color:black,stroke-width:2px;
subgraph Legend
L2("Completed"):::done
L4("In progress"):::inprogress
L1("Ready for work")
L3("Blocked"):::blocked
end
```
- CWL integration:
```mermaid
flowchart LR
subgraph dirac_cwl["dirac-cwl"]
job1("Prototype Job Endpoint"):::done
transformation("Prototype Transformation Endpoint"):::inprogress
workflows("Workflows"):::inprogress
prod("Prototype Production Endpoint"):::inprogress
end
subgraph DiracX1["DiracX"]
prod_diracx("Implement the CWL Production System")
trans_diracx("Implement the CWL Transformation endpoint")
trans_diracx_original("Implement the Transformation System"):::blocked
diracx_tasks("Implement DiracX Tasks"):::blocked
job_diracx("Implement the CWL Job Endpoint"):::inprogress
end
diracx_tasks --> trans_diracx_original
trans_diracx_original --> trans_diracx
transformation --> trans_diracx
job1 --> workflows
prod --> prod_diracx
prod_diracx -.-> deliver2(["Can submit productions to DiracX /productions"]):::milestone
trans_diracx -.-> deliver3(["Can submit transformations to DiracX /transformations"]):::milestone
job_diracx -.-> deliver5(["Can submit jobs to DiracX /jobs"]):::milestone
classDef done fill:#B2DFDB,stroke:#00897B,color:black,stroke-width:2px;
classDef inprogress fill:#FFF9C4,stroke:#F9A825,color:black,stroke-width:2px;
classDef blocked fill:#BBBBBB,stroke:#222222,color:black,stroke-width:2px;
classDef milestone fill:#FFDFE5,stroke:#FF5978,color:#8E2236,stroke-width:2px;
subgraph Legend
L1("Completed"):::done
L2("In progress"):::inprogress
L3("Ready for work")
L4("Blocked"):::blocked
L5("Milestone"):::milestone
end
```
## Refinements
### Needs triage
https://github.com/orgs/DIRACGrid/projects/30/views/7
**Goal: build a shared understanding of the project.**
> DIRAC
- [PilotSync Agent does not upload files to https servers](https://github.com/DIRACGrid/DIRAC/issues/8640)
> DiracOS2
> WebAppDIRAC
> diracx-web
> diracx
- [Fix warnings in test suite](https://github.com/DIRACGrid/diracx/issues/935)
- [Cache static endpoint responses](https://github.com/DIRACGrid/diracx/issues/835)
- [Integrate MCP Server](https://github.com/DIRACGrid/diracx/issues/827) - still needs to be discussed on DOps first
- [RSS](https://github.com/DIRACGrid/diracx/issues/790)
- [Phase1](https://github.com/DIRACGrid/diracx/issues/836)
- [Phase2](https://github.com/DIRACGrid/diracx/issues/889)
> Pilot
> diracx-charts
> dirac-cwl
> signurlarity
- [Shall we commit pixi.lock file](https://github.com/DIRACGrid/signurlarity/issues/45)
- [Move to httpx2](https://github.com/DIRACGrid/signurlarity/issues/45)
- [name=Christophe] TODO: we should add lhcb workflow transition documentation directly in diracx for the other communities
- TODO: add a word about dropping pre-commit CI
**External deps**
### [Temporary Section] In progress, predating the new organization
https://github.com/orgs/DIRACGrid/projects/30/views/8
Various people still need to deal with old and stale PRs. We will take them into account in the next sprints.
### External dependencies
https://github.com/orgs/DIRACGrid/projects/30/views/9
---
[Planning Poker](https://en.wikipedia.org/wiki/Planning_poker)
Story point values (based on the Fibonacci sequence)
- `1pt`: Trivial, very clear (small bug fix, config change)
- `2pts`: Small, well understood (small feature, clear requirements)
- `3pts`: Medium, some unknowns (moderate feature)
- `5pts`: Large, significant complexity (major feature, integration)
- `8pts`: Very large, many unknowns (should probably be split)
- `13+pts`: TOO BIG - must split!
- `?`: not enough knowledge to answer (remember it's ok to ask any questions)
## Sprints
### Planning (Velocity and Planning Poker)
- Backlog: https://github.com/orgs/DIRACGrid/projects/30/views/3
- Current Sprint: https://github.com/orgs/DIRACGrid/projects/30/views/1

**Average Velocity: ~5.8 x FTEs** *Last update: Jun 25th*
#### :warning: Velocity is a planning tool, not a performance target
- Velocity going down is NOT bad
- Velocity going up is NOT always good (might mean over-estimation)
- Velocity varies sprint-to-sprint
- We track it to improve estimation, not to judge people
**What affects velocity:**
- Estimation accuracy (we're still learning)
- Complexity of work
**Our focus:** Delivering value and hitting commitments, not maximizing velocity numbers.
### July 9th (IN PROGRESS):
#### Target and Context
- **Transition**:
- Clean up existing issues/PRs: [Burning Charts](https://github.com/orgs/DIRACGrid/projects/30/insights?period=3M)
- Integrate ADRs & precise roadmap
- **RSS**: Finish Phase1
- **Match Making**: Finish Phase3
- **Pilot**: Finish PR1
#### Availability
- [name=alexandre] 50%
- [name=natthan] %
- [name=luisa] %
- [name=loris] 40%
- [name=stella] 10%
- [name=jorge] %
- [name=ryun] 30%
- [name=federico] %
- [name=heloise] %
- [name=christophe] 5%
- [name=chris] %
- [name=janusz] 40%
- [name=mazen] 10%
- [name=andrei] %
- [name=yan] 100%
- [name=Simon] 10%
- [name=daniela] %
- [name=Hideki] %
- [name=Benedikt] 30%
- [name=Juraj] 10%
_ FTEs * _ = _ story points
Expected Story Points:
Persons:
Expected Velocity:
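
A minimal sketch of the planning arithmetic above (FTEs x average velocity = expected story points). It only uses the percentages already filled in under Availability and the ~5.8 average velocity quoted earlier; blank entries are skipped rather than guessed, so the result is provisional until everyone has declared their availability. The helper names (`parse_availability`, `total_ftes`) are illustrative, not part of any DIRAC tooling.

```python
import re

# Average velocity quoted in the Planning section (story points per FTE per sprint).
AVERAGE_VELOCITY = 5.8

# Availability lines copied from the notes; blank percentages are kept blank.
AVAILABILITY_NOTES = """\
- [name=alexandre] 50%
- [name=natthan] %
- [name=luisa] %
- [name=loris] 40%
- [name=stella] 10%
- [name=jorge] %
- [name=ryun] 30%
- [name=federico] %
- [name=heloise] %
- [name=christophe] 5%
- [name=chris] %
- [name=janusz] 40%
- [name=mazen] 10%
- [name=andrei] %
- [name=yan] 100%
- [name=Simon] 10%
- [name=daniela] %
- [name=Hideki] %
- [name=Benedikt] 30%
- [name=Juraj] 10%
"""

# Matches "[name=<who>] <number>%" where the number may be missing.
LINE_RE = re.compile(r"\[name=(?P<name>[^\]]+)\]\s*(?P<pct>\d+(?:\.\d+)?)?\s*%")


def parse_availability(notes):
    """Return {name: percentage or None} for each availability line."""
    result = {}
    for line in notes.splitlines():
        match = LINE_RE.search(line)
        if match:
            pct = match.group("pct")
            result[match.group("name")] = float(pct) if pct else None
    return result


def total_ftes(availability):
    """Sum the declared percentages and convert them to FTEs."""
    declared = [pct for pct in availability.values() if pct is not None]
    return sum(declared) / 100


availability = parse_availability(AVAILABILITY_NOTES)
ftes = total_ftes(availability)
expected_points = AVERAGE_VELOCITY * ftes
undeclared = sorted(name for name, pct in availability.items() if pct is None)

print(f"{ftes:.2f} FTEs * {AVERAGE_VELOCITY} = {expected_points:.1f} story points")
print(f"Not yet declared: {', '.join(undeclared)}")
```

Listing the undeclared names next to the total makes it clear that the figure is a lower bound and should be recomputed once the list is complete.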
#### Sprint Planning:
- Backlog: https://github.com/orgs/DIRACGrid/projects/30/views/3
- Sprint: https://github.com/orgs/DIRACGrid/projects/30/views/1
### June 25th (DONE):
Expected Story Points: 59
Persons: 4.2
Expected Velocity: 14
Actual: 18 Story Points / 4.2 people = 4.3 velocity
Comments: as usual, we are still overestimating what we can deliver in a sprint. A lot of unexpected bug fixes prevented us from moving forward with the current items.
#### Sprint Planning:
- Backlog: https://github.com/orgs/DIRACGrid/projects/30/views/3
- Sprint: https://github.com/orgs/DIRACGrid/projects/30/views/1
#### Sprint review: https://github.com/orgs/DIRACGrid/projects/30/views/11
Related to our goals:
- **DIRAC to DiracX transition:**
- Mostly bug fixes
- Hardened GitHub CI config
- Dropped boto: should speed up fresh installation
- **CWL integration:**
- Closed Ryun's PR: was a POC that will need to be adjusted. Discussion with CWL founder in the coming week to get some feedback
- **Match-Making POC:**
- NTR
- **DIRAC maintenance:**
- A lot of work to improve security
#### Sprint retrospective
*The sprint is a boat :boat: ; we are trying to reach an island (target); identify anchors (what slowed you down), wind (what helped), and rocks ahead (risks for next sprint)*
:warning: **Focus on the process, not people. We're here to improve together! 🚀**
**:anchor: Anchors (what slowed you down)**
- *Example: Unclear requirements on X; Waiting for Y delayed Z; ...*
refresh token DB:
- issue not well understood: just realized while making the PR
- we should describe the process to follow when something like this happens, and we should update the first comment (GitHub provides a diff of the different versions of the comment, so editing it is fine)
- Likely a setting (for PRs/issues automatically closed) - to investigate
**:cloud: Wind (what helped)**
- *Example: Good communication in weekly meetings; Quick code reviews; Clear acceptance criteria on user stories; ...*
- LLMs: helped explain how the pytest plugin relates to the different TOML files (the list is long)
**🪨 Rocks (risks for next sprint)**
- *Example: Team member K on vacation; Dependency on external API L; Technical debt in M; ...*
- LHCb week
- Hackathon
---
### Previous Sprints
#### Summary
- June 11th:
- *20 Story Points / 3.8 people = 5.2 velocity*
- Comments:
- Big difference between the expected velocity and the actual one.
- May 28th:
- *16.2 Story Points / 4.2 people = 3.8 velocity*
- Comments:
- Big difference between the expected velocity and the actual one.
- CHEP planning and long weekends (CH, France) largely affected it.
- May 14th:
- *11 Story Points / 4 people = 2.75 velocity*
- Comments:
- Lowest velocity since we started, but:
- RSS end of phase1 is actually trickier than what we initially thought
- Many people having to start preparing presentations (CHEP, LPC retreat)
- Many people are working on tasks that are not in the scope of the sprint (to prepare future sprints): LHCb commands to replace the workflow modules, integration of CWL job submission endpoint within diracx
- A CI failure in DIRAC prevented PRs from being merged
- April 30th:
- *42 Story Points / 4.1 people = 10.2 velocity*
- Comments:
- ~1/4 of the counted SP come from the integration of the tasks
- `diracx-tasks` are here :tada:
- All the essential components are here to transition now.
- RSS Phase1:
- Should have been completed but it's still under development. [name=Loris] any blocking point?
- New Matcher:
- working on a v0.2 schema design that spells out in more detail what we want
- April 16th:
- *20 Story Points / 2.6 people = 7.7 velocity*
- Comments:
- Fewer people available during the sprint (holidays, CTAO had deadlines). Also, some people spent more time than originally declared, and others less.
- A lot of bug fixes that were not planned
- April 2nd:
- *37 Story Points / 3.1 people = 11.9 velocity*
- Comments: NTR
- March 19th:
- *23 Story Points / 3.5 people = 7.2 velocity*
- Comments:
- Need to adapt the velocity computation because we are processing a lot of tasks not planned originally in the sprint (which is expected since we still have a lot of PRs without any attached issue to process, ...)
- March 5th:
- *19 Story Points / 2.8 people = 6.8 velocity*
- Comments:
- Fewer people available during this sprint, but more realistic expectations: we almost reached the expected velocity!
- LHCb AI hackathon: [name=Alexandre] was much less available than expected.
- Took into account items that were in progress before scrum process (added some SP): resurrecting diracx-web, RSS simplified...
- February 19th:
- *38 Story Points / 4.4 people = 8.6 velocity*
- Comments:
- French holidays
- [name=Alexandre] was more available than expected, but did not manage to quickly follow all the PRs.
- A few tasks have been delayed (10 SP): waiting for further discussion on scheduling and diagrams for new LHCb workflows
- A lot of "unplanned" items: expected as long as we have to deal with the large backlog of old items.
- February 5th:
- *29 Story Points / 3.1 people = 9.4 velocity*
- Comments:
- LHCb-CERN had a computing workshop
- Various people worked on old PRs I did not take into account :warning:
- January 21st:
- *6 Story Points / 2.5 people = 2.4 velocity*
- Comments:
- LHCb-CERN had a team retreat, LHCb-Spain had a conference.
- January 7th:
- *15 Story Points / 3.9 people = 3.8 velocity*
- Comments:
- No specific comment, the sprint was split by the holidays.
- December 10th:
- *6 Story Points / 3 people = 2 velocity*
- Comments:
- About the same as the previous sprint: still a gap between expected/actual availability
- November 26th:
- *6 Story Points / 3 people = 2 velocity*
- Comments:
- Much lower than the previous sprint because it included tasks started before the sprint.
- Lots of "almost done" PRs: we are improving the description of the tasks and their size but still not enough (each task should bring value though).
- November 10th:
- *22 Story Points / 4.3 people = 5.1 velocity*
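
The per-sprint figures above can be cross-checked with a short sketch. It assumes that velocity is delivered story points divided by people, and that the average quoted in the Planning section is the plain (unweighted) mean over all recorded sprints, June 25th included; neither assumption is stated explicitly in these notes. Recomputed ratios can differ slightly from the rounded values written above (for example, 23 / 3.5 gives about 6.6 for March 19th).

```python
from statistics import mean

# (sprint end date, delivered story points, people), as recorded in these notes.
SPRINTS = [
    ("June 25th", 18, 4.2),
    ("June 11th", 20, 3.8),
    ("May 28th", 16.2, 4.2),
    ("May 14th", 11, 4),
    ("April 30th", 42, 4.1),
    ("April 16th", 20, 2.6),
    ("April 2nd", 37, 3.1),
    ("March 19th", 23, 3.5),
    ("March 5th", 19, 2.8),
    ("February 19th", 38, 4.4),
    ("February 5th", 29, 3.1),
    ("January 21st", 6, 2.5),
    ("January 7th", 15, 3.9),
    ("December 10th", 6, 3),
    ("November 26th", 6, 3),
    ("November 10th", 22, 4.3),
]


def velocity(points, people):
    """Story points delivered per person (FTE) over one sprint."""
    return points / people


# Velocity of each sprint, keyed by its end date.
velocities = {name: velocity(points, people) for name, points, people in SPRINTS}

# Unweighted mean over all sprints (assumption: this is how the average is derived).
average_velocity = mean(velocities.values())

for name, value in velocities.items():
    print(f"{name}: {value:.1f}")
print(f"Average velocity: {average_velocity:.1f}")
```

Under these assumptions the mean comes out at roughly 5.8, consistent with the average velocity quoted in the Planning section.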
#### Actionable Results from the Retrospective
- **Action:** Developers should start reviewing PRs if they have some time available and they finished their tasks (at least the first passes)
- Owner: developers
- When: Sprint17
- Status: 11/06/26 In Progress
- **Action:** Share a more detailed roadmap that we should all understand.
- Owner: architects
- When: Sprint17
- Status: 11/06/26 In Progress
- **Action:** Reviewer has to double check titles of the PRs before merging.
- Owner: reviewers
- When: Sprint16
- Status: 11/06/26 In Progress
- **Action:** Feature PRs should be thoroughly tested in certification.
- Owner: developers
- When: Sprint12
- Status: 29/04/26 In Progress
- **Action:** Avoid verbose (AI-generated) issues with many implementation details that can become outdated over time.
- Owner: developers and product owners
- When: Sprint11
- Status: 15/04/26 In Progress
- **Action:** Better view of the PRs ready to be reviewed vs needing changes.
- Owner: developers
- When: Sprint8
- Status: 15/04/26 In Progress
- **Action:** Better communicate when a PR is going to be big, as soon as possible. Split the work in this case.
- Owner: developers
- When: Sprint6
- Status: 21/01/26 DONE
- **Action:** Better use of the mattermost channel to get reviews on a given PR
- Owner: everyone
- By when: Sprint3
- Status: 04/02/26 DONE
- **Action:** Define estimates and velocity based on Sprint2's results, taking into account external contributions (bonus Story Points) and availability
- Owner: alexandre
- By when: Sprint3
- Status: DONE
- **Action:** Better define the scrum roles
- Owner: alexandre
- By when: Sprint5
- Status: DONE
- **Action:** Better define `DONE` criteria (what should be included into the PR, and how to make sure we are not introducing too much technical debt)
- Owner: everyone
- By when: Sprint2
- Status: DONE
- **Action:** Avoid planning dependent tasks in the same sprint
- Owner: everyone
- By when: Sprint2
- Status: DONE
## AOB