Rucio Development Meeting
15:00
→
15:10
News 10m
- Meeting format
- Protected material
- Pre-Filling of minutes and topics to discuss
- Meeting room
- ATLAS P1 for 2018
- 2019 meeting room closer to main building
- New Rucio project logo
- Will be integrated in the website
- Please use this one for presentations etc.
- 2nd Rucio community workshop
- Call for Hosting until Oct 19
- Interest for a Rucio Coding camp in December 2018?
- 2 days at CERN
- Pre-select some specific developments and use two focused days to work on these
15:10
→
15:30
News from the experiments 20m
- CMS developers at DUNE workshop at the moment
- Currently working on Kubernetes setup
- DUNE workshop in Edinburgh - Cedric attended
- Possible manpower to do some development in Rucio, possibly on object stores
- Data Management session
- Cedric showed slides
- Tape management different - More similar to STAGING area
- Ian at RAL migrating to PostgreSQL
- Migration recipe - Should put it on ReadTheDocs
- Oracle schema from ATLAS DBA
- Iterating through it to get it in sync with github/schema.sql
- News on CMS instance
- Everything is set up on the Kubernetes cluster except authentication
- No certificate to move data around
- LIGO / IceCube installation
- Making progress: Moving files between RSEs
15:30
→
15:50
Hot topics 20m
- Hanging Judge evaluator
- Recently a hanging judge evaluator led to some unwanted replica deletion
- As the application of replication rules (replica locks) to files is done asynchronously for performance reasons, a specific workflow led to the deletion of replicas which should not have been deleted
- 1: File1 with Replica1A gets added to DatasetX (which has a rule for RSEA)
- 2: DatasetX gets deleted --> Rule gets removed --> Replica1A gets a tombstone
- 3: File1 gets added to DatasetY (which has a rule)
- As the Judge evaluator was hanging, Replica1A never got the tombstone removed
- 4: As RSEA is full, Replica1A gets deleted within 3-4 hours
- This was the first time the evaluator was hanging since 2014
- The workflow was adapted to not remove DatasetX immediately, but give it an expiration of 6h
- Discuss if #1578 should be implemented, which would prevent the reaper from deleting data if it detects a backlog in the evaluator
- --> Go ahead with the check
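
The check agreed on above can be sketched as follows. This is a minimal illustration, not Rucio code: the `Replica` class, the `select_deletable` function, the `oldest_pending_evaluation` timestamp and the one-hour `max_backlog` threshold are all hypothetical names and values chosen for the example. The idea is that the reaper refuses to delete anything while the evaluator has work pending for longer than the threshold, since a tombstone may be stale in that case.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Replica:
    name: str
    rse: str
    # Set when no rule protects the replica any more (hypothetical field).
    tombstone: Optional[datetime] = None


def select_deletable(
    replicas: List[Replica],
    now: datetime,
    oldest_pending_evaluation: Optional[datetime],
    max_backlog: timedelta = timedelta(hours=1),  # assumed threshold
) -> List[Replica]:
    """Return the replicas a reaper may delete, or none if the evaluator lags."""
    # Backlog guard: if the oldest unevaluated attachment is too old, a
    # tombstone might belong to a file that is about to be locked again.
    if (
        oldest_pending_evaluation is not None
        and now - oldest_pending_evaluation > max_backlog
    ):
        return []
    # Otherwise only replicas whose tombstone has expired are eligible.
    return [r for r in replicas if r.tombstone is not None and r.tombstone <= now]
```

With this guard, Replica1A from the workflow above would survive step 4, because the pending attachment of File1 to DatasetY counts as evaluator backlog.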
15:50
→
16:20
Developers roundtable 30m
- Custom configuration / policy / permission files #533
- Currently part of the repository & package
- Should be removed from main repository and made simple for users to add
- Load the module or path in the configuration (Can be path or module)
- Protection of sources by the reaper #1637
- Currently the reaper is potentially over-protective of source replicas
- Changes possible/necessary
- If 1 source, do not allow deletion
- If 2 sources or more, always keep the alphabetically first one
- Issue for alphabetical selection?
- Should not be, as replica becomes eligible after transfer finished
- Python 3 plan #1505 and #67
- Via pylint --py3k
- Hannes will prepare a script
- Rucio sometimes returns a 404 for a list-replicas call, even if the file exists #1568
- Rerun arcls against the integration service and try to find the cause from the log file
- Message payload #48
- Add new column (CLOB) and only write messages in there
- Only context switch when the non-clob column is empty
- ZIP Files #1091
- Open ticket for list_replica inconsistencies
- Conveyor-consumer wrong handling of multi source jobs #704
- In preparation
- Kubernetes
- traefik released now, needs to be integrated at CERN
- For WebUI minimal work, for authentication server more complex
- Database full for DOMA instance
- message_history, rules_hist_recent, requests_history
- Nagios probes #1638
- Need to evaluate what we want to use instead of Nagios
- Should be a separate repository for probes (maybe a common one)
- Also probes which create network metrics
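
The source-protection rule proposed under #1637 above can be sketched as below. This is only an illustration of the rule as written in the minutes, not the reaper's actual implementation; the function name `deletable_sources` and the use of plain RSE name strings are assumptions made for the example.

```python
from typing import List


def deletable_sources(source_rses: List[str]) -> List[str]:
    """Return the RSEs whose source replica may be deleted.

    Rule from the minutes: with a single source nothing may be deleted;
    with two or more, the alphabetically first source is always kept.
    """
    if len(source_rses) <= 1:
        # Zero or one source: deleting it would leave the transfer without a source.
        return []
    keep = min(source_rses)  # alphabetically first RSE name
    return sorted(rse for rse in source_rses if rse != keep)
```

Because the choice depends only on the RSE names, every reaper instance picks the same replica to keep without any coordination.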
16:20
→
16:30
AOB 10m
- Development meeting schedule for 2018
- Development meeting every week except
- 15. Nov 2018 - SC'18
- 13. Dec 2018 - ATLAS Software & Computing week
- Maybe CMS people can join ATLAS DDM Session?
- 20. Dec 2018 - pre CERN-closure ???
- People might be on vacation already - will check closer in time
- 27. Dec 2018 - CERN closure
- Any CMS constraints?
- Computing week in 2 weeks, but CMS guys would like to join Rucio Dev Meeting in person
- First meeting in 2019: 10. Jan 2019
- Dev meeting on the 25. Oct 2018:
- Rucio 1.19.0 "Fantastic Donkey" release planning