UK Rucio F2F meeting

Europe/London
The Cosener's House
Alastair Dewhurst (Science and Technology Facilities Council STFC (GB))

Scope of UK activities

Various sites are doing different things:

  • RAL aim to produce a production-quality multi-VO instance
  • Imperial (Janusz) aim to integrate multi-VO DIRAC with Rucio
  • Edinburgh (Teng Li), DUNE monitoring
  • Edinburgh (James Perry), object store support

 

 

Monitoring

Teng presented his talk from GridPP.

 

‘System health’ (the internal operations of the various components) is monitored with Graphite. Tracing calls in the Rucio code send stats to pystatsd. The Rucio dev team are working on a pull request to replace pystatsd, as it will not run under Python 3. Both ‘statsd’ and ‘collectd’ seem to be under active development, with recent GitHub commits.

Data transfers, deletions etc. send info into Apache ActiveMQ (Java code, but not Java Message Queue or JMS).
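For reference, a minimal sketch of what a statsd-style metric looks like on the wire, using only the Python 3 standard library. This is not the Rucio code or the pystatsd API; the function names and the metric name are made up for illustration, and the host and port are the usual statsd defaults rather than anything discussed at the meeting.

```python
import socket


def format_statsd_metric(name: str, value: float, metric_type: str) -> bytes:
    """Build one statsd datagram of the form '<name>:<value>|<type>'."""
    # 'c' = counter, 'ms' = timer, 'g' = gauge (the common statsd metric types)
    if metric_type not in {"c", "ms", "g"}:
        raise ValueError(f"unsupported statsd metric type: {metric_type}")
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; the sender does not wait for any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Hypothetical metric name, for illustration only.
example = format_statsd_metric("rucio.example.requests", 1, "c")
```

Calling `send_metric(example)` would emit the datagram to a statsd daemon listening locally, which in turn would forward aggregated values to a backend such as Graphite.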

 

 

AENEAS / ESCAPE - Experience, Requirements, Plans and Schedule

Monitoring requirements:

  • Visual representation of functional tests. Provide info to determine why transfers are failing, e.g. to a particular site – has the link gone down, or is there a protocol mismatch?
  • Show levels of RSE usage. The WebUI offers this, but it doesn’t appear to work.
  • Break down usage by scope or account (e.g. are some users transferring/storing more data than others?).
  • Check ‘liveness’ of replication. This is useful when a state is stuck at ‘Replicating’, which may point to a problem with the transfer tools, or to an FTS link not having been generated because of a bug. Rucio doesn’t record this per se.
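As an illustration of the ‘liveness’ requirement above, a rough sketch of the kind of check a monitoring job could run. It assumes rule records have already been fetched from somewhere and reduced to plain dictionaries; the field names (`id`, `state`, `updated_at`), the 24-hour threshold and the sample data are all assumptions for the example, not the Rucio schema or client API.

```python
from datetime import datetime, timedelta, timezone


def find_stale_rules(rules, now, max_age=timedelta(hours=24)):
    """Return ids of rules that have sat in REPLICATING longer than max_age."""
    stale = []
    for rule in rules:
        # Only rules still replicating are candidates for being stuck.
        if rule["state"] != "REPLICATING":
            continue
        if now - rule["updated_at"] > max_age:
            stale.append(rule["id"])
    return stale


# Made-up records and an arbitrary reference time, for illustration only.
now = datetime(2019, 9, 20, 12, 0, tzinfo=timezone.utc)
rules = [
    {"id": "rule-a", "state": "REPLICATING", "updated_at": now - timedelta(hours=30)},
    {"id": "rule-b", "state": "REPLICATING", "updated_at": now - timedelta(hours=2)},
    {"id": "rule-c", "state": "OK", "updated_at": now - timedelta(days=5)},
]
print(find_stale_rules(rules, now))  # ['rule-a']
```

A dashboard could plot the count of stale rules per RSE over time, so that a stuck link shows up as a step change rather than needing someone to inspect individual rules.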

Functional requirements:

  • Permissions – provide isolation within the same VO. Some classes of user (particularly astronomers) will not wish to divulge their data to others. ESCAPE have development effort to work in this area. One suggestion is scope-local permissions.
  • Desire for co-location and popularity-based replication – informed by processing stats. (Is sufficient data presently gathered?)
  • Desire for a Rucio ‘lightweight client’ – I took this to be something self-contained to just up- or download data e.g. when using a new machine (could clarify with Rohini).
  • Workflow Management System integration – create an end-to-end use case.
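To make the ‘scope-local permissions’ suggestion concrete, a toy sketch of a deny-by-default read check keyed on scope. This is purely illustrative: the function, the mapping and the account and scope names are hypothetical, and it is not how the Rucio permission layer is implemented.

```python
def can_read(account, scope, scope_readers):
    """Allow reads only for accounts explicitly listed against the scope."""
    # Unknown scopes fall through to an empty set, so access is denied by default.
    return account in scope_readers.get(scope, set())


# Hypothetical scope-to-readers mapping within a single VO.
scope_readers = {
    "project_a": {"alice", "bob"},
    "project_b": {"carol"},
}

print(can_read("alice", "project_a", scope_readers))  # True
print(can_read("alice", "project_b", scope_readers))  # False
```

Logging every denied call to a check like this would also give a starting point for detecting accounts attempting to look at data in other groups' scopes.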

 

There have been problems with FTS links disappearing; the timeout on FTS Dev might be increased.

 

Plan to get data from more nodes, e.g. MeerKAT (SKA pathfinder): set up rules to move MeerKAT data to the UK. There was a suggestion to use Ceph replication, but access to e.g. processed data needs to be controlled, so Rucio replication with rules might be easier to sell than Ceph replication. MeerKAT talk at CERN Ceph Day, September 2019: https://ceph.com/cephdays/ceph-day-cern-2019/

 

 

 

ESCAPE

WP2 will create a data lake. WP5 will build a science analysis platform as a prototype of some of the SRC processing; this will show the utility of metadata-based searching.

 

There is an ESCAPE Rucio instance hosted at CERN. Rohini runs a separate Kubernetes instance for development.

 

Things they were wondering about:

  • Would there be a way to detect accounts looking at data ‘belonging’ to other accounts?
  • Does metadata searching require a separate metadata catalogue?
  • MeerKAT presentation at Ceph Day – S3 syncing IDIA to RAL? How could Rucio use ‘synced RSEs’?
  • Is it possible to move a Rucio instance between hosts, e.g. recreate rules, or export/import the Rucio configuration tables?

 

 

Multi-VO Rucio

Ian presented Andrew Lister’s slides from the GridPP meeting.

 

RAL aim to have a development instance for multi-VO DIRAC developers to work against.

 

Upgrade current instance to support multi-VO.

 

Object Store support

James has had pull requests accepted that allow direct access to object stores. Developers were worried that generating so many pre-signed URLs would use significant CPU time; however, this appears not to be the case.
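A rough way to get a feel for the signing cost is to time a batch of HMAC signatures. The sketch below uses a deliberately simplified HMAC-SHA256 scheme, not the real S3 signature algorithm and not the code from the pull requests; the URL, secret, expiry value and function names are all invented for the example.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode


def presign(base_url, key, secret, expires_at):
    """Return a URL carrying an expiry time and an HMAC over method, key and expiry."""
    message = f"GET\n{key}\n{expires_at}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{base_url}/{key}?{query}"


def verify(key, secret, expires_at, signature):
    """Recompute the HMAC and compare it in constant time."""
    message = f"GET\n{key}\n{expires_at}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


secret = b"not-a-real-secret"  # placeholder value
start = time.perf_counter()
urls = [
    presign("https://objectstore.example.org/bucket", f"file-{i}", secret, 1700000000)
    for i in range(10_000)
]
elapsed = time.perf_counter() - start
print(f"signed {len(urls)} URLs in {elapsed:.3f} s")
```

The absolute timing depends on the machine and on the real signing algorithm, so this only indicates the order of magnitude involved.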

 

James is updating the configuration to allow VO-specific permissions to be set more easily.

 

Development plans for integrating Multi-VO Rucio and DIRAC

The draft plan that was discussed at the DIRAC workshop in May 2019 was looked at again:

https://docs.google.com/document/d/1W5F3VZBtt3_J5ST6CadJDzHOjz7LMMAhcipY83wg3Xc/edit

 

Plan:

  1. Janusz is looking at the Rucio File Catalogue Plugin. At the GridPP Technical meeting on 11th October, Janusz is to present a proposal of how DIRAC will expect Rucio to behave (https://indico.cern.ch/event/849681/ ):
    1. Explain how DIRAC works.
    2. List DIRAC data management commands.
  2. Ian J to look into the RESTful interface for DMC commands.

 

 

AOB:

Recent GDB meeting:

https://indico.cern.ch/event/739882/

 

CHEP19:

Mario presenting: Evaluating user experience Rucio talk from .

RAL: Multi-VO Rucio work.

 

Rucio coding camp:

https://indico.cern.ch/event/819753/timetable/#20191015

 

Someone should attend the AENEAS close-out meeting. The UK should always have a presence at the Rucio development meeting.

    • 1 Introduction
      Speaker: Alastair Dewhurst (Science and Technology Facilities Council STFC (GB))
    • 2 Scope of UK Activities
      What are we trying to achieve as a UK Collaboration?
      Speakers: Alastair Dewhurst (Science and Technology Facilities Council STFC (GB)), Peter Clarke (The University of Edinburgh (GB))
    • 3 Rucio Monitoring
      Speaker: Dr Teng LI (University of Edinburgh, UK)
    • 10:40 Tea Break
    • 4 AENEAS / ESCAPE - Experience, Requirements, Plans and Schedule
      VC from Manchester
      Speaker: Rohini Joshi
    • 12:00 Lunch
    • 5 Status of Multi-VO Rucio development
    • 6 Plans to trial Multi-VO Service
    • 7 Development plans for integrating Multi-VO Rucio and DIRAC
    • 15:30 Tea Break
      Edinburgh team depart to catch train home