eulake technical coordination

eulake coordination meeting 24th April 2018

Participants: Crystal (Aarnet), Brian (RAL), Andrew (NIKHEF), Wybrand (SARA), Jean-Marie (SARA), Simone (CERN), Xavi (CERN)

  • Coordination update:
    • The WLCG Data Lake R+D project was presented at the 2nd GEANT SIG-CISS meeting in Amsterdam, thanks to an invitation from Guido.
      • Meeting agenda and notes available: https://wiki.geant.org/display/CISS/2nd+SIG-CISS+meeting 
    • Agreed to change the start time of the meeting; the new start time is 14:30 (more reasonable for our Melbourne colleagues)
    • Next meeting proposed for the 8th of May
  • Round table
    • NIKHEF
      • The first storage nodes joined the eulake: two nodes with two filesystems each, totalling 2x2x14 TB.
      • Installation was driven by Andrew. After first installing a full vanilla EOS instance (with namespace, MQ and all the components), the current deployment runs only the minimal software needed to expose the disks, i.e. the EOS diskserver daemon.
      • It has been agreed to keep installation documentation and experiences in a Twiki, where the different ways to set up disks across the lake can be summarised and explained; this info will be very useful for new sites/people joining. The Twiki should be editable by our eulake-devops egroup (let me know otherwise!):
    • RAL
      • Brian presented our Data Lake R+D project at GridPP, which drew much interest from the audience.
      • Next item in the planning is to expose the Ceph pools directly with RADOS, avoiding the NFS mount used at present (a minimal librados sketch follows at the end of the minutes).
        • This is particularly interesting as the EOS developers are releasing a new RADOS plug-in (ExOS) in the upcoming release, tuned for EC pools, since the current XRootD Ceph plug-in is heavy/slow for small reads/writes.
    • Aarnet:
      • The nodes in Melbourne are now writable after setting up the firewall correctly; storage nodes in Perth and Brisbane will probably come in about 4 weeks.
      • Special thanks to Crystal for solving the firewall issues and attending the meeting after midnight in Melbourne!
    • SARA
      • SARA added three new filesystems, one extra FS on each of the three existing nodes. The particularity of these new FS is that they are dCache pools exported via NFS.
      • Wybrand reported that overwriting should not be possible on these pools, so the testing scenario is quite interesting right now given the big diversity we have in this early-stage lake:
        • Native EOS diskservers
        • EOS diskservers exposed via containers
        • EOS diskservers based on CEPH volumes exported via NFS
        • EOS diskservers based on dCache volumes exported via NFS
    • CERN
      • The Aarnet nodes needed firewall tweaking: the Aarnet diskservers could not be contacted, resulting in "partial" copies: warning=failed to open remote stripe: root://crlt-cext8.cdndev.aarnet.edu.au:1095
        • Thanks to this we spotted a bug: as long as N-2 copies can be granted, the copy command returns zero and no warning about the redundancy deficit is issued. A bug report was filed and reported to the developers (a replica-count check is sketched at the end of the minutes).
      • Starting the implementation of real experiment workflows, with the goal of comparing data access once eulake reaches a meaningful size.
      • Functional tests ongoing - 7 sites with usable storage: Dubna, SARA, RAL, NIKHEF, Aarnet, CERN-Meyrin and CERN-Wigner
      • Simone proposed to start running tests between our eulake and external "standard" storage (see the transfer-test sketch at the end of the minutes):
        • xrdcp TPC (Third Party Copy)
        • FTS
        • gridftp
        • Data deletion
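
As a rough illustration of the RAL item above (exposing the Ceph pools directly with RADOS instead of an NFS mount), the sketch below shows what direct object access to a Ceph pool looks like through the Python librados bindings. The pool name, object name and ceph.conf path are placeholders, and this is not the ExOS/XRootD plug-in itself, only the underlying librados API such a plug-in builds on.

    # Minimal sketch of direct RADOS access to a Ceph pool, no NFS mount involved.
    # Assumes python-rados is installed and /etc/ceph/ceph.conf points at the cluster;
    # "eulake-pool" and "test-object" are placeholder names.
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('eulake-pool')              # open the target pool
        try:
            ioctx.write_full('test-object', b'hello eulake')   # write a whole object
            data = ioctx.read('test-object')                   # read it back
            print('read back %d bytes' % len(data))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()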
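
Related to the redundancy-deficit bug spotted with the Aarnet nodes, one possible client-side safety net is to count the replicas actually registered after a copy and compare with the expected layout. The sketch below assumes that the monitoring output of "eos fileinfo <path> -m" contains one "fsid=" entry per stored replica; the exact output format can differ between EOS versions, and the MGM URL and test path are placeholders, so treat it purely as an illustration.

    # Illustration only: warn when a freshly written file has fewer replicas than the
    # layout requires. The assumption that "eos fileinfo -m" prints one fsid= per
    # replica may not hold for every EOS version.
    import subprocess

    def replica_count(path, mgm='root://eulake.cern.ch'):
        out = subprocess.run(['eos', mgm, 'fileinfo', path, '-m'],
                             capture_output=True, text=True).stdout
        return out.count('fsid=')

    expected = 2  # e.g. a plain two-replica layout
    found = replica_count('/eos/eulake/tests/1GB.dat')
    if found < expected:
        print('WARNING: only %d of %d replicas present' % (found, expected))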
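
For the proposed tests between eulake and external "standard" storage, below is a rough sketch of a single xrdcp third-party-copy plus deletion round, driven from Python around the standard xrdcp/xrdfs clients. Both endpoints and paths are made-up placeholders, and FTS or gridftp transfers would go through their own clients, which are not covered here.

    # Rough sketch of one functional-test round: third-party copy from an external
    # XRootD endpoint into eulake, then deletion. Endpoints and paths are placeholders.
    import subprocess

    SRC = 'root://external-se.example.org//store/test/1GB.dat'   # hypothetical external storage
    DST = 'root://eulake.cern.ch//eos/eulake/tests/1GB.dat'      # hypothetical eulake endpoint

    def run(cmd):
        """Run a command and return (exit code, combined output)."""
        p = subprocess.run(cmd, capture_output=True, text=True)
        return p.returncode, p.stdout + p.stderr

    # Third-party copy: the data flows directly between the two storage endpoints.
    rc, out = run(['xrdcp', '--tpc', 'only', '--force', SRC, DST])
    print('TPC copy rc=%d' % rc)

    # Deletion test against the destination namespace.
    rc, out = run(['xrdfs', 'eulake.cern.ch', 'rm', '/eos/eulake/tests/1GB.dat'])
    print('deletion rc=%d' % rc)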