WLCG MW Readiness WG 15th meeting Minutes - January 27th 2016

WG twiki

Agenda

Summary

  • Two Volunteer sites are verifying DPM for ATLAS (different configurations).
  • New StoRM v.1.11.10 is out for testing.
  • New ARGUS EL7 rpms are available on github.
  • Experiments still using the deprecated gfal1 are invited to use the MW Readiness WG to migrate to gfal2. gfal2 v.2.10.3 is now available on /cvmfs/grid.cern.ch.
  • Experiments are invited to announce their plans concerning EL7 and/or CentOS7.
  • The MWR JIRA dashboard shows per experiment and per site the product versions pending for Readiness verification.
  • The next meeting will contain a dedicated discussion on the MW Readiness App. Suggested dates are 9/3 or 16/3. Both dates are fine for ATLAS, Napoli and the participants of the 27/1 meeting. Preferences from others, please?

Attendance

  • local: Maria Dimou (chair & notes), Maarten Litmaath (ARGUS report), Andrea Manzi (MW Officer), Lionel Cons (MW Readiness software developer)
  • remote: Andrea Sciabà (CMS), Antonio Yzquierdo (PIC), Matt Doidge (Lancaster).
  • apologies: Alessandra Doria (Napoli), Frederique Chollet (LAPP Annecy), David Cameron (ATLAS), Catherine Biscarat (French grids).

Minutes of previous meeting

The minutes of the last (14th) meeting HERE are accepted as such because the meeting was virtual.

Verification status report

ATLAS workflow Readiness Verification Status:

| MW Product | Version | Volunteer Site(s) | Comments | Verification status |
| DPM | 1.8.10 | LAPP Annecy | They are testing SRM-less DPM + gridftp redirection + stagein/out using http | On-going; some configuration problems were discovered and fixed; stagein/stageout via http is still not possible because it requires some changes to the ATLAS pilot |
| DPM | 1.8.10 | Glasgow | CentOS7 verification | Issues at the site, still not ready for testing |
| FTS | 3.4.1 | CERN | 3.4.0 done. Also for CMS and for CentOS7 as well | On-going |
| StoRM | 1.11.10 | CNAF & QMUL | Just released in the italian grid github | CNAF is installing it, no answer from QMUL |
| dCache | 2.10.47 | Triumf | Tested also the bringonline from tape | Completed |
| BDII | 5.2.23 | GRIF-IRFU & Brunel | Both TopBDII and SiteBDII tested on CentOS7. Released in UMD4 | Completed before the openldap bug hit |
| gfal2 & gfal2-utils | 2.9.3 | Napoli | - | stagein/stageout works ok with gfal2 taken from the ATLAS cvmfs; now studying how to test new versions (grid.cern.ch or local installation) |

CMS workflow Readiness Verification Status:

| MW Product | Version | Volunteer Site(s) | Comments | Verification status |
| DPM-xrootd | 3.6.0 | GRIF-LLR | performance improvements for federation | tested manually by A.Sartirana with AAA |
| EOS | 4.0.8-citrine | CERN | Next version integrating xroot4 | On-going |
| dCache | 2.14.5 | PIC | some network issues caused transfers failing for some time | Completed |
| dCache | 2.14.8 | PIC | - | Completed |
| gfal2 & gfal2-utils | ? | GRIF-LLR | No news since mid-December JIRA:MWREADY:101 | Pending for verification |

During the discussion on how to 'activate' the gfal2 testing for CMS, which currently appears stalled at GRIF-LLR, Matt D. offered his email m.doidge@lancasterNOSPAMPLEASE.ac.uk for any technical exchange needed. After the meeting Andrea S. learned from Nicolò Magini that very soon no CMS tool will still be using lcg-utils; gfal-utils will be used everywhere. Nevertheless, many sites unfortunately still use lcg-cp for the local stageout and only a few use gfal-utils. It would be good to ask a volunteer site to use the latest gfal-utils for PhEDEx and/or for the local stageout. In any case, CMS makes only basic use of gfal-utils, so, in the worst case, CMS would trust the validation done for other VOs. Raul Lopes (Brunel) emailed the following info: 'UKI_London_Brunel does stage-out using xroot. The SAM tests, however, are still using lcg_cp.'

Maria D. asked what will happen with the Prometheus platform for dCache testing at DESY, as per JIRA:MWREADY:36. The ticket is almost 1 year old and multiple reminders brought no updates. The issue that prevents progress is that the dataset should not be scratched at night, and no way has been found so far to avoid this. To be followed up by Andrea M., Andrea S., David Cameron and Paul Millar.

Andrea M. asked if experiments plan to test the WN or the UI on CentOS7. The discussion showed that this may be premature, given that the UI is expected to appear in UMD in several months, but we can start gathering input.

WLCG MW Readiness Software Status

  • The pakiti client is installed in all Volunteer sites as per the last status report.
  • After some testing of the MWReadiness app v0.3 at https://wlcg-mw-readiness.cern.ch some improvements have been discussed (e.g. JIRA:MWREADY:111; other tickets will follow)
    • not so much time to work on this during the last 2 months
    • the plan is to have v0.4 ready for the next WG meeting together with the software documentation
    • Integration with SSO (JIRA:MWREADY:50) has been postponed so far because it affects only admin access. Implementation to be evaluated for v0.4 or later versions
  • We need to start planning the deployment in production of the pakiti client + MW Readiness App dedicated to WLCG production, to be discussed and presented during the next WG meeting

Sites' feedback

  • LAPP Annecy: The configuration of mysql db on the testbed was fixed on 25/1. Follow-up in JIRA:MWREADY-104. Transfers work fine. Jobs fail. Issue in the hands of ATLAS Rucio experts.

  • French Grids: Question from Catherine B.: On the baseline web page (https://wlcg-mw-readiness.cern.ch/baseline/current/), I see that the DPM baseline is still 1.8.9. CMS is pushing for DPM 1.8.10. Shall sites supporting LHC VOs other than CMS go for it or stay with 1.8.9? Answer: The baseline view will be updated to DPM 1.8.10.

  • PIC report:
    • MWR storage running dCache 2.14.8
      • 2.14.9 released today, moving to it this week.
    • CMS test (Phedex Dev) transfers:
      • CERN, GRIF -> PIC running fine
      • PIC -> * transfer quality has improved over the last weeks, but intermittent errors remain on both links: PIC -> GRIF with "Failed to connect to llrpp01.in2p3.fr connection refused" and PIC -> CERN with sporadic "open/create files on eos" errors.
    • CMS HC test jobs continue running fine
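
The baseline question from the French grids above comes down to ordering dotted version strings such as 1.8.9 and 1.8.10, where a plain string comparison gives the wrong order. The following is a minimal sketch, assuming purely numeric dotted versions (a tag such as 4.0.8-citrine would need extra handling). The names `parse_version` and `below_baseline` are hypothetical and are not taken from the MW Readiness App.

```python
def parse_version(text):
    """Split a dotted version such as '1.8.10' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def below_baseline(installed, baseline):
    """Return True when the installed version is older than the baseline."""
    return parse_version(installed) < parse_version(baseline)


# Versions quoted in these minutes (DPM baseline discussion).
print(below_baseline("1.8.9", "1.8.10"))  # numeric, component-wise comparison
print("1.8.9" < "1.8.10")                 # plain string comparison, for contrast
```

Comparing integer tuples ranks 1.8.10 after 1.8.9, whereas the string comparison on the last line orders them the other way round because it compares "9" with "1" character by character.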

Report from the ARGUS meeting

  • main items for MW Readiness:
    • EL7 rpms are available on github
    • SL6 rpms will be added
    • both sets will go into the UMD-4 preview repo to facilitate testing
    • CERN will deploy the EL7 rpms on a QA node in the Argus cluster
    • if all looks well, the cluster will be upgraded to EL7

  • Argus at CERN
    • the service has not broken down since the gridmapdir NFS mount options were changed
      • attribute caching was switched on
    • low error rates have still been observed
      • hopefully the EL7 release with updated deps will get rid of them

Actions

Action items Done from past meetings can be found HERE.

  • 20160127-03: Ben to install the ARGUS EL7 rpms available on github.
  • 20160127-02: Andrea S. and David C. to obtain their experiments' plans concerning EL7 and/or CentOS7.
  • 20160127-01: Andrea M., Andrea S., David C., Paul M. to see how the nightly data scratch can be handled so that the Prometheus dCache tests can start JIRA:MWREADY:36.
  • 20150318-02: Ben to set up the ARGUS testbed at the T0. The testbed is there, the load testing is in the list but Postponed
  • 20141119-03: Andrea M. and Andrea Sartirana to discuss how the GRIF-LLR Volunteer site can proceed with gfal2 and WN testing via the CMS workflow In progress

Next meeting

  • 9/3 or 16/3?

AOB

-- MariaDimou - 2016-01-18

Topic attachments

  • HC_test_T1_ES_PIC.png (32.0 K, 2015-06-16, AntonioPerezCalero): HC jobs reading from dcache validation storage at PIC
Topic revision: r73 - 2018-02-28 - MaartenLitmaath