HEPiX IPv6 working group F2F meeting

Europe/Zurich
600/R-002 (CERN)

Dave Kelsey (STFC - Rutherford Appleton Lab. (GB))


WIFI access: pre-register your MAC address here (contact person: David Kelsey)

DRAFT agenda. Topics may change. Timings are approximate. 

Participants
  • Alastair Dewhurst
  • Andrea Sciaba
  • Bruno Hoeft
  • Costin Grigoras
  • David Kelsey
  • Duncan Rand
  • Edoardo Martelli
  • Francesco Prelz
  • Kars Ohrenberg
  • Raja Nandakumar
  • Ulf Bobson Severin Tigerstedt

HEPiX IPv6 Face to Face meeting 2016-05-15 - Minutes

Agenda: https://indico.cern.ch/event/615970/

 

Attending:

• Alastair Dewhurst 

• Andrea Sciaba 

• Bruno Hoeft 

• Costin Grigoras 

• David Kelsey 

• Duncan Rand 

• Edoardo Martelli 

• Francesco Prelz 

• Kars Ohrenberg 

• Ulf Bobson Severin Tigerstedt 

 

Matters arising from last meeting

Matters arising at the 12 April 2017 meeting:

a) The main CHEP2016 paper needs to be revised and resubmitted. Alastair will propose the changes.
b) DPM issues. See GGUS ticket. (GGUS: 127285) Globus, FTS and dCache all break the RFC. Andrea to talk to Oliver Keeble. Ulf and Raul(?) report problem to IETF.
c) CMS testing of IPv6-only at Brunel - on hold. Any news?
d) Edoardo to add 2 columns to Tier 1 table  (done)
e) Need to work on plan for training half-day in Manchester - June 2017.
f) All to help Andrea with documentation on deployment of IPv6 at Tier 2s

 

Roundtable updates

 

Andrea Sciaba (CMS): Still testing glideinWMS job submission. Still have problems; have updated to the latest version of HTCondor. The server attempts to make a connection to the WN from the outside, which needs an extra component. Only one of the xrootd redirectors in Europe is dual-stack; it failed, causing problems, because it acts as a single point of failure.

 

Costin (ALICE): Recently had a workshop with sites, at which IPv6 was asked about. Most sites are aware but few have done anything concrete - details on agenda. Oxford will fully support IPv6 in the next 6 months. RAL T1 is making good progress but is awaiting the xrootd-ceph plugin, so no ALICE support as yet. Ceph can accept names without a 'slash' in them but xrootd cannot. This has nothing to do with IPv6 and is unique to ALICE. Working with CMS at the moment, ALICE is next. (ALICE uses third-party copying to transfer data.) US sites (ORNL, LBL) are awaiting EOS support for IPv6 (should be soon). Romania - work in progress. IN2P3 - Tier-1 has dual-stack storage. INFN will act in a coordinated way, but has no exact estimate. KISTI within a month. Japan - little progress. NDGF - IPv6 has been working for some time.

 

Francesco - At Milan he has not been given permission to dual-stack things yet. Reminded us that Terry had reported that Vidyo doesn’t work with NAT64. Edoardo reported that the Vidyo developers say it won’t work with IPv6 anyway.

 

Bruno: The server team is setting up a new dual-stack dCache file system.

 

CERN: storage team is ready with EOS4.1 (supports IPv6). Will start with LHCb instance this week. If OK will plan upgrades for the other experiments. 

Francesco: What about the DHCPv6 RFC that has come out? Edoardo: we’ve been testing it locally and are moving forward. There is a new IETF working group on DHCPv6 version 2. 

 

Duncan - v4 of perfSONAR was recently released. Most hosts will have updated the software automatically. Duncan has been helping UK sites to get the (dual-stack) mesh working well after the new release. Traceroute results now appear in the mesh http://maddash.aglt2.org/maddash-webui/index.cgi?dashboard=UK%20Meshconfig. The issue of making all the PS meshes show IPv6 data has been discussed recently.

 

NDGF: T1 working fine. Will deploy perfSONAR at each of the sites (it is a distributed T1).

 

Alastair (ATLAS): In March the PanDA server front ends were made dual-stack. Run jobs on IPv6-only WNs. Upgrading Frontier to dual-stack (numerous operational problems, not IPv6 related, have slowed things down). A small but non-trivial number of sites are accessing the PanDA servers via IPv6. BNL have recently (since Sept 2016) been actively implementing IPv6. They made their FTS dual-stack but had to roll back. BNL dCache is now dual-stack. Note: BNL is 25% of ATLAS T1s. Reported at BNL HEPiX meeting: https://indico.cern.ch/event/595396/contributions/2544103/attachments/1447978/2231941/BNL_Site_Report_HEPiX_Spring_2017.pdf

 

Fernando: Not much news from PIC. HTCondor has been put into production on fewer than 50 nodes. The whole HTCondor installation is dual-stack.

 

CHEP paper (Alastair)

 

There were a few corrections which have been implemented. 

 

Tier-0 & Tier-1 status

Two new columns have been added to the table http://hepix-ipv6.web.cern.ch/sites-connectivity, giving the percentage of dual-stack storage offered by the site by three dates. Several T1s have not filled this in yet, so Bruno created GGUS tickets (keyword WLCG-IPv6 Tier-1 readiness) enquiring as to progress. Bruno will summarise the status for Martin to present at the WLCG meeting tomorrow.

 

Dave K enquired about Francesco’s web page (http://orsone.mi.infn.it/~prelz/ipv6_bdii/) showing the number of WLCG hosts that are dual-stack. There has been a recent increase, seemingly at SurfSara.
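
As an illustration of what "dual-stack" means when counting hosts, the sketch below classifies a host from the addresses published for it. It is not the method used by the web page above; the function names are hypothetical, and the lookup helper assumes that the local resolver returns both IPv4 and IPv6 records.

```python
import ipaddress
import socket


def classify_addresses(addresses):
    """Classify a host from the IP addresses published for it."""
    versions = set()
    for address in addresses:
        # Drop any IPv6 zone index (e.g. "%eth0") before parsing.
        versions.add(ipaddress.ip_address(address.split("%")[0]).version)
    if versions == {4, 6}:
        return "dual-stack"
    if versions == {6}:
        return "IPv6-only"
    if versions == {4}:
        return "IPv4-only"
    return "unresolved"


def lookup_addresses(hostname):
    """Return the addresses the local resolver gives for hostname (needs DNS)."""
    try:
        infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})
```

For a real host one would call classify_addresses(lookup_addresses(name)). Note that a published IPv6 address only shows that the host is announced as dual-stack, not that its services are reachable over IPv6.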

 

There was a discussion about traffic in and out of CERN: https://netstat.cern.ch/monitoring/network-statistics/ext/?q=IPv6&p=EXT&mn=Internet&t=Monthly and progress on getting monitoring of IPv6 traffic on the LHCOPN.

 

Documentation

Andrea has put some information on the web, including notes on perfSONAR, dCache, etc., which support IPv6 ‘out of the box’. DPM was slightly more complex because of the redirect issue. http://hepix-ipv6.web.cern.ch/content/how-deploy-ipv6-wlcg-tier-2-site

 

Monitoring and supporting roll out of IPv6

It was suggested that the WLCG SAM tests could be used to check IPv6 status at sites. The main service to be tested would be the storage. A new profile for each LHC VO could be created for IPv6 and suitable tests created. The existing WLCG operations reporting could be used to summarise the state across the whole WLCG. How will sites moving to IPv6 be supported? We could use lcg-rollout - this is a rollout after all. Perhaps we should start with a small set of sites so as to keep things manageable.
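
A minimal sketch of what an IPv6 check on a storage endpoint could look like is given below. It is only an illustration, not an existing SAM test: it does a plain TCP connect over IPv6 rather than a real storage operation, the status codes follow the Nagios plugin convention (0 for OK, 2 for CRITICAL) as an assumption, and the function name and its parameters are hypothetical. The resolver and socket factory are passed in so that the logic can be exercised without network access.

```python
import socket

OK, CRITICAL = 0, 2  # Nagios-style status codes, assumed for illustration


def probe_ipv6(host, port, timeout=5.0,
               getaddrinfo=socket.getaddrinfo, make_socket=socket.socket):
    """Try a TCP connection to host:port using IPv6 addresses only."""
    try:
        # Restrict the lookup to IPv6 so an IPv4 path cannot mask a failure.
        infos = getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return CRITICAL, f"no IPv6 address for {host}: {exc}"
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in infos:
        sock = make_socket(family, socktype, proto)
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return OK, f"connected to [{sockaddr[0]}]:{sockaddr[1]} over IPv6"
        except OSError as exc:
            last_error = exc
        finally:
            sock.close()
    return CRITICAL, f"IPv6 connect to {host}:{port} failed: {last_error}"
```

A call such as probe_ipv6("se.example.org", 1094) (hostname and port are placeholders) returns a status and a message that a reporting layer could aggregate per site and per VO.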

 

 

DPM FTS issue

Oliver had previously asked a question about DPM redirection to another host and the RFC. The RFC had previously been ignored by others. Should we try to change the RFC? It is old, and its authors probably didn’t think about this use case when it was written. There is the possibility to submit an RFC errata report.

(minutes taken by Duncan Rand)

Minutes - Day 2

- Review of agenda. Shorten meeting to 11:00.

- Perfsonar?

- IPv6 tickets: a few sites responded to getting pushed.

 

- Training: two 1½ h sessions

   - Intro to IPv6 basics

   - Is my traffic on IPv6 or not?

   - Data transfer tests

   - Set up perfSONAR

   - Slide session on Wednesday afternoon

   - Edoardo will talk about DHCPv6

   - Cheat-sheet cards for subnetting

   - Ulf will do transfers

   - Should we explain why we are doing this?

   - Session on "how to prepare a T2 in general"

   - Hands-on perfSONAR

   - We need more sysadmins at the session

   - Preliminary F2F on 11-12 September

   - Vidyo meeting on 13 July

 

- DPM/globus problem. Packages in epel-testing about to be deployed.

(minutes taken by Ulf Tigerstedt)
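
For the subnetting cards mentioned in the training list above, a short worked example may help. The sketch below uses Python's standard ipaddress module to split a site prefix into /64 subnets; the /48 shown is taken from the IPv6 documentation range (2001:db8::/32) and the helper names are hypothetical, so real site allocations will differ.

```python
import ipaddress
from itertools import islice

# Example site allocation from the documentation prefix, not a real site.
SITE_PREFIX = ipaddress.ip_network("2001:db8:abcd::/48")


def subnet_count(network, new_prefix):
    """Number of subnets of length new_prefix that fit in network."""
    return 2 ** (new_prefix - network.prefixlen)


def first_subnets(network, new_prefix, count):
    """Return the first count subnets of length new_prefix, in order."""
    return list(islice(network.subnets(new_prefix=new_prefix), count))
```

With these definitions, subnet_count(SITE_PREFIX, 64) gives 65536, and first_subnets(SITE_PREFIX, 64, 3) gives 2001:db8:abcd::/64, 2001:db8:abcd:1::/64 and 2001:db8:abcd:2::/64: the fourth group of the address is the subnet number.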

 

    • 14:00 18:00
      Session 1
      • 14:00
        Introductions, agenda, note takers 10m
      • 14:10
        Review minutes and actions 10m

        Matters arising at the 12 April 2017 meeting:

        a) The main CHEP2016 paper needs to be revised and resubmitted. Alastair will propose the changes.
        b) DPM issues. See GGUS ticket. Globus, FTS and dCache all break the RFC. Andrea to talk to Oliver Keeble. Ulf and Raul(?) report problem to IETF.
        c) CMS testing of IPv6-only at Brunel - on hold. Any news?
        d) Edoardo to add 2 columns to Tier 1 table
        e) Need to work on plan for training half-day in Manchester - June 2017.
        f) All to help Andrea with documentation on deployment of IPv6 at Tier 2s

      • 14:20
        Roundtable updates 30m
      • 14:50
        Tier 0/1 status 40m
      • 15:30
        Coffee 30m
      • 16:00
        Tier 2 status 30m

        Current status?
        How do we determine/track the status?
        Guidance documentation - what? who? when?

      • 16:30
        Revising of the CHEP2016 papers 15m

        Comments from the reviewer with my replies are below:


        Thanks for the proceedings, interesting and easy to read. Just a few comments below, maybe worth considering. In particular, the references are misleading (not in the right order) and the experiment part might be revised: the balance between all 4 experiments is puzzling.
        Cheers.
        - page 1: country of author 11 missing in his affiliation
        Done, added Finland.

        - page 2: "In order to provide an incentive for sites to move it was", add a comma after "move"
        Done.

        - page 2: "in order for for IPv6-only...", remove one "for"
        Done.

        - page 3: add a space right before "[2]"
        Done.

        - page 4: reference [5] comes up right after reference [2]... please fix the order of the references
        Done (Needed to change order of references in reference section).

        - page 4: add a space right before "[6]"
        Done.

        - page 4: "All Tiers 1s" should be "All Tiers-1s" for consistency
        Done.

        - page 5, Figure 1: could the figure be made bigger? The caption should finish with a full stop (period).
        Caption has been corrected. Difficult to make the figure bigger without splitting it, and then it takes up a whole page and takes us over the limit. Best to leave it as is.

        - page 5: references "[8,7]" are not in the right order and a space should come before the "["
        Done.

        - page 6: reference [3]: not in the right order; why is there a reference for ATLAS and not for ALICE?
        Done. Reference added for AliEn.

        - page 6: "PanDA" or "Panda", check for consistency
        Done.

        - page 7: DIRAC, add a reference ?
        Done.

        - page 7: Tier-2D should be defined for LHCb
        - experiments: the balance between the 4 experiments is not good; one has the feeling there is not as much information about ALICE and ATLAS, compared to CMS and LHCb; is there really a difference?
        Removed D from Tier-2D. I have modified the text slightly. The ALICE and ATLAS use cases are simpler.

        - page 8: "(ii) EGI acknowledges...", this is strange to me; EGI is in the acknowledgement part and should be rephrased like (i), meaning by defining EGI first.
        Didn’t really know how to change this.

      • 16:45
        Guidance/documentation for Tier 2 deployment 30m
      • 17:15
        Current technical issues 15m
    • 09:00 13:00
      Session 2
      • 09:00
        Review agenda 10m
      • 09:10
        Plans for Training in Manchester WLCG workshop 50m
      • 10:00
        LHCOPN, LHCONE, perfSONAR 30m
      • 10:30
        Coffee 30m
      • 11:00
        Other issues 1h

        Monitoring of the transition
        Plans for WLCG Ops to take over Tier 2 deployment

      • 12:00
        AOB and next meetings 15m
      • 12:15
        Review decisions and actions 15m