HEPiX IPv6 working group F2F meeting

Europe/Zurich
31/S-028 (CERN)

Description


WIFI access: pre-register your MAC address here (contact person: David Kelsey)

DRAFT agenda. Topics may change. Timings are approximate. 

The PIN for Vidyo has been distributed by email.

Participants
  • Andrea Sciaba
  • Catalin Condurache
  • Dave Kelsey
  • Duncan Rand
  • Edoardo Martelli
  • Francesco Prelz
  • Kars Ohrenberg
  • Martin Bly
  • Raja Nandakumar
  • Ulf Bobson Severin Tigerstedt
  • Xavier Espinal Curull

HEPiX IPv6 Working Group meeting - day 1 of F2F at CERN
- 6 Sep 2016.

Notes taken by Raja Nandakumar and a few mods made by Dave Kelsey.

Present: Andrea, Alf, Martin, Dave, Edoardo, Duncan, Raja, Francesco, Xavier, Jerome, Kars, Costin, Ulf.
Remote participants: Fernando, Antonio, Catalin, Jiri, Terry, Alastair.

Introduction :
--------------
Lawnmower (outside the meeting room!) : Brrrrr budbudbudbud.
Welcome to new people - Martin Bly (RAL), Antonio Falabella (CNAF) and to Jerome Belleman (CERN Tier 0).

Review / discussions / ... :
----------------------------
Dave K : Nothing to review. Minutes not written. Sorry!
Need to discuss Alastair's proposal to the WLCG MB to see where we are with the different responses.
Dave K to present an update to the MB on 20 Sept 2016
- Important to mention the lack of showstoppers
Also discuss planning for the written paper for CHEP2016
Tomorrow discuss migration of Tier-1s to dual stack, possibly as a function of service. A visually appealing display would be good!
-- (Duncan) FTS has added the ability to track whether a transfer went over ipv6 or ipv4. It is being propagated.

Lawnmower : Brrrrr budbudbudbud. Whine ...

Round table reports :
---------------------
Edoardo (CERN): Completed reconfiguration of campus network to optimise use of ipv6 addresses. Still have the old addresses for users who have static addresses - will be recovered in October. Deploying a new wifi system to give better coverage of area and users at CERN.

Lawnmower : Goodbye!

Duncan : Not much from Imperial. Brunel has ipv6-only WN (dc2-grid-25.brunel.ac.uk), which runs CMS SAM tests. Visible from production CMS dashboard.
Failing some tests - glexec (voms server issue), frontier (passed previously with correct client), xrootd access (was working at one point - problems with older versions of cmssw)
Warning for - WN basic test (gitHub@CERN is ipv4-only, to become dual-stack this month), squid (getting results from IC, but fails from RAL), xrootd-fallback (currently no dual-stack redirector for Europe)
Successfully staging out data.

Raja : LHCb is ipv6 compatible. Main needs going forward - CERN EOS to become dual-stack, voms-proxy-init to be fixed to work with ipv6 (java environment variable) and VOMS server(s) serving LHCb to be made dual stack.
Action1 : Raja to open a GGUS ticket about VOMS servers becoming dual stack for LHCb.
Action2 : Raja to investigate if the ipv6 java environment variable can be set in the LHCb software, while Francesco investigates the same issue from the lxplus / java end.
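
A minimal sketch of the kind of wrapper Action2 could lead to. The minutes do not name the Java variable, so this assumes it is the standard `java.net.preferIPv6Addresses` system property, passed through the `_JAVA_OPTIONS` environment variable that HotSpot-based JVMs read at startup. The helper name `env_preferring_ipv6` is hypothetical.

```python
import os


def env_preferring_ipv6(base_env=None):
    """Return a copy of the environment asking the JVM to prefer IPv6 addresses.

    Assumption: the Java client honours _JAVA_OPTIONS and the
    java.net.preferIPv6Addresses property.
    """
    env = dict(os.environ if base_env is None else base_env)
    flag = "-Djava.net.preferIPv6Addresses=true"
    existing = env.get("_JAVA_OPTIONS", "")
    # Append the flag only once, keeping any options that are already set.
    if flag not in existing.split():
        env["_JAVA_OPTIONS"] = (existing + " " + flag).strip()
    return env
```

The returned dictionary could then be passed as `env=` to `subprocess.run` when launching the Java-based client, which leaves the environment of the calling process untouched.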

Terry : Walk through presentation uploaded to the indico page.
https://indico.cern.ch/event/561262/contributions/2266892/attachments/1332046/2002283/20160902-gridpp37-ipv6testingatqmul.pdf
Action3 : Terry to put the WNs running pure-ipv6 on a separate queue and also circulate the list of these machines to the mailing list.

Francesco : Tinkered with using Jool for providing NAT64. Working on making the INFN CA dual-stack around the end of September (26 Sept or 4 October depending on extracurricular activities)

Kars : NTR for Desy. Usage relatively flat.

Xavier : No news from Storage side - proved that EOS talks to ipv6. Testing an endpoint for experiments to test. In discussions with Andrea for this.
For Castor, ipv6 access will be through dual-stack srm. For AFS, project is falling apart and CERN is leaving it. For CEPH, okay as long as clients are dual-stack. If the client is pure-ipv6, then the CEPH server also has to be pure-ipv6 (possible bug / feature somewhere here?). For CVMFS - not an issue.
(Alastair) Not yet tested at RAL on CEPH.

Costin : NTR for Alice

Andrea : Nothing to add for CMS

Ulf : egi.eu repository (run from Greece) is ipv4 only. GGUS ticket opened, but nothing going on there. Most (59) CAs still ipv4-only.
Action4 : Dave to raise the issue at the next IGTF meeting later this month.

Martin : Perfsonar box available. No news about the UK CA. About to try testing FTS and CVMFS in dual-stack testbed. Making changes to RAL central networking to make these changes easier. Plan to retire the "UK light" router in early October which will help all of this - the hardware will then be ipv6 compatible to start with. Tiju leaving and ipv6 work being taken over by Catalin.

Bruno : NTR for Desy

Antonio : NTR currently. Learning curve.

Catalin : Taking over from Tiju

Alastair : emailed all ATLAS Tier-1 contacts about the routing. Responses from RAL and NorduGrid. ATLAS computing coordinators suggest to open GGUS tickets.
(Fernando) did not receive any message from the ATLAS contact. Will follow up.
Doing various testing of dual-stacking of panda servers. Should be able to submit jobs soon (which will be able to go to dual-stack machines)

Fernando : NTR for PIC. In the last two hours, a peak of 6.5Gb/s ipv6 FTS transfer rates from PIC to Imperial.

Tier-0/1 status and feedback to MB :
------------------------------------
Jerome : Starting to think about dual-stacking various things, but nothing concrete as yet.
Alastair : CERN going dual-stack is very important and will give an impetus for the others.
Edoardo : BNL have taken first steps at getting ipv6 working there.
Discussion on how to present the proposal to the MB and chivvy along reluctant sites, with input from sites and experiments, especially with regards to the number of sites which should support dual-stack mode by 1 April 2017. The words are currently as vague as they can be while actually saying something.
LHCb : Can already start using the storage for MC simulation and some analysis
Alice : Calibration files are on SEs. So, even for MC, they need most sites to be dual-stacked. However, once they move to having the files on CVMFS like LHCb, they will also have similar traction like LHCb.
CMS : Just need all sites to move asap to dual-stack. But no minimum number to define a success or failure here.
(Fermilab - email) : Actively working and preparing to deploy a dual-stack xrootd redirector. More detailed plans available. Close to enabling ipv6 for WNs on their farm and for Condor.
Atlas : A nuanced view somewhere between LHCb and CMS.

CHEP paper :
------------
Discussion on how to go about updating Alastair's paper.
Alastair to circulate a decent version of the slides for comments by end of September. Maybe a preliminary version also earlier.

Poster by Bruno, Francesco and Dave on ipv6 security:
Not yet started work on it. Francesco is the only one who will be at CHEP and will be happy to present it.
Next - start writing the paper and an appealing poster.

Day 2 of F2F meeting at CERN - 7 Sep 2016

Notes by Francesco Prelz with some mods by Dave Kelsey

DaveK reviews agenda for today (see Indico page) - We may need to worry more about the management board paper than the CHEP paper.
No other items are proposed.

Brainstorming: how do we best monitor the traffic trends to get at plots like the Google statistics plots ?

Does LHCOPN record traffic data ?
BrunoH and EdoardoM: there are 1 year per-link data for LHCOPN.
See https://netstat.cern.ch/monitoring/network-statistics/ext/?p=LHCOPN

CostinG suggests to look at the Alice Monalisa database access statistics (alimonitor.cern.ch -> Services
-> Central Services -> ML Repository -> Alternative views -> IPv6 ratio). There are statistics for the past 3 years.

DaveK: do other experiments have anything similar?
RajaN: not for LHCb, but could put together a script to analyze logs.
        Can put an action on me.  (ACTION 5)
        A quick inspection of the configuration server logs shows
        a 5% fraction of connection via IPv6.
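
A script of the kind mentioned above could look roughly like the following sketch. It assumes each log line carries the client address as its own whitespace-separated token (optionally wrapped in brackets); the real LHCb log format is not described in these notes, so the parsing and the function names are hypothetical. IPv4-mapped addresses (`::ffff:a.b.c.d`) are counted as IPv4, since they are IPv4 connections seen on a dual-stack socket.

```python
import ipaddress


def client_family(line):
    """Return 4 or 6 for the first valid IP address token on the line, else None."""
    for token in line.split():
        candidate = token.strip("[](),;")
        try:
            addr = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not an address (timestamp, hostname, message text, ...)
        if addr.version == 6 and addr.ipv4_mapped is not None:
            return 4  # IPv4 client reported through a dual-stack socket
        return addr.version
    return None


def ipv6_fraction(lines):
    """Fraction of address-bearing lines whose client used IPv6, or None if no data."""
    counts = {4: 0, 6: 0}
    for line in lines:
        family = client_family(line)
        if family is not None:
            counts[family] += 1
    total = counts[4] + counts[6]
    return counts[6] / total if total else None
```

Run over one week of logs at a time, this would give the kind of trend figure discussed here; lines without a recognisable address are simply ignored.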

FrancescoP: Should we attempt to aggregate them ?
DaveK: We have the BD-II content based statistics.
        How should we handle the fact that there is a push to move away from use of BDII?

DaveK: We should have a static table for the Tier-1s. Which columns
        should define whether a site is IPv6 enabled ?

  0) Peering with the NREN enabled.
  1) Peering with LHCOPN/LHCONE enabled.
  2) Site network ready for IPv6.
  3) Dual-stack perfsonar enabled.
  4) Dual-stack storage enabled. Which technology ? Which fraction of
     the accessible storage ? Which VO(s) is it enabled for ?
  5) Other services enabled (GocDB, FTS servers)?
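
As an illustration only, the columns listed above could be held in a small record per site. The class, field names and text layout below are hypothetical and not something the group agreed on; the site name in the usage note is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tier1Ipv6Status:
    """One row of the proposed static table; field names are illustrative."""
    site: str
    nren_peering: bool = False            # 0) peering with the NREN
    lhcopn_lhcone_peering: bool = False   # 1) peering with LHCOPN/LHCONE
    site_network_ready: bool = False      # 2) site network ready for IPv6
    dual_stack_perfsonar: bool = False    # 3) dual-stack perfsonar
    dual_stack_storage: bool = False      # 4) dual-stack storage
    storage_technology: str = ""          # 4) which technology
    storage_fraction: float = 0.0         # 4) fraction of accessible storage, 0..1
    storage_vos: List[str] = field(default_factory=list)     # 4) which VO(s)
    other_services: List[str] = field(default_factory=list)  # 5) other services

    def flags(self):
        """The yes/no answers for columns 0 to 4, in order."""
        return [self.nren_peering, self.lhcopn_lhcone_peering,
                self.site_network_ready, self.dual_stack_perfsonar,
                self.dual_stack_storage]


def format_row(row):
    """Render one site as a text line with Y/N per column 0 to 4."""
    marks = " ".join("Y" if flag else "N" for flag in row.flags())
    vos = ",".join(row.storage_vos) or "-"
    return f"{row.site:<12} {marks}  storage={row.storage_fraction:.0%} vos={vos}"
```

For example, `format_row(Tier1Ipv6Status(site="EXAMPLE-T1", nren_peering=True))` gives one line that sites could update themselves; a per-VO breakdown would need one such row per VO.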

DuncanR: Shouldn't make it more complex than needed.
DaveK: Do we need to break down by VO  for multi-VO sites ?

CostinG: Would we benefit from a dynamic testing dashboard ?
DaveK: At the start of Run 3 it could become part of the availability
        measurements.
DuncanR: If the worker node SAM tests included copying files from
          IPv6-only worker nodes, or if IPv6-only UIs tested copying files
          or running jobs to all Tier1s, that would provide the needed coverage.
DaveK: Sounds great, who is going to do it ?  - Each VO rep should keep
        it in mind.

AndreaS: There was a ticket on FTS to write addresses in the logs. 
DuncanR: It should be possible. It has to be enabled in the configuration.

DaveK: Does each VO have a breakdown of total data transfer ?
RajaN: Yes, but currently there is no breakdown by IP protocol version.

AndreaS: The FTS production server has to be dual-stack for any transfer
          to occur on IPv6.
DuncanR: fts3.cern.ch -is- dual-stack!
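
Whether a host name is published as dual-stack can be checked from DNS. The sketch below uses the standard `socket.getaddrinfo` call; the function names and the injectable `resolver` argument are illustrative. It only shows that both address families resolve, not that the service actually answers on both.

```python
import socket


def address_families(host, resolver=socket.getaddrinfo):
    """Return the set of address families (AF_INET, AF_INET6) the name resolves to."""
    families = set()
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            if resolver(host, None, family):
                families.add(family)
        except socket.gaierror:
            pass  # no record of this family for the name
    return families


def is_dual_stack(host, resolver=socket.getaddrinfo):
    """True if the name resolves to both an IPv4 and an IPv6 address."""
    return address_families(host, resolver) == {socket.AF_INET, socket.AF_INET6}
```

Calling `is_dual_stack("fts3.cern.ch")` from a machine with working DNS would test the claim above for the current state of that name.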

DaveK: It would be nice if all experiments started collecting this data
        by applying the appropriate configs.

AndreaS: In the CERN Kibana there is now a 'Data IPv6' section.

DaveK: There are three main thrusts:

1) The existing data (eg. Alice)
2) The T1 table. Could set up a table that the sites can update themselves.
    Possibly T2s as well.
3) Expansion of the experiment functional tests - make sure that all
    network transfers leave a trace of whether they occur on IPv4 or IPv6.
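
For point 3, one generic way for a client or service to leave such a trace is to log the IP version of the peer address of each connection. This is a sketch under that assumption, not how FTS or any experiment framework does it; both function names are hypothetical.

```python
import ipaddress


def ip_version_label(peer_address):
    """Return 'IPv4' or 'IPv6' for a textual peer address.

    IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) are reported as IPv4,
    because the traffic on the wire is IPv4.
    """
    # Drop any zone identifier such as '%eth0' on link-local addresses.
    addr = ipaddress.ip_address(peer_address.split("%")[0])
    if addr.version == 6 and addr.ipv4_mapped is None:
        return "IPv6"
    return "IPv4"


def label_for_socket(sock):
    """Label a connected TCP socket using the address returned by getpeername()."""
    return ip_version_label(sock.getpeername()[0])
```

Writing the label next to each transfer record is what would later allow a breakdown of transfers by protocol.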

DuncanR shows the Dirac Network test results on GridPP.
  on http://pprc.qmul.ac.uk/~lloyd/gridpp/

DaveK: We should be testing the status of the central services as well.
    Should be done via the testing framework or another static table ?
AndreaS: There's a way to test MyProxy.

Discussion on whether glexec via Argus needs to establish any network connection at the time the user identity is switched. Will need to find out.

- Coffee break -

Prepare for the Management Board on September 20th.
We need a discussion with the Tier0 people (either via email or in person, especially people in charge of the storage) to get some agreement before the MB so that Tier0 can state they are happy with the plan.

DaveK proposes two changes to the paper's executive summary:
Change 'required' to 'requested', as it's less confrontational.
Add a sentence describing that just a fraction of the storage is needed, as described in detail in the paper.

AlastairDW: 'required' is less ambiguous, but it's just one word.
       Can add the sentence.

DaveK: Will KIT object ?
BrunoH: Will try to preempt the MB representative.
         Turning on dual-stack on a fraction of the storage should be acceptable. 
DaveK: What about the experiment reps in the MB ? The experiment reps should
        check that the MB member doesn't object.

DaveK: Anything to do with LHCOPN, Perfsonar, performance testing ?
        BNL started the peering. France and Germany are peering. RAL
        and Fermilab still missing.

BrunoH presents his draft status slides for the Helsinki meeting on Sep. 19.
Various comments on the status of peerings and perfsonar reachability.
The table will be updated as appropriate.

DaveK: Was the perfsonar measurement system updated in any fashion? What is in
        the upcoming release ? Should we be pushing for an upgrade ?
DuncanR: The way the tests are conducted changed, but not the presentation.
        Mixing IPv4 and IPv6 results doesn't seem viable.

AndreaS: Will talk about ETF with DuncanR after the meeting.

DaveK: Should we be doing any data transfer performance measurement (beyond what
        perfsonar measures) ?

DuncanR: The FTS logs record the destination address of the gridftp
         transfer, but the throughput is sent elsewhere as part of the
         completion record. FTS was asked to send both pieces of
         information together but the request was ignored.

DaveK: How do we challenge the results of ipv6-test.com, which often shows
        worse speed test results for IPv6 ?

Francesco P: How can we gather similar measurement results in a reliable
        fashion ?

DuncanR can collect data from the FTS logs and publish them.

AOB and dates of next meetings:
For next F2F meeting consensus on Thursday-Friday February 2-3, 2017.

Next phone meetings: Thursday October   6th 1600 MET-DST
                      Thursday November  3rd 1600 MET
                      Thursday December 15th 1600 MET

Summary of actions (from the two days):

Action1 : Raja to open a GGUS ticket about VOMS servers becoming dual stack for LHCb.
Action2 : Raja to investigate if the ipv6 java environment variable can be set in the LHCb software, while Francesco investigates the same issue from the lxplus / java end.
Action3 : Terry to put the WNs running pure-ipv6 on a separate queue and also circulate the list of these machines to the mailing list.
Action4 : Dave to raise the issue of IPv6 access to CA CRLs at the next IGTF meeting later this month.
Action5 : Raja to grep LHCb service logs and collect weekly measurements of the fraction of IPv6 connections.

 

 

 

  • Tuesday, 6 September
    • 14:00 18:00
      Session 1
      • 14:00
        Introductions, agenda, note takers 10m
      • 14:10
        Review minutes and actions 10m
      • 14:20
        Roundtable updates 1h 10m

        Including report from QMUL on their recent tests

        Speaker: Mr Terry Froy (Queen Mary University of London)
      • 15:30
        Coffee 30m
      • 16:00
        Tier 0/1 status and feedback on plans for IPv6-only CPU 45m
      • 16:45
        The CHEP2016 paper 45m
        Speaker: Alastair Dewhurst (STFC - Rutherford Appleton Lab. (GB))
      • 17:30
        Current technical issues 20m
  • Wednesday, 7 September
    • 09:00 13:00
      Session 2
      • 09:30
        Review agenda 5m
      • 09:35
        How do we best monitor the status of the migration to dual-stack? 25m
      • 10:00
        Back to CHEP2016 paper (if needed) 30m
      • 10:30
        Coffee 30m
      • 11:00
        Other issues 1h

        LHCOPN/LHCONE
        PerfSONAR
        Data Transfer performance testing?

      • 12:00
        AOB and next meetings 15m
      • 12:15
        Review decisions and actions 15m