HEPiX IPv6 working group F2F meeting

Europe/Zurich
31/S-028 (CERN)

Dave Kelsey (STFC - Rutherford Appleton Lab. (GB))
Description

Some timings are approximate and the agenda content might still change. Given the significant number of people attending who are not formal members of the working group, we have added some more general status and overview talks on Thursday afternoon.

Other suggestions for topics are welcome.

Please register if you plan to attend in person.

 

 

 

HEPiX-IPv6 WG meeting - minutes for January 16, 2020.

(notes by Francesco Prelz - for day 1)

The agenda for today and tomorrow is briefly reviewed. Today will mostly be devoted to a retrospective of the past (almost) 10 years of activity in this group, as we have many visitors from the week-long networking meetings at CERN.

A brief round of introduction of attendees follows.

Present at the beginning of the meeting (around the table):
Working group members: D. Kelsey, B. Hoeft, M. Bly, K. Ohrenberg, J. Chudoba, M. Babik, F. Lopez, F. Prelz, D. Rand,
A. Sciaba, E. Martelli

Visitors from the on-going networking week at CERN: R. Evans (JISC), S. McKee (UMichigan), T. Suerink (NIKHEF), S. Zani (INFN-CNAF), A. Vorontsov, A. Baginyan, A. Balandin, A. Dolbilov, Y. Mineev (all from JINR), R. Hughes-Jones (GEANT)

 

Dave K. shows the retrospective talk at:
https://indico.cern.ch/event/855123/contributions/3704555/attachments/1970579/3277848/kelsey16jan20.pdf

Questions or discussion ?

D. Rand: How did CERN handle the address shortage in 2012 ?
E. Martelli: In the end we obtained two more classes from RIPE! This would not be doable today...

Rob Evans shows this presentation on the JISC involvement with IPv6:
https://indico.cern.ch/event/855123/contributions/3597207/attachments/1970758/3278084/HEPiX-ipv6_historyupdate_.v01.pptx

Questions and discussion:

D. Rand: It strikes me how slowly the UK universities are moving with IPv6 adoption.
R. Evans: They all have managed networks: introducing IPv6 requires a person-power allocation they don't feel they can afford. If your network is not managed and IPv6 gets to every edge device via autoconfiguration this doesn't require any effort. Facebook and other large concerns only run IPv6: at the device level "it just works".

D. Rand: Will the IPv4 address exhaustion announced by RIPE have an impact on these university sites ?
R. Evans: Most UK universities have a comfortable IPv4 allocation.

B. Hoeft: The percentage within DFN is around 10%, but it is rising thanks to a project to promote IPv6 adoption, which should make a difference.

D. Kelsey: Are the telephone providers (5G and the like) in the UK using IPv6 at all ?
R. Evans: There's a lot of IPv6 deployment in telephony in the US, but not in the UK.

D. Rand: At the UK IPv6 council there were talks by various commercial organisations that motivated the move to IPv6 with cost-saving/business arguments. Universities for some reason don't seem to be as 'money-minded'.

 

- coffee break -

 

Bruno Hoeft presents on the status of the IPv6 activities within LHCONE/LHCOPN using the slides at:
https://indico.cern.ch/event/855123/contributions/3597207/attachments/1970758/3278084/HEPiX-ipv6_historyupdate_.v01.pptx

Comments:
D. Kelsey: Fermilab responded to the GGUS ticket mentioning they would be ready in December. I see it's now January. Should we chase them ?
E. Martelli: They reported yesterday they need about three more weeks.

D. Kelsey: Good: weeks and not months! What about the Russian site (Dubna)?
 (nobody around the table can offer wisdom about it).

 

Andrea Sciabà now comments on the process of migrating the Tier-2 storage to IPv4/IPv6 dual-stack access. His presentation is at:
https://indico.cern.ch/event/855123/contributions/3597213/attachments/1970783/3278208/IPv6_T2_deployment_update_20200116.pptx

Dave K. looks in more detail at the per-country statistics.  Andrea S. remarks that after one site is certified by the experiment as being OK, he won't look at it anymore. Any issue occurring later will be tracked as a new operational issue.
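As an illustration of what dual-stack access implies at the name-resolution level, the sketch below classifies a host by the address families it resolves to. This is only a necessary condition (a published AAAA record does not prove the service works over IPv6), and it is not the certification procedure used by the experiments; the function names are hypothetical.

```python
import socket


def resolved_families(host: str, port: int = 443) -> set:
    """Return the set of address families that `host` resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return {info[0] for info in infos}


def classify(families: set) -> str:
    """Label a set of address families as dual-stack, single-stack or unresolved."""
    has_v4 = socket.AF_INET in families
    has_v6 = socket.AF_INET6 in families
    if has_v4 and has_v6:
        return "dual-stack"
    if has_v6:
        return "IPv6-only"
    if has_v4:
        return "IPv4-only"
    return "unresolved"


# A numeric literal needs no DNS lookup; replace it with a real storage
# endpoint name to query the resolver instead.
print(classify(resolved_families("127.0.0.1")))
```

Keeping the classification separate from the lookup means the same logic can be applied to families gathered by other means, for example from a monitoring feed.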

A short discussion on what incentives/punishments can be offered/threatened to sites to nudge them further follows. The extreme punishment would be to blacklist the site, but this is a Management Board type of decision.

We got to 70% compliance without threatening. Remaining cases typically do not depend on the experiments or the site administrators, but rather on the person-power and priorities of networking personnel.

Dave K.: Getting from 70% to 80-85% by the end of the year could be counted as a success.

Edoardo Martelli talks about the history of IPv6 @ CERN:
https://indico.cern.ch/event/855123/contributions/3706823/attachments/1970604/3277800/IPV6-20200116-CERN-IPv6-deployment.pdf

Note: The CERN LAN DB *requires* one IPv4 address per device, as a consequence of IPv6 support being retrofitted over IPv4. A dummy private IPv4 address is assigned to hosts that require IPv6 only. As removing this constraint requires a lot of work, there's no plan to remove it.

A brief discussion follows, along the lines Rob E. described earlier, on how managed networks at CERN (and possibly other sites) prefer not to change their management model. It seems that there won't be a DHCPv6 client on Android, out of a strong architectural preference for stateless autoconfig.

Francesco P: Once proper network access authentication and authorisation is moved to every network outlet via 802.1x or the like, SLAAC for all will probably become acceptable, and any need to misuse address assignment (or easily forgeable MAC addresses) as an authorisation technique will drop.
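For readers unfamiliar with stateless autoconfiguration, the sketch below shows the classic way a host derives its own address from an advertised /64 prefix and its MAC address (the modified EUI-64 interface identifier of RFC 4291, Appendix A). It is illustrative only: many current operating systems use randomised or opaque identifiers instead (RFC 4941, RFC 7217), and the prefix and MAC below are documentation values, not CERN ones.

```python
import ipaddress


def slaac_eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 identifier built from a MAC."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("SLAAC with EUI-64 identifiers expects a /64 prefix")
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet and insert ff:fe
    # between the two halves of the MAC address.
    eui64 = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return net[int.from_bytes(eui64, "big")]


# 2001:db8::/32 is the IPv6 documentation range (RFC 3849).
print(slaac_eui64_address("2001:db8::/64", "00:11:22:33:44:55"))
```

Because the address follows directly from the MAC in this scheme, it carries no more assurance about the device's identity than the MAC itself.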

Marian Babik finally presents on the status of the WLCG Experiment Test Framework (ETF):
https://docs.google.com/presentation/d/1ZyPIwTRcFiQL5bmBag8QtK0yQ1FkLchnMYb3lc3EUtY/edit#slide=id.p

Dave K.: Does "Waiting for endorsement" mean that we (as a working group) should be promoting something ? Or is it a Management-Board level issue?
Marian B.: We could propose that the IPv6 reports be used for a transitory period.

Bruno H.: For the 100 GB testbed we can offer an edge node.
Marian B.: Sara, Nordunet, and others also have one. GridPP, JISC are in the process of procuring the equipment.

 

Richard HJ: What about getting statistics directly from the network equipment ?

Marian + Shawn MK: UMichigan worked on an SNMP plugin (plus some proxy) to get the router statistics. It should be about 15 years old by now and is making a comeback.

Marian B.: We found out the hard way it's not realistic to ask NRENs for counters. In some organisations a lot of red tape needs to be cleared before data can be accessed.

----------- end of day 1 --------------

Notes for day 1 - provided by Martin Bly (STFC)

HEPiX IPv6 Working Group F2F Meeting @ CERN – Notes

16th – 17th January 2020

Martin Bly

Agenda: https://indico.cern.ch/event/855123/

 

Venue: 31/S-028, CERN

Host: Dave Kelsey (RAL), Edoardo Martelli (CERN)

Thursday 16th January

Afternoon Session

Introductions, agenda, note takers

Francesco taking notes this afternoon.

Brief history and status/plans of IPv6 working group: Dave Kelsey (STFC - RAL)

Dave presented the history of the group and its activities. Phase 1: 2011-2016 - Analysis and testing; Phase 2: 2016-2020 – The transition; Phase 3: IPv6-only networking.

IPv4 and IPv6 status in RIRs: Rob Evans (Jisc)

Started with a potted history of the IPv4 run-out. 1993: classful to classless addressing. Feb 2011: last /8s handed out to the RIRs. Sept 2012: last /8 policy invoked in the RIPE region. April 2018: last /8 exhausted by RIPE, which now allocates from recovered addresses. Oct 2019: last contiguous /22 allocated by RIPE. 25th Nov 2019: no more IPv4, waiting list implemented. No more IPv4. We really mean it. New LIRs only, only one /24. Other ways: buy them.

Other regions: ARIN ran out on 24 Sept 2015, waiting list only. APNIC: still in last /8 policy, new members still get a /22. AFRINIC: IPv4 Exhaustion Phase 2. LACNIC: similar to APNIC.

History of IPv6 deployment: Google’s adoption measurements show that, rather than Asia-Pacific leading the way, North America and Europe are leading, likely due to broadband provider use. So IPv6 is happening in the consumer world rather than the R&E world.

Jisc situation: larger existing customers tend to have large legacy blocks; smaller institutions use Jisc PA space; new assignments are made at a very low rate, with NAT behind. Probably sufficient to cope with the foreseeable future. A handful of member institutions sell off legacy IPv4 space; guidance on using brokers has been issued.
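To put the prefix sizes mentioned above into perspective, the following short sketch computes how many IPv4 addresses a /8, a /22 and a /24 each cover. The helper name is hypothetical, and the cross-check uses a documentation network (RFC 5737) rather than any real allocation.

```python
import ipaddress


def addresses_in_prefix(prefix_length: int) -> int:
    """Number of IPv4 addresses covered by a prefix of the given length."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    return 2 ** (32 - prefix_length)


# Cross-check the arithmetic against the standard library.
example_24 = ipaddress.ip_network("192.0.2.0/24")
assert example_24.num_addresses == addresses_in_prefix(24)

for length in (8, 22, 24):
    print(f"/{length}: {addresses_in_prefix(length):,} addresses")
```

A /24 covers a quarter of the addresses of a /22, and a /8 covers 65,536 times as many as a /24.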

Coffee Break

 

Tier 0/1, LHCOPN and LHCONE status: Bruno Hoeft (KIT)

Bruno reported on the status of the network and Tier0/Tier1 readiness over time.

Tier-2 IPv6 deployment update: Andrea Sciabà (CERN)

Andrea reported on the project to get the Tier-2s to deploy IPv6. Completed deployments currently stand at 70% of Tier-2s. Most Tier-2s that haven’t deployed are in progress or stalled waiting for campus network teams to make progress (75% of cases); it continues to be difficult in some places to get traction. Using sites’ estimates of when they will be ready, we should reach 96% within month. Using data as of San Diego HEPiX, we should be more than 90% now, so sites’ estimates are over-optimistic. Conclusion: IPv6 deployment is slowing down, somewhat expected because the remaining sites are the ones having difficulties. Remaining sites are committed to deploying, and some are really close to success. Now one year past the official deadline; no tickets closed in the last three months.

 

IPv6 at CERN: Edoardo Martelli (CERN)

Started with a picture of the scale of the network at CERN: 230 routers, 3800 switches, 50000 connected devices. Three networks plus external connectivity. Managed from the LANDB database because it is too big to do by hand. To deploy IPv6, need to start with LANDB. Started using IPv6 in 2001 but there was no demand for it at that point. IPv6 deployment project approved in 1Q11: 1 x network engineer, 2 x software developers, each for 2 years. To be ready for production by 2013. Details of service definition: identical performance, common tools, policies, service portfolio, at least one IPv6 address for every IPv4 address. Described elements of the work plan. Details of work on the LANDB: main tasks, challenges and limitations. Also for network device configurations, DHCPv6 and DNS. Users are able to control how their system is visible on IPv6 via DNS and the firewall.

 

Monitoring including perfSONAR & ETF: Marian Babik (CERN), Duncan Rand (Imperial College)

Description of the Service Availability Monitoring framework (SAM). Details of the WLCG Experiments Test Framework (ETF), SAM3/MONIT. Notes on deployment and operation. Presented table of tests wrt the experiments and who maintains the tests. SAM reporting is ready for IPv6 and needs to be ‘endorsed’ by GDB. Presented an update on perfSONAR: v4.2.2 is the latest release, new plugins, pScheduler adds pre-emptive scheduling, BWCTL retired, SL6 no longer supported (reinstall with CentOS7). 261 active instances. New network visualisation tools, work in progress, using traceroute data.

 

---- end of Day 1 ----------

 

Notes by Martin Bly for Day 2 (17th Jan 2020)

Morning Session

Round Table Updates: WG members

CERN: Edoardo.  IPv6 not much used (outside DC?). Still some bugs in router and switch firmware being fixed by vendors.

Duncan: Has done some work with the Atlas contact (Raoul @ Brunel) to test IPv6-only, using a library that simulates IPv6-only networking (turns off IPv4 networking). Bruno reported various issues with test jobs; Atlas code now mostly runs. Brunel can run Atlas, CMS and LSST jobs IPv6-only.

Also attended UK IPv6 Council meeting. Lots of service providers offering free IPv6 and charging for IPv4 addresses. Noted that Council gives out awards for sites that have more than 20% of traffic over IPv6.  Imperial and QMUL have these.  Also, Tim Chown (Jisc) has been elected to the IPv6 Hall of Fame.

Francesco: Has contacted three remaining INFN sites not running IPv6 services.  Pisa responded, awaiting perfSONAR box. Both Rome sites didn’t respond. Some reluctance to conduct ipv6-only testing from INFN networking -  do not want to disrupt Milan Tier2 production.

Expt VO-feed statistics story: Two of three query sites down, but some Atlas sites data missing for a while due to a change in query response requirements.  Results in a 1.2% drop in number of IPv6 sites.

Fernando (PIC): Starting to migrate services to a Kubernetes cluster which is IPv4 only.  First service will be a Squid which is dual stack, so moving this is a regression.

[Kubernetes is (will be) IPv6 compliant: dual-stack is alpha in v1.16 (https://kubernetes.io/docs/concepts/services-networking/dual-stack/).]

Reported ipv6 traffic is up 92% in 2019 over 2018. 

Richard H-J: Géant reports split traffic volumes for ipv6 and ipv4 for all links in backbone.

Andrea: nothing to report.

Dimitrios (ATLAS): News for future – Andrea has agreed to assist with how to check whether sites are actually working OK with IPv6, perhaps crafting new automated tests. Possible to adapt the CMS storage tests. Short discussion on use of XRDP.

Kars (DESY): Zeuthen now have their new routers, so preparing network for IPv6 – expected to be up any day. Otherwise support most services on IPv6 on campus.

Martin (RAL): 100Gbps for OPN this year, will probably land on border router and use firewall bypass. Will move internal ipv6 link from 10G to 40G soon. Packet loss issue appears fixed.

Bruno (KIT): Every new ipv4 address handed out gets an IPv6 address too, each ipv4 subnet gets an ipv6 subnet as well. Assisting various local universities with IPv6 transitions.

Vitalli (Atlas Tier1 Canada): Have dual stack except for WNs. Tier2s (3 or 4 for Atlas) in Canada not ready, trying to push, but staff time limited, and site campuses not enabled.  Canada Tier1 is at SFU and TRIUMF, CPU at both, storage at SFU. Under the Compute Canada umbrella, various big datacentres around Canada. Shared by many sciences.

Any policy on the size of prefixes for LHCOPN/LHCONE?  No, up to site. Might be issues with number of prefixes for sites if they use manual filtering.

Raja (LHCb): No issues with IPv6 in last few months, everything smooth.  Noted (after checking, requested by Duncan) that LHCb is not running @ Brunel.

Coffee break

Set dates of next meeting

Next Face to Face: Weds/Thurs 3rd-4th June @ CERN.

Video meeting: 6th March, 4pm Taipei time - ** Note added 7 Feb ***  HEPiX in Taipei has been cancelled - will find another time/day for this meeting

Video meeting: 29th April 16:00 CEST, (14:00 UTC)

Discussion:

Dave: discussion of email from Petar, Austria, from a new T2.  Wants to know whether they could use an IPv4 infrastructure with some NAT64 interface to talk to IPv6 outside.  This isn’t a good idea.  But they have already reported back that they are going to use dual stack from the start.
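For reference, NAT64 as standardised (RFC 6146) translates between IPv6 clients and IPv4 servers, with IPv4 destinations represented as IPv6 addresses inside a translation prefix (RFC 6052). The sketch below shows only that address mapping with the well-known prefix 64:ff9b::/96; it says nothing about how the site's proposed setup would have been built, which the notes do not describe. The function name is hypothetical and the IPv4 address is a documentation value.

```python
import ipaddress

# Well-known NAT64 prefix defined in RFC 6052.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_nat64(ipv4: str,
                     prefix: ipaddress.IPv6Network = WELL_KNOWN_PREFIX
                     ) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 translation prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 translation prefixes")
    return prefix[int(ipaddress.IPv4Address(ipv4))]


print(synthesize_nat64("192.0.2.33"))
```

Every translated flow then depends on the translator being available, which is one general reason why running dual stack from the start is usually the simpler option.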

Future Plans for IPv6 Working Group

Submission to HEPiX @ASGC, Taipei

Need someone to present a paper, perhaps similar to, or the same as, the one to be given at ISGC. (Note added 7 Feb - HEPiX now cancelled).

Encouragement and support of remaining Tier2s to move to IPv6.

Experiment reps should encourage T2s to move.  Raja will help persuade the LHCb T2s. Document transition strategies.

Need more IPv6-only testing. Discussion on methods etc., checking if it works.  Concern over whether an ipv6-only LAN is required for real confidence that it does.

Ipv6-only on LHCOPN?  Need to fix up FNAL local FTS, the Russian Tier1, older storage end points like Castor. Longer term project – but don’t want to be in a situation where dual stack is for ever.

When do we want to say that WNs should be dual stack?  Soon – could use the carrot that having WNs ipv6-only ‘gives up’ ipv4 addresses for other uses.  Perhaps encourage ipv6 containers on a dual stack WN (Richard).   What about service nodes too?

Dave: Related a success story in the USA wrt identity management: after a lot of past stress, uptake of standards was made a requirement of participating in various collaborations.  This pushed take-up from 5% to 98%.  Enormous amount of work, through carrot and defining that sites sign up to various requirements.

CHEP2019 paper – status and plans

Discussion of various things to put into paper.  Francesco will contribute a page on IPv6 only, collaborating with Raoul. Need a Tier1 and Tier2 page. We could ask Marian to contribute about monitoring. Limit 6 pages plus references.  Thrust of paper is IPv6-only.

 

---- end of Day 2 ---------

 

 

 
