HEPiX IPv6 Working Group - in person at CERN
These dates are now confirmed. We will meet in person at CERN. Zoom connectivity will also be available for those of you who cannot attend in person.
Please register to say you will attend the meeting (select whether in person or not). And request a CERN Visitor Access pass if you require one.
Afternoon session of the IPv6 meeting at CERN - Building 37.
Wednesday, February 22nd, 2023 - https://indico.cern.ch/event/1242396/
(Notes by Francesco Prelz)
In attendance: Martin Bly, Costin Grigoras, Bruno Hoeft,
Dave Kelsey, Edoardo Martelli, Carmen Misa Moreira, Kars Ohrenberg,
Francesco Prelz, Andrea Sciabà.
Remotely (Zoom): Tim Chown, Christopher Walker, Jiri Chudoba,
Raja Nandakumar, Andrea Sciabà.
After the coffee break: Nick Buraglio, Phil DeMar, Hironori Ito, Shawn McKee
Apologies from Dimitrios C., Raja N., Duncan R., Mihai P. - others?
Dave K. introduces the meeting and thanks Edoardo M. for the organisation.
Reviews and updates the agenda. Adds news from the management board,
last Tuesday - the previous IPv6 presentation there was ~18 months ago.
They were very pleased that we are looking into the 'obstacles' to
IPv6-only and trying to understand why people are still using IPv4.
The Tier-2 status from Andrea was presented at the management board
(92% of the Tier-2s are done with dual-stack storage). The problems
with monitoring, and the experiments moving to protocols that are not
instrumented to monitor IPv6, were covered - and how we lost the
ability (due to a router 'bug') to monitor IPv6 traffic at CERN.
BTW: were we sure the statistics were correct in the past?
Edoardo M.: They were cross-checked in the past, and they matched.
Stressed again that we don't plan to be dual-stack forever.
Liz (from FNAL) noted that we made a statement that IPv6 is less secure -
while it is as secure as IPv4 (probably a misunderstanding).
The management board minutes recorded clearly (already in the past)
that the aim is IPv6 only. What is questioned is just the timetable.
Site (round table) updates.
CERN (Edoardo M.): we hope to have the data collection from the LHCOPN
routers fixed. Juniper suggested both firmware updates and configuration
changes. The firmware update did not help. We need a few more days to
check whether
the suggested configuration change (packet sampling rate) has any
effect. The rate is adaptive, but the collector is not notified of what
the configured sampling rate is.
For some sites (e.g. KIT) the IPv4 and IPv6 traffic are separated
onto different VLANs - that may help in recovering the lost statistics.
If everything else fails, I'll propose separating the traffic at
other sites as well.
Working with Carmen on top IPv4-talkers - this will be reported on.
In the past there were DHCPv6 issues in the computing center, due
to a Juniper bug that mis-mapped DHCP renewals and led service
managers to remove IPv6 from hosts. There are no more excuses for
CERN service managers to remove IPv6. Today the IPv6 DNS entry for
one of the top talkers, a CVMFS server, was restored - it had probably
been removed because of these bugs - we'll see.
FZU (Jiri C.): I was pleased to see the jumbo frame issues on tomorrow's
agenda. We had to switch jumbo frames off because of many problems.
We have separate xrootd servers for the ALICE experiment.
Could they be made IPv6-only?
Costin G.: No. Please don't even consider it. Only 1/3 of the worker
nodes can connect via IPv6 - the rest would not be able to connect to
storage. The storage nodes should be the very last to go IPv6-only.
Also, not all data have a second replica on an IPv4-accessible storage
server, so going IPv6-only would make a fraction of the data
effectively inaccessible.
Dave K.: This shows that, as storage was the first service to get IPv6,
it should also be the last to lose IPv4.
Francesco P.: This actually goes for any service needed by all
worker processes, in a many-to-one fashion.
Jiri C.: What is the percentage of unique data that we store then?
We were told that if we lose data, they can be recovered from elsewhere.
Costin G.: Not true - we cannot afford to replicate 2 PB of data,
so most of the data at your site is probably unique/single-copy.
Chris W.: Did you define migration milestones for the various
host classes (storage, worker nodes, etc.)?
Dave K.: Sometime in 2018 we made a schedule for storage, but
we don't have one for other services. We, as a working group, can
only have "aspirations" that then need to be cleared by the management
board. More on this tomorrow.
INFN (Francesco P.): mostly silent. Took a while and some pushing and
pulling to bring public IPv6 to the new office. Extended a VLAN from
the Milan Tier-2 for the purpose: it could serve for some IPv6-only
testing. Working with people at CNAF on a new-employee induction
course: it will include IPv6 material.
And... for the first time we found a user connecting from home via IPv6.
He was a customer of the first (and so far only) Italian ISP providing
IPv6 (Fastweb - now a subsidiary of Swisscom).
JISC (or UK in general) and IETF: nothing springs to mind in particular.
DESY (Kars O.): No IPv6-related news. We measure 20-30% IPv6 traffic on the
external network connection - no internal statistics are available -
and this is still considered a small fraction.
RAL (Martin B.): IPv6 is starting to proliferate more around the Tier-1.
Getting the LHCONE routing (IPv4 first, then IPv6 should follow
easily) is in progress, slowly. We had sporadic issues with outside
DNS resolution, which are under investigation - nothing IPv6-specific.
The plan for IPv6 dual-stack worker nodes will follow the LHCONE
configuration.
KIT (Bruno H.): Nothing going on in our WLCG center (GridKa). The core
routers claim to support DHCPv6 delegation: we'll check them
out. Memory usage on the core devices is a bit tight: careful
hand-crafted configuration is needed to keep memory usage in shape,
and version upgrades require manual checking. IPv6 firewall functions
may have to be moved to ACLs.
BNL (Hiro I.): Nothing big. FTS is now capable of identifying IPv6.
Dave K.: But the log files should be instrumented to show the IP
version - there's development work to be done.
Experiment updates:
No news from ATLAS (Dimitrios).
ALICE (Costin G.): Last time I checked storage and computing elements
we had 100% coverage on Tier-0 and Tier-1, and 90% overall.
There are storage elements that resolve as IPv6 in the DNS but are
not reachable. The situation seems to be slowly degrading.
On the VO-boxes, representative of the computing nodes, nothing
is changing.
Only 1/3 of the jobs call the central services at CERN over IPv6.
Don't really know how to push the sites.
Dave K: at the management board the question "what should we
be doing" was asked - but the experiments are all represented here.
Andrea S.: Maybe the question was: "what can the *computing management* do"?
That's a bizarre question anyway.
Dave K: IPv6 seemed to be perceived as "new"...
LHCb + DUNE (Raja) sent update slides - these will be uploaded to the agenda
page.
--> The reasons for the large number of IPv6-only 'aborted' jobs should
be identified in detail.
--> The reasons why the storage node at RAL was not accessed over IPv6
should also be investigated.
CMS (Andrea S.): little to say. Checked the status of preferring IPv6
(as a result of Carmen's earlier work). Apart from FNAL and MIT, this
is in place everywhere. The issue with the two missing sites seems to
be related to token authentication into HTCondor, and could therefore
be common to the two sites - some IPv6 edge case of the token
transition.
Tier-2 status (Andrea S.): same situation as in January.
The usual summary page is shown:
https://twiki.cern.ch/twiki/bin/view/LCG/WlcgIpv6#WLCG_Tier_2_IPv6_deployment_stat
Dave K.: Wonder what Raja would say about the 79% LHCb figure.
Andrea S.: There are a number of UK sites that have no updates.
Dave K.: Sites are chased periodically: they always report some DNS
and/or networking issue.
Is there anything else we should be doing as a group, other than
chasing the sites in round-robin fashion?
Going to IPv6-only could be a good threat...
Clearer statements from the research networking team
(about packet-marking and the like) could also help.
I keep proposing that future bandwidth upgrades be
IPv6-only.
(-- coffee break --)
The re-arranged agenda first covers the ISGC talks in Taipei.
Bruno shows his slides:
Costin G.: On the DNS configuration, we found that listing all 20 IPv6
addresses of our DNS pool made the response too long - it did not fit in
one non-fragmented UDP packet, and so broke the update of caching DNS
servers at some sites. We resorted to setting up a list of 20 IPv4
addresses, plus a random selection of IPv6 addresses.
Jiri C.: I'm curious about your solution for NFS - the major IPv4 talker.
Bruno H.: Need to get an answer from the people running the servers.
Dave K.: The 'obstacle investigation' material should be coordinated
between the two talks - but there is definitely material for
two talks. You showed clearly that the task is not trivial.
Next on the agenda is Carmen:
Hiro I.: The issue with IPv4 was identified in February, one year ago.
Dave K.: I remember - it was Valentine's day one year ago.
Bruno H.: The storage is shared among all 4 VOs, but there's a separate
server pool for each VO, so we can tell the VO by the IP address.
An animated discussion follows on how to identify the source of the
protocol choice from the network flow data. IPv4-only CVMFS server nodes
were found. Third-party HTTP transfers may be driven by the destination
address. They may even be predetermined by whoever organises the
third-party transfer (Rucio et al.). The flow port number may help in
identifying the protocol at play.
Dave K.: Acting on the top talkers on a weekly basis would improve the
chance of finding information in logs before they roll over, etc.
Chris W.: The top-talker list could be posted on a weekly basis, as
it's scriptable.
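A minimal sketch of such a weekly report, assuming the flow exports have
already been dumped to a CSV file (the file name and column names are
placeholders):

    # Weekly top-talkers sketch: aggregate bytes per source address from a
    # flow-export dump. File name and column names are placeholders.
    import csv
    from collections import Counter

    talkers = Counter()
    with open("flows-week.csv", newline="") as f:
        for row in csv.DictReader(f):     # expects 'src_ip' and 'bytes' columns
            talkers[row["src_ip"]] += int(row["bytes"])

    for ip, nbytes in talkers.most_common(20):
        print(f"{nbytes / 1e12:8.2f} TB  {ip}")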
Dave K.: "name & shame" at the management board ?
Nick B. then shares information from two slides (to be posted to Indico):
We finished "data call #3" at the end of 2022. This was a request for
information across all the DoE sites: what does it take to shut down IPv4?
73% of the assets are either dual-stack or dual-stack capable - a
fairly high percentage.
Roughly 1% is running IPv6-only: an (ironic) reduction in percentage
w.r.t. the last data call, as the number of reported assets has increased.
It's encouraging that the estimate of incompatible assets is very low,
including specialty instrument installations and the like.
Very happily surprised that two sites are migrating their VoIP systems
*now* to IPv6-only. He would have thought this would be done last,
not first. This is encouraging: lots of progress is happening.
On January 18th the NSA released a document "IPv6 security guidance"
(link in the slides, but should be this one:
https://media.defense.gov/2023/Jan/18/2003145994/-1/-1/1/CSI_IPv6_security_guidance_.PDF )
There were a few glaring issues that didn't get the amount of
scrutiny that should go into editing such a document - one that,
once released, is considered to be "gospel".
E.g.: the recommendation to use DHCPv6 as address management protocol.
There's a large installed base that doesn't support
DHCPv6 - Android, in particular: they claim it will never be supported.
The document also refers to a couple of deprecated/outdated RFCs.
E.g. RFC1999 - outdated. There are typos and signs of poor proofreading.
(... had to leave at the foreseen end of meeting time, 17:30 ...)
2023-02 HEPiX IPv6 WG F2F Cern – Notes
Martin Bly
22 and 23 Feb 2023, Building 37, Cern.
Wednesday:
General Status (Dave K):
Report to Management Board: general status. Issues with IPv6 LHCOPN/LHCONE traffic stats. General support from the MB for progress, encouragement to continue. Biggest concern is that the monitoring is broken.
Updates from Sites (round table):
Cern (Edoardo): Latest fix from Juniper has not resolved the monitoring issue. Testing the latest configuration now; need a few days to verify whether or not the new fix is good. The fix is related to sampling rates. Now have a fix for the bug in Juniper routers that caused IPv6 addresses to be lost in DHCP space on renew requests. No longer an excuse for hosts not to be on IPv6.
FZU (Jiri): Wants to discuss jumbo frames – used them in the past but had to stop due to lack of support and problems at remote sites. Also wants to do IPv6-only for ALICE storage – considerable resistance from Costin due to lack of IPv6 resources at their storage sites. Cannot use failover to IPv6 (or IPv4) since they don't have the resources to duplicate most data. Chris W: Timetables for dual-stack on other classes of hosts? Only did it for storage.
INFN (Francesco): Noted one user using IPv6 from home. 'How come it works if I use IPv6 and not IPv4?' Highlighted the different firewall rules… need the same rules in both.
Tim or Chris (for UK in general): None.
DESY (Kars): No IPv6 related news.
RAL (Martin): Joining LHCONE in progress, then dual-stack WNs to follow.
KIT (Bruno): Memory on routers is tight; trying to alleviate this with uRPF instead of access lists, but the configuration is difficult – may have to revert to IPv6 ACLs.
Updates from the Experiments
Atlas (Dimitrios): NTR
Alice (Costin): 100% of storage at T0 and the T1s is on IPv6, but only 15% at T2. Reported seeing cases of IPv6 silently failing due to a system having an IPv6 address but being unreachable – no one noticed because IPv4 works.
LHCb & DUNE (Raja, slides presented by Dave K.): No changes since Oct 2022, smooth running across the grid. Testing of IPv6-only for LHCb at Brunel: runs well; at Cern: ce666. Noted that there is a GGUS ticket for the RAL LHCb VO box – 1% of the connections.
CMS (Andrea): Very little to report. FNAL and MIT still have issues (related to not using dual stack and preferring IPv6).
T2s (Andrea): Status is the same as in January – nothing has changed.
Presentations to ISGC2023 (Bruno)
Jiri and Bruno will be at ISGC2023 in Taipei. Two talk abstracts submitted. Bruno's slot is 30 minutes. Presenting the IPv4 to IPv6-only worker-node transition. Outline of the WN transition to IPv6-only – c.f. slides. Noted issues with NTP responses from 'dubious-looking sites' – now understood. Note on DNS resolution: list IPv6 addresses for DNS servers in resolv.conf first. Config option for CVMFS to prefer IPv6; CVMFS frontiers need to switch off IPv4 to force IPv6 (cvmfs_ipfamily_prefer=6 – see the sketch below). Notes on HTCondor, Logstash, etc. for IPv6. Deployment process issues. Stats etc.
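For reference, the client-side settings mentioned in the talk amount to a
couple of lines (resolver addresses below are documentation-prefix
placeholders; check the CVMFS client documentation for the authoritative
parameter spelling):

    # /etc/cvmfs/default.local - prefer IPv6 when contacting stratum-1s/proxies
    CVMFS_IPFAMILY_PREFER=6

    # /etc/resolv.conf - list the IPv6 resolvers first, as recommended
    nameserver 2001:db8::53
    nameserver 192.0.2.53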
Investigation of IPv4 top talkers (Carmen)
Looking at LHCONE/LHCOPN IPv4 top talkers. Step 1: check if dual-stacked. 12 systems not dual-stack: CERN 1, CNAF 3, IN2P3 6, 2. Stats for Jan-Jun 2022. Discussion.
Update from USA, DOE, and recent publications (Nick Buraglio, ESnet)
DoE IPv6-only: finished data call #3 to ascertain how the landscape for disabling IPv4 looks. Encouraging that a large percentage of resources are dual-stack and IPv6-preferred. About 1% are IPv6-only – falling as a percentage due to the increase in the number of assets reported. DHCPv6 is not as widely supported as needed – brought up in the IETF.
Sites:
BNL (Hiro): Has monitoring been changed to look at / work with IPv6? Yes, monitoring (FTS central code) has been changed and rolled out, but it is up to the storage devs to develop IPv6 monitoring capabilities, and the sites to roll them out.
FNAL (Phil DeMar): Nothing significant. Focus on IPv6 is mostly on the corporate/financial side; the science side is mostly dual-stack. Seeing an increase in IPv6 traffic on their science network – most is CMS data.
Thursday:
In-room: Edoardo, Kars, Andrea, Carmen, Marian, Francesco, Martin, Bruno, Dave.
Remote: Tim, Jiri, Chris
Preparing for presentation to CHEP2023
Discussion:
Dave: need to present where we are, and note the problems with lack of monitoring. FTS instrumentation will take time. Are there any other things we need to do?
Edoardo: Can split traffic by VLAN, but not everywhere.
There is no actual mandate or campaign to get dual-stack WNs. This has implications for data transfers.
Work before May:
1/ Monitoring at network level: Some more splitting by VLAN. For Edoardo and colleagues.
2/ Work with FTS devs to upgrade
3/ Lack of IPv6 on WNs. Encourage sites to enable dual-stack WNs.
Get further on with top-talkers - following up more rapidly on results. Weekly rather than monthly.
Include port numbers to help identify application (xrootd etc).
Need to cultivate site contacts via VO contacts? GGUS tickets. Re-engage with IHEP. Go T1 by T1.
Some cloud resources don't yet do IPv6 - need to encourage cloud resources at RAL etc. to do IPv6.
And SKA.
Tim: should there be a WLCG policy that you can't join unless resources are dual stack?
Dave: and Jisc saying that high-speed connections are dependent on full support for IPv6.
IPv6-only planning
(note – I missed the change of topic in the discussion…)
Dave: proposes a plan to do some more testing before CHEP, report on the findings, then go to WLCG to get a mandate for dual-stack.
Francesco: get a mandate for a few IPv6-only WNs and sites for testing, excusing those sites for any loss of efficiency.
Dave: IPv6-only WNs at RAL? Martin: a big ask for all resources - maybe for testing, but after dual-stack.
Lots more study needed. Concentrate on IPv6-only services on WNs rather than whole-node IPv6.
Chris: should we encourage WNs to be the first IPv6-only service?
Concentrate on IPv6-only services; go for a date after which IPv4 services may no longer work? Start of HL-LHC.
Bruno: must all WNs joining WLCG be dual-stack or IPv6-only?
Jumbo frames (Tim)
Periodic interest in jumbo frames in the community. Outlined the issues - see slides. Previous WLCG recommendations: a local host at 9000 would need a larger MTU on the backbones; also, do not block the PMTUD protocol, to allow ramp-down to 1500.
Not clear how much this is supported by NRENs. Benefit - in principle, higher throughput - reasons and experiments presented. Also showed how bad the default OS tuning for NICs etc. is. Higher-MTU tests - no retransmits! Do larger MTUs reduce or eliminate retransmits?
Concerns? If PMTUD doesn't work, IPv6 doesn't negotiate the correct MTU.
All hosts on a LAN must run the same MTU.
NREN backbones may not have enough overhead. Mostly they do.
Various thoughts (see slides).
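For reference, a quick end-to-end check of a 9000-byte path is a
do-not-fragment ping sized to the largest payload that fits (Linux iputils
ping; the host name is a placeholder):

    # Largest ICMP payloads that fit a 9000-byte MTU:
    #   IPv4: 9000 - 20 (IP header) - 8 (ICMP)   = 8972
    #   IPv6: 9000 - 40 (IP header) - 8 (ICMPv6) = 8952
    ping -4 -M do -s 8972 remote.example.org   # -M do: set DF, forbid fragmentation
    ping -6 -M do -s 8952 remote.example.org

If these fail while standard-size pings work, some hop on the path is not
jumbo-clean (or PMTUD is being blocked).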
Discussion:
Bruno: LHCOPN/LHCONE is set for jumbo frames, but few sites set to use it.
Francesco: a big move. Noted that jumbo frames make shared file systems much faster.
Edoardo: problems at CERN with jumbo frames, mostly due to PMTUD not passing. Fixing them is problematic due to the nature of the network at Cern.
Chris: problem of only 1500 MTU.
Can't turn jumbo frames on for IPv6 only - the NIC operates at the configured MTU size regardless of protocol, unless negotiated down. Would need a second NIC plus switches/cables etc.
Jiri: Tried Jumbo in Prague. Had to turn it off due to transfer issues. Difficult to debug.
Tim: sites with science DMZ type networks should be easy to turn on jumbo.
- Find out what the current use of Jumbo is, and who's happy
- Raise with WLCG management? NetWG (WLCG working group) - is it our problem?
- Link jumbo and IPv6 issue together (Francesco).
- Ask NRENs what evidence there is of jumbo frames in use.
Coffee
News from other activities
Packet Marking (Marian): (c.f. slides)
Packet Marking (packet flow label) and Flow Labelling (UDP firefly).
Current:
- XRootD 5.0+ supports UDP fireflies. dCache: PoC ready, supports UDP fireflies. Testing at AGLT2. (A toy firefly sketch follows after this list.)
- flowd service to do flow and packet marking, v.10 released. Can mark packets for 3rd party services.
- Collectors and receivers developed by ESnet (scitags github).
- Registry provides list of expts and activities supported: JSON @ api.scitags.org.
- Flow-id propagation. Work needed agreed with Rucio and FTS.
- SC22 demonstration. Showed packet marking at 200 Gbps using flowd with both XRootD and iperf3. sFlow collectors.
- Data collected at Cern on P4 programmable switch. No appreciable impact on the switches. Instrumentation on hosts showed impact on Linux kernel for each flow. Lots of publicity at SC22, well received at booth. Lots of interest in the project.
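A toy illustration of what a firefly emitter boils down to: a small UDP
datagram carrying flow metadata as JSON. The port number and field names
here are illustrative only; the authoritative schema and registry values
live at scitags.org:

    # Toy UDP "firefly" emitter. Port and JSON field names are illustrative;
    # see the scitags.org technical specification for the real schema.
    import json
    import socket

    def send_firefly(collector, state, src, dst, experiment_id, activity_id):
        msg = {
            "version": 1,
            "flow-lifecycle": {"state": state},          # e.g. "start" or "end"
            "flow": {"src-ip": src, "dst-ip": dst},
            "context": {"experiment-id": experiment_id,  # from the scitags registry
                        "activity-id": activity_id},
        }
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
            s.sendto(json.dumps(msg).encode(), (collector, 10514))

    send_firefly("flowd.example.org", "start", "2001:db8::a", "2001:db8::b", 2, 1)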
RNTWG Plans:
- engage with storage techs adopting scitags, dCache, EOS, Ceph/Echo, StoRM to understand plans etc.
- Propagation of the flow identifier in WLCG DOMA. FTS and Rucio implementations. Engage with DIRAC and ALICE O2.
- Collectors/Receivers - establish a production-level network of receivers (ESnet, JISC, GEANT?)
- R&D - routing and forwarding using flow label in P4 testbed (MultiONE).
Draft plan for networking objectives and milestones.
perfSONAR (Marian):
perfSONAR 5 beta is out, GA expected soon. OpenSearch as the local archive, replacing esmond/Cassandra. Grafana visualisation.
The toolkit supports CC7, latest Debian 10, Ubuntu 18/20 and RHEL8 (Alma/Rocky). CS8 not supported, Alma 9 TBD.
A WLCG-wide update campaign will follow the release.
Evolution of the Network Measurement Platform: move to directly publish results from perfSONAR to ES@UC.
ATLAS Alarms & Alerts Service: a non-ATLAS-specific toolkit that others can subscribe to for various alerts. Now working on network path-change alerts and bandwidth-decrease alarms.
Update on FTS, monitoring etc. re IPv6 (email from Mihai):
Released FTS v3.12.4, which has the desired "ipver: ipv4|ipv6|unknown" fix. Needs to be updated at sites.
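Once sites pick up v3.12.4, protocol accounting over the transfer records
becomes straightforward; a sketch assuming the records are available as
JSON lines (the file name is a placeholder):

    # Tally FTS transfer records by IP version via the new "ipver" field.
    # Assumes records dumped as JSON lines; the file name is a placeholder.
    import json
    from collections import Counter

    with open("fts-transfers.jsonl") as f:
        counts = Counter(json.loads(line).get("ipver", "unknown")
                         for line in f if line.strip())

    print(dict(counts))  # e.g. {'ipv6': ..., 'ipv4': ..., 'unknown': ...}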
HTTP-TCP: see the update in Mihai's email. Dave will create a slide and upload it.
AOB, dates of next meetings
Next meetings:
Zoom:
13th April 16:00 CET (Thursday)
31st May 16:00 CET (Wednesday)
F2F:
Aim to hold next F2F @ CERN. Avoid CHEP in May. TNC in June.
28th-29th June @ CERN - TBC. Dave will email the list.
AOBs:
Discussion of slides for ISGC/HEPiX? By email.
HEPiX Abstract: need to generate and submit - Dave/Bruno