HEPiX IPv6 Working Group - in person at Jisc Offices in London, UK
Jisc, London
These dates are now confirmed. We will meet in person in London. Zoom connectivity will also be available for those of you who cannot attend in person.
Venue Jisc, London
Address 15 Fetter Lane, London. EC4A 1BW
Map URL https://maps.app.goo.gl/jyDMVXDbHEHSVSMt7
Zoom videoconference - we are using a Jisc room
https://jisc.zoom.us/j/91773420653
The password has been sent to the HEPiX email list
Please register at your earliest convenience and state whether in-person or remote.
Note: this meeting is the day after the annual meeting of the UK IPv6 Council. Attendance at that event is "optional" - registration for that event is separate (free of charge) - see https://www.ipv6.org.uk/2023/09/19/ipv6-council-annual-meeting-2023/
HEPiX IPv6 Working Group meeting (at Jisc offices in London, UK)
Wednesday 22 Nov 2023 and Thursday 23 November
(notes by Christopher Walker, Jisc - who says they still need tidying)
Day 1 (morning) - Consider UK/GridPP experiences (and also IPv6-only perfSONAR issues)
End goal is IPv6-only. For sites/users still on IPv4, should the working group recommend that we:
· Cut them off
· Continue running dual stack for evermore
· Put work in to make services available for IPv4
o But probably at a disadvantage to them
General thrust should be IPv6 only on LHCOPN/LHCONE
How much resource are we prepared to lose (say 2%)?
Next thing is a ticket campaign to get CPU to IPv6.
Business case to management:
· Damage to cutting people off
· Damage to not going IPv6 only
Hard to imagine end user devices that don’t support IPv6, problem will be user connectivity.
It's not just hosts being IPv6; it's also DNS not being accessible via IPv6.
Perhaps have an IPv4 turn-off day (e.g. dropping v4 routes).
Need to have all universities behind you
All staff at universities behind you.
Jobs phoning home won't work if the home institution doesn't support IPv6.
How about a CHEP paper on the consequences of turning off IPv4?
Will need some
If we cut off IPv4 (in order), what will break:
· LHCOPN
· LHCONE
Trend for LHCOPN is away from IPv4 to IPv6 – 90-98% IPv6
LHCONE – 50:50 or 60:40
CMS reads a lot of data from worker nodes – a problem if the worker node is IPv4-only. Found this infinite list of
There was a statement yesterday
RAL thinks they have moved all the worker nodes to be dual stack.
------------
perfSONAR
Geant is one of the partners for perfSONAR development.
Reza has been doing some good work on making perfSONAR IPv6-only.
What does IPv6-only mean (does it mean removing IPv4 on localhost)?
Live demo.
Currently working to make perfSONAR fully IPv6 compatible: find and fix the services that are not yet IPv6 compatible.
The work Reza has done is to get it working on an IPv6-only host; it should already work on dual-stack hosts.
He has written a report on what he needed to change – services were listening on IPv4 only.
Installation on IPv6-only – no issue.
elmond service. Proxy is
Default on Ubuntu was that localhost resolved only to 127.0.0.1 (and not the IPv6 loopback).
Had to patch elmond/app.py to listen on ::1 rather than 127.0.0.1.
configdaemon.conf – similar issue.
Configmanager/Utils.pm
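A minimal sketch (in Python, not the actual perfSONAR code) of the kind of change involved: a service that hard-codes the IPv4 loopback fails on an IPv6-only host, so it has to bind to ::1 (or ::) instead. The port number is just an example.

    import socket

    def open_loopback_listener(port):
        """Listen on the IPv6 loopback instead of a hard-coded 127.0.0.1.

        On an IPv6-only host 127.0.0.1 is unusable, so a service that only
        binds to the IPv4 loopback has to be patched to use ::1 (or ::).
        """
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("::1", port))  # was ("127.0.0.1", port) in the IPv4-only default
        sock.listen(5)
        return sock

    if __name__ == "__main__":
        listener = open_loopback_listener(8080)  # 8080 is an arbitrary example port
        print("listening on", listener.getsockname())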
Is there any standard?
· Should Debian default to the IPv6 localhost first in /etc/hosts?
Second issue he is working on – the logstash service connecting over IPv4.
Reza has identified all the problem points.
· Maybe need to document what the hosts file should be?
Can WLCG run a perfSONAR node that is IPv6-only?
· Brunel and KIT offered to run a perfSONAR node that is IPv6-only.
ACTION: get Ubuntu, Debian, EL9 IPv6-only perfSONAR.
· Offer from Raul, Brunel/Jisc, potentially KIT and potentially Francesca for Milan (Debian).
Question: what do we mean by IPv6-only? Agreement that this means:
· No 127.0.0.1 in /etc/hosts
· IPv4 turned off in NetworkManager.
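A small illustrative check of part of that definition (my sketch, not an agreed test): on a host configured as above, localhost should resolve to the IPv6 loopback only, so any IPv4 result indicates leftover IPv4 configuration.

    import socket

    def localhost_families():
        """Return the set of address families that 'localhost' resolves to."""
        return {info[0] for info in socket.getaddrinfo("localhost", None)}

    if __name__ == "__main__":
        families = localhost_families()
        if socket.AF_INET in families:
            print("localhost still resolves to 127.0.0.1 - not IPv6-only by the definition above")
        else:
            print("localhost resolves to the IPv6 loopback only")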
Should try to ensure that if you have a perfsonar server it will work.
------------------------------------
IPv6 at QMUL + overview of IPv6 for UK GridPP sites
Dan Traynor, GridPP, QMUL.
- see Dan’s slides.
Use LUA records on internal DNS to redirect worker nodes to talk to internal address.
IPv6 Routing done in hardware by switch (running SONIC) – no complex firewalls, but ACLs blocking incoming connections (only allow outgoing connections) for worker nodes.
Additionally OOB network – unrouted
Note: part of the IPv6 address matches the VLAN number (a sketch of one such convention is below).
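Purely for illustration (2001:db8::/48 is a documentation prefix, not QMUL's real allocation), one common convention is to write the VLAN number into the subnet ID of the /64 so it can be read straight off the address:

    import ipaddress

    # Documentation prefix used purely for illustration.
    SITE_PREFIX = ipaddress.IPv6Network("2001:db8::/48")

    def vlan_subnet(vlan_id):
        """Derive a /64 whose subnet ID reads as the VLAN number.

        The decimal VLAN number is written as the hex digits of the fourth
        hextet, so VLAN 301 maps to 2001:db8:0:301::/64.
        """
        subnet_id = int(str(vlan_id), 16)
        return ipaddress.IPv6Network((int(SITE_PREFIX.network_address) | (subnet_id << 64), 64))

    if __name__ == "__main__":
        print(vlan_subnet(301))  # 2001:db8:0:301::/64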
Future plans – add IPv6 to local (internal) DNS
Add IPv6 to Out of Band (IPMI) network
With Migration to EL9, transition to IPv6 only SLURM (batch system).
Need to get IPv6 PXE boot working.
Next version of Lustre will have IPv6 support. Will transition when the next long term support version is released (expected 2025).
UK GridPP picture (putting IPv6 on worker nodes).
- 3 sites use public IPv4 address – trivial to do IPv6.
RALPP, Imperial, Manchester
Most GridPP sites use private networks for worker nodes; world-facing services use public addresses.
Two cases: one switch carrying both internal and external networks – have set up one or more internal VLANs;
- complete physical isolation – haven't even set up a VLAN on the internal network (use VLAN 1).
- most UK Grid sites have a network for grid that is separate from other users (eg departmental services).
- 2 sites (eg Lancaster) have a shared resource.
Blockers
- Lack of interest from central IT
- Some non IPv6 friendly (HDFS)
- Lack of manpower
- Don’t feel comfortable with IPv6.
- Solutions
o Public IPv4 – just add IPv6
o Big sites / modern switches – QMUL solution of routing on switch
o Small sites / old switches – deploy a server as a router. Looks like a NAT, acts a bit like a NAT.
Can GridPP act as a case study and help the sites to move to IPv6 by providing a template?
· /64 for world facing nodes
· /64 for worker nodes with ACLs (to block incoming connections)
o Potentially use NAT gateway for this
o Modern switches (certainly Mellanox) can also act as a router (e.g. …)
Raul – Brunel has some IPv6 only worker nodes
· LHCb works
· Atlas also runs jobs
· CMS – requires a ping that currently is IPv4 only
--------------------------
Imperial David Stockdale
- RoCE wasn’t reliable for them
o Some nodes have infiniband for inter node communication (not sure if that’s IPv4 or IPv6 , or not IP – but internal only)
- All compute now v6 only
Matt Harvey – green field HPC install – why don’t we try and make it IPv6 only from day one.
- PBS didn’t support v6
- 32* 100Gig juniper QFX – with everything connected to 2 spine switches.
- EBGP, ASN per switch
- /64 IPv6 and /24 IPv4 per leaf, MTU 8000 (was the number David was given)
- MP-BGP between switches, IPv6 sessions only
- IPv6 only to rest of college network
- /32 IPv4 route on servers via local for PBS
- 1G management network
HPC refresh
- So far so good
- How to boot
- DHCPv6 and UEFI
- SLAAC, RDNSS
- Plan A – stateless DHCPv6 server on switch returning PXE options
- Server: "I only support DUID-UUID" ☹
- Juniper switches didn’t support this
- Plan B
o DHCPv6 relay to Kea returning PXE options
- Plan C
o Stateless DHCPv6 relay to ISC returning PXE options
o Server: "I see your RA. SLAAC address, done"
o Server: "Other-configuration flag set? Will do"
- Plan D
o Stateful* DHCPv6 relay to ISC
o Success!
Sort of… hello iPXE
iPXE: Information-request
- Switch: drops packet
- iPXE recompiled to not send Information-requests
- iPXE: "I don't care, the NIC wasn't initialised anyway"
- iPXE recompiled with bodgetastic initial sleep
- Us, 2 days later – machines booted.
o Talk to the rest of the world
o NAT64/DNS64
- Presenting software exhibit A
o "Where's my licence server?"
- DNS here: A or AAAA?
- One AAAA-only hostname later: fixed!
Exhibit B, talking to the licence server: "What's a AAAA?"
- Fixed with yum install clatd.
We have a fully functional system.
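Background sketch on the NAT64/DNS64 step above: DNS64 synthesises an AAAA record for an IPv4-only name by embedding the IPv4 address in a NAT64 /96, commonly the well-known prefix 64:ff9b::/96 (RFC 6052). The address below is a documentation example, not a real licence server.

    import ipaddress

    # Well-known NAT64 prefix (RFC 6052); a deployment may use its own /96 instead.
    NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")

    def synthesise_aaaa(ipv4):
        """Embed an IPv4 address in the NAT64 /96, as DNS64 does for A-only names."""
        v4 = ipaddress.IPv4Address(ipv4)
        return ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | int(v4))

    if __name__ == "__main__":
        # 192.0.2.10 is a documentation address standing in for an IPv4-only server.
        print(synthesise_aaaa("192.0.2.10"))  # 64:ff9b::c000:20a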
One last thing .. RoCE
- Oh and remember those jumbo frames
o 1440 MSS, IPv6 src -> NAT64 -> IPv4 dst
- 1480 byte IPv4 responses from Internet to NAT64
- 1500 bytes when translated to IPv6 – too big for college network
o Broken PMTUD to some remote hosts. Clamped MSS on firewalls.
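The arithmetic behind the MSS/MTU issue (a sketch using standard header sizes; the exact campus path MTU isn't recorded in these notes): NAT64 translation replaces the 20-byte IPv4 header with a 40-byte IPv6 header, so a full-size 1480-byte IPv4 response becomes a 1500-byte IPv6 packet; with PMTUD broken on some paths, clamping the TCP MSS at the firewall keeps the translated packets small enough.

    IPV4_HEADER = 20  # bytes, no options
    IPV6_HEADER = 40  # bytes, fixed
    TCP_HEADER = 20   # bytes, no options

    def translated_size(ipv4_packet_size):
        """Size of an IPv4 packet after translation to IPv6 (header grows by 20 bytes)."""
        return ipv4_packet_size - IPV4_HEADER + IPV6_HEADER

    def mss_for_path_mtu(path_mtu):
        """Largest TCP MSS that fits an IPv6 path of the given MTU."""
        return path_mtu - IPV6_HEADER - TCP_HEADER

    if __name__ == "__main__":
        print(translated_size(1480))   # 1500 bytes after translation
        print(mss_for_path_mtu(1500))  # 1440, the MSS quoted above for a 1500-byte path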
All 7 racks now in full production use.
What next
- Host-side PXE problems.
- HPC now no longer needs to run NAT infrastructure.
Heavily managed Linux estate made CLAT possible (but it was required).
- DHCPv6 implementations in switches and PXE not great.
Wider college:
Legacy campus architecture
· IPv6
· SLAAC, RDNSS
· Public IPv4
· DHCP with mostly static assignments
· No NAT
· A records in DNS for static IPv4 addresses
· Many IPv4 firewall policies
New campus architecture
· IPv6
· NAT64/DNS64
o Returns both IPv4 and IPv6 addresses, and the client should then prefer IPv6 (but some still choose IPv4).
o Only thing that really broke are Xboxes
· RFC1918 IPv4
· NAT44
· DHCPv4 with mostly dynamic assignments
· AAAA record in DNS for stable IPv6 addresses
· ZTNA
· Deny DHCPv4 requests for hosts capable of IPv6-only operation
o Option 108 is an elegant solution to this (see the sketch after this list)
· All the enabling work for IPv6 is mostly done.
o For inbound connectivity
o Stable EUID address, though realise we are moving
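Background on the option 108 point above (RFC 8925, "IPv6-Only Preferred"): a client that can run IPv6-only includes option 108 in its DHCPv4 request, and a server on an IPv6-mostly network answers with a 32-bit V6ONLY_WAIT value telling the client how long to stop using IPv4. A minimal encoding sketch (not any particular DHCP server's implementation):

    import struct

    OPTION_IPV6_ONLY_PREFERRED = 108  # RFC 8925

    def encode_v6only_wait(seconds):
        """Encode DHCPv4 option 108: code, length (4), 32-bit V6ONLY_WAIT in seconds."""
        return struct.pack("!BBI", OPTION_IPV6_ONLY_PREFERRED, 4, seconds)

    def decode_v6only_wait(option):
        """Return the V6ONLY_WAIT value (seconds) from an encoded option 108."""
        code, length, seconds = struct.unpack("!BBI", option)
        assert code == OPTION_IPV6_ONLY_PREFERRED and length == 4
        return seconds

    if __name__ == "__main__":
        wire = encode_v6only_wait(1800)  # example: tell the client to drop IPv4 for 30 minutes
        print(wire.hex(), decode_v6only_wait(wire))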
Political desire to get Cyber Essentials (required by some research grants) – every device in scope managed, security updates applied within a week. Really at odds with student BYOD. College s
The F5s were doing NAT44, now doing NAT64 – load balancing. Any service will be dual stack by default and you can have IPv4 or IPv6 on the backend.
Experience so far
· Generally working very well
· Avoids twice as many VLANs/subnets
· ZTNA (Zscaler) solution currently has limited IPv6 support ☹
· NAT sucks
· Xboxes don't like DNS64
o Due to the way they punch through the firewall (own proprietary solution – Teredo tunnel to Microsoft).
· macOS CLAT hindered by lack of PREF64 on routers
· Windows CLAT don’t even get me started
· Future wired 802.1x may hinder selective DHCPv4
· University environment poses unique challenges
Tim notes that Zscaler are putting a significant amount of effort into IPv6 support.
Only option that worked reliably on Juniper firewalls – strict NAT or 1-to-1 NAT.
What if IPv4 were to break?
Once the broken v4 was withdrawn, most people stopped complaining – it works for almost all things.
However, with Happy Eyeballs a lot more will work, but some are now choosing IPv4.
Day 1 (afternoon)
Hepix IPv6 paper
- Reviewer looks new – and has made a number of comments.
General news
Presented at HEPiX in Canada – pointed out that none of the sites there had done IPv6. Several people came up to David after his talk who hadn't realised that IPv6 was that important. There's no representative from Canada on the HEPiX IPv6 working group.
Dan points out that Canadian CVMFS repository wasn’t accessible via IPv6. Bruno thinks that all Tier-1s claim to be dual stack.
# host cvmfsrepo.lcg.triumf.ca
cvmfsrepo.lcg.triumf.ca has address 206.12.9.117
# host cvmfsrep.grid.sinica.edu.tw
cvmfsrep.grid.sinica.edu.tw is an alias for cvmfs02.grid.sinica.edu.tw.
cvmfs02.grid.sinica.edu.tw has address 202.169.169.112
# host cvmfs-stratum-one.ihep.ac.cn
cvmfs-stratum-one.ihep.ac.cn has address 202.122.33.82
RAL, Nikhef and CERN are OK.
Raul reports some CMS jobs seem to be IPv4 only. Perhaps using a very old version of CMS libraries – will share with Andrea to see if that’s an old version of the factories.
Francesco has a list of all services for the experiments: CEs, SEs, ARC CE, VOBOX, ATLAS cloud, Frontier, xrootd, unknown and undefined, etc. He now looks at each service name and checks if there's an AAAA record. 1211 entries (from CRIC and other sources). Each experiment
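A sketch of the kind of AAAA check involved (my illustration, not Francesco's actual script): ask the resolver for IPv6 addresses only, and treat a lookup failure as "no AAAA". The hostnames are taken from the host output above.

    import socket

    def has_aaaa(hostname):
        """True if the hostname resolves to at least one IPv6 address."""
        try:
            socket.getaddrinfo(hostname, None, family=socket.AF_INET6)
            return True
        except socket.gaierror:
            return False

    if __name__ == "__main__":
        # Hostnames taken from the 'host' output above.
        for name in ("cvmfsrepo.lcg.triumf.ca", "cvmfs-stratum-one.ihep.ac.cn"):
            print(name, "AAAA" if has_aaaa(name) else "no AAAA")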
Andrea will ticket non US sites (and talk to the US about US Tier-2 sites).
FTS now has information on IPv6 yes/no/unknown. Current monitoring doesn't report this – but there's a new MONIT dashboard. Anticipate that this will be rolled out soon, given the rate of progress in monitoring (and requirements from DC24).
Xrootd – we’d need the experiment rep from
ALICE apparently want to be able to rerun old analyses (using old code) with new data – the old code uses xrootd3, which is not IPv6 compliant. They can do this IPv4-only at a local site.
DC24 monitoring
- There’s a defined schema to report and it currently doesn’t report IPv4 vs IPv6 traffic.
Updates from experiments
-------------------------------------
Only CMS represented – but no new news. Didn’t really close any tickets.
Only 5 sites – Torino, Freiburg, Waterloo, Victoria and Australia – have not completed IPv6 storage.
- Torino and Freiburg have been prompted.
Will update list of Tier-2s.
Round table
RAL – Worker nodes at RAL are on LHCONE, but IPv4 only. For a CMS job accessing data from Tier-1, traffic should go over the OPN (but may not yet do so – Martin Bly didn’t know). Martin to talk to James Adams to confirm this.
LHCOPN traffic: a big spike in IPv4 recently, which David Kelsey thinks is RAL and CNAF.
Francesco - non LHC perfsonar testing.
Testing IPv6-only WiFi. Testing option 108 and Kea.
May ask for a slot at the central computing committee. Looking at the world at large – if we treat ourselves as the intranet – but looking at IPv6 management at large: present at 30 institutions, all locally managed and operated. Some of them are resisting IPv6. Did a limited deployment for the symology department. GARR has been ready for a very long time; it is just a matter of carving the time out. If we have to take it one step further it really becomes a cultural thing. Hiring a number of fresh young people – perhaps they can. The surrounding environment is out of any common sense. This will happen sooner or later and we may en
Many of the IP addresses are owned by GARR.
Is there any scope for new services being built IPv6-only? Eduardo points out packet marking on IPv6.
Selling IPv4.
Imperial’s /16s – one via dept of computing, one via NHS merger. Central IT has only had 2 /16s.
Suggestions
· Security angle (as anyone on the network can pretend to be a ?
· If there’s any inclination to sell v4 space, replacement is v6.
o E.g. King's have just sold £8 million of IPv4 space.
· Giving administrators a clear vision.
QMUL – universities are doing more and more with cloud services – Amazon will soon (from January) charge for IPv4 addresses on services.
Cloud bursting – hosting things that
QMUL. Dan has locked down his firewall. IPv4 range
UK university
KIT – IPv6 all over; have found it possible to move CVMFS to IPv6 entirely. Services still on IPv4.
Raises the question: when we find something that is IPv4 on a dual-stack node, how do we inform people? We should go back to the knowledge base.
CERN – new data centre filled with machines – network and servers. All dual stack. Migrating the DHCP server from ISC (which will not be maintained any more) to Kea. Have found many strange things with Kea. Have been trying to migrate for 3 years now.
Legacy devices around the LHC – still have issues. BOOTP wasn't supported (but Dan thinks it might be now). Something strange with IPv6 – renew didn't work – issue with library on
Brunel – Dual stack for 10 years. IPv6 only small subset for 6 years. New storage – CEPH – built to be IPv6 only (but end points have IPv4).
Can't talk to some CMS services that are IPv4-only. Raul will look at his logs to see which CMS services are still on IPv4. Big progress since Fermilab moved to dual stack. The number of such sites should be small.
perfSONAR – stop nscd and disable it in systemd – now everything is working.
---------------------------------------------------------------------------------------------------------------
Day 2 (morning)
HEPiX in Paris (April) – should we do an IPv6 training day?
IPv6 council meeting
CHEP
Book6 project (Brian would like a chapter on WLCG) – see Contents.md in the becarpenter/book6 repository on GitHub.
DC24
IPv6 - measure
BBRv3 – sites unlikely to run it in production
Jumbo frames – a number of sites do this (RALPP, QMUL); sites unlikely to change.
IPv6 – pick some links (at least one link) – try and identify why traffic is going over IPv4. Looking over OPN link from KIT to CERN to identify traffic that is going over IPv4.
Work involves identifying the links, moving
NEED TO MAKE CLEAR THAT IT ISN'T "GO IPv6-ONLY FOR DC24".
Not everything is going from DFN to Geant – not all
Can we plot traffic that is IPv6?
The request for sites to make their counters available has been difficult – post a JSON file; about half of sites do.
Can we extend the schema (on an optional basis) to allow sites to publish IPv4/IPv6 traffic as well as total traffic? View is that it's too close to DC24, too few sites are doing this already, and we don't want to do anything to discourage sites.
Why has there been a growth in IPv4 that's bigger than the growth in IPv6 traffic?
Perhaps we can publish a recommended methodology to work this out.
ACTION – xrootd monitoring is now on the LHC dashboard. We should ask them to report IPv6 if that’s recorded.
Could move to IPv6 peering with IPv4 traffic going to an IPv6 next hop (RFC 5549/8950), as reported at the IPv6 Council.
The last 3 days of DC24 are contingency – and testing – Eduardo is interested.
Next steps: do we have plans to start trying to understand the source of the IPv4 traffic? Get data out of netflow – currently broken at CERN.
Currently a limit of 5 days of netflow at KIT (limited by database capacity). If they try to cover communication over the complete farm it will shrink to ½ a day. Is there a way of exporting only the v4 flows?
RAL – no IPv6 on any of the tape systems. Antares is in a pod – plan – so if it’s going direct to tape it will go IPv4. If it’s going to Echo, it will go IPv6. This should show up on the OPN monitoring.
Could the top talkers that Eduardo/Carmbe
What is reasonable now?
Is there anything we can add?
What are the outputs detailing the traffic we are seeing?
Analysis of what can be moved to IPv6.
A lot of the intelligence coming out of this can be communicated back to tier-1s.
KIT incoming: 7.9 Gbit/s IPv4, 22 Gbit/s IPv6.
Analysis of top talkers would be useful. Can we learn something before CERN gets its netflow collector fixed (which we hope is soon).
Could also look at Imperial and Brunel.
Where do we collate notes? CERNBox? Do we want to put it in a Google doc? CERN has a contract with Google. Either CERNBox or
Next meeting – face to face at CERN, but have one meeting a year away from CERN.
Would it be useful to meet before DC24? End January beginning of February. DC24 starts 12th Feb.
5-6 March. In person at CERN (Afternoon/morning). Note clashes with Internet2 meeting.
Next online meetings:
* Wednesday 13th December 16:00 – CERN Time
* Thursday 18th January 16:00 – CERN time
* Tuesday 6th Feb 17:00 – CERN time.
CHEP PAPER
Happy to make changes to the paper as suggested by the reviewer. Andrea’s paper is more than 8 pages, so don’t try and cut stuff to keep within 8 pages (it’s online only).
Other conferences:
· ISGC end of March – current call for papers end of November
o Nominate Bruno to put in a talk. Result of this IPv6 work for DC24?
· TNC – put in a one pager on IPv6 – Chris/Tim
o DC24 in general and then IPv6 side – towards IPv6 only
The link is here:
https://tnc24.geant.org/submit/
· CHEP next year in Poland.
o Need to give it a slightly different spin – we’ve already done towards IPv6 only.
· Hepix
o Meet in Paris in 15-19 April (after Easter). CEA-IRFU Paris
o Always submit a working group report.
o Should we offer some IPv6 training.
§ Tier1/2
§ IPv6 only clusters – don’t tend to get many networking people.
§ Could it match with worker node roll out (eg Dan’s talk yesterday).
§ Half day – say 3 hours
· Little bit of explanation on IPv6
· Network bit – only scratch the surface
· Administering and configuring nodes
· Highlight some of the programming issues (ex perfsonar)
§ Hepix board was wanting to involve CERN more in this
§ HEPiX by design has no side tracks
· WLCG collaboration in May
o Perhaps could repeat the IPv6 training
o Note that SKA may host a future one.
· Next LHCONE meeting has half a day with SKA (Catania in April)
IPv6 only testing.
Perfsonar – discussed yesterday.
USCMS (and ATLAS) would like to move to supporting IPv6 only worker nodes.
· Request put in yesterday for CMS to have an IPv6 instance of ETF.
· Raul at Brunel has some IPv6 only nodes as reported yesterday.
· Do we want to encourage more members of the working group
· Lots of effort – to deploy IPv6 only.
o Question is what IPv6only means
§ Does it include deployment system
§ Containers IPv6 only?
§ Monitoring
o Still needs to be maintained separately
§ May be able to do it automagically
§ For RAL it’s “easy”
o Could we do CLAT on nodes (as Jen from Google talked about at the IPv6 Council)?
§ Then look at logging (what’s the performance impact of this).
§ One of the threads at IETF hackathon is clat.
· Fermilab and Brookhaven, RALPP, AGLT, QMUL, CERN
· Last year's talk at the IPv6 Council on how cloud providers implement IPv6: "The Cloud, IPv6, and the Enterprise" – Radek Zajic [slides] [video] – an excellent talk on cloud support for IPv6.