multiONE meeting

31/S-023 (CERN), Europe/Zurich

  • Edoardo first reviewed the background and evolution of LHCone and highlighted the growing concerns raised by its continued expansion to include other communities. These concerns are two-fold: NRENs worry that if LHCone carries all traffic they lose the ability to prioritise it, and sites worry that they are losing control over site access, as LHCone traffic often bypasses site firewalls completely.
  • Edoardo then presented three potential options that had been considered to deliver identification of traffic on a per-VO basis:
    • Differentiated data centre domains (VRF/VXLAN)
    • Assigning a dedicated IPv6 address range to each collaboration
    • Applications tag IP packets with information that can be used by network infrastructure to assign traffic to the correct VRF.
  • The first two options require action from site infrastructure teams; the third could be implemented by the VOs themselves, but also by site infrastructure. (Illustrative sketches of the VRF/VXLAN and tagging approaches follow this list.)
  • Key questions from the discussion that followed were:
    • What is the timescale? There is no deadline for traffic separation, but we need to avoid a situation where there is no viable solution if a site or NREN refuses to allow a new community to join, or requires improved traffic identification.
    • Will solutions be implementable at all sites? This is one of the points to investigate, although some options are likely to be more demanding in terms of site infrastructure or effort.
    • Will solutions scale to many VOs, especially from non-HEP communities? From the networking viewpoint, there is no real limit to the number of VRFs. Solutions implemented by sites rather than by VOs are, however, easier to extend to non-HEP VOs.
    • Will LHCone remain or will there be pressure to separate out LHC traffic? NRENs are unlikely to want to treat LHC experiments individually but there might be pressure from sites that just support a single experiment.
    • What about storage? There will be discussions with Kuba, but today we have dedicated storage servers for the large experiments. For smaller experiments, proxies could be used to identify traffic from a shared storage pool.
    • What about VO boxes? If these need to communicate with, or be contacted by, the outside world, then their traffic should also be identified on a per-VO basis.
    • Gavin: can OpenStack/Kubernetes deliver traffic identification for jobs from multiple VOs on a single machine? Ricardo: in principle, yes.
  • It was agreed that we would meet again before the September GDB, probably in the week of September 2nd, at which time CM attendees should report on how we could go about testing traffic separation by each of the three proposed options, and in any other ways that might be possible.
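
For context, a minimal sketch of what the first option could look like at a site: one VXLAN interface and one VRF per VO, created with standard iproute2 commands. The interface names, VNI assignments and parent device below are invented for illustration; this is a sketch of the technique, not a recommended site configuration.

    import subprocess

    # Invented VNI assignments; a real deployment would allocate these centrally.
    VO_VNI = {"atlas": 1001, "cms": 1002}

    def create_vo_vxlan(vo: str, parent: str = "eth0") -> None:
        """Create a VXLAN interface and VRF for one VO (requires root)."""
        vni = VO_VNI[vo]
        vrf, vx = f"vrf-{vo}", f"vx-{vo}"
        # One routing table / VRF per VO, bound to its own VXLAN network identifier.
        subprocess.run(["ip", "link", "add", vrf, "type", "vrf",
                        "table", str(vni)], check=True)
        subprocess.run(["ip", "link", "add", vx, "type", "vxlan", "id", str(vni),
                        "dev", parent, "dstport", "4789"], check=True)
        subprocess.run(["ip", "link", "set", vx, "master", vrf], check=True)
        for dev in (vrf, vx):
            subprocess.run(["ip", "link", "set", dev, "up"], check=True)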
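
For the tagging option, the minutes do not name a mechanism; one plausible choice (an assumption here) is for the application to set the DSCP bits of its outgoing packets, which network equipment can then use to steer traffic into the correct VRF. The VO-to-DSCP mapping below is invented for illustration.

    import socket

    # Invented example mapping: each collaboration gets its own DSCP code point.
    VO_DSCP = {"atlas": 10, "cms": 12, "lhcb": 14, "alice": 16}

    def open_tagged_connection(host: str, port: int, vo: str) -> socket.socket:
        """Open a TCP connection whose packets carry the VO's DSCP mark."""
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        # DSCP occupies the upper 6 bits of the 8-bit IPv6 Traffic Class field.
        tclass = VO_DSCP[vo] << 2
        # IPV6_TCLASS is exposed by Python on Linux; 67 is its numeric value there.
        opt = getattr(socket, "IPV6_TCLASS", 67)
        sock.setsockopt(socket.IPPROTO_IPV6, opt, tclass)
        sock.connect((host, port))
        return sock

Because the tag is set by the application itself, a rogue application could in principle set another VO's mark; this is the masquerading question that comes up again in the Tony/Kuba discussion below.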

Points raised from discussions between Tony & Kuba, 24th July

  • As far as incoming traffic is concerned, there does not seem to be any need to distinguish between traffic to EOSPUBLIC for different VOs: the files accessed are spread across the different servers in the pool, so all relevant VOs need to access all servers.
  • If, as is almost certain, EOSPUBLIC servers pull data from remote sites, then this traffic does need to be identified per VO. What are the options for doing this? (A sketch of one possibility follows this list.)
    • creating a temporary server for the transfer, in which case OpenStack can presumably handle it as for a batch server;
    • telling the application on the EOS server the relevant VO and using the tagging approach; or
    • other?
  • This led to some questions about how tagging would operate.
    • Is this secure? If it is handled at the application level, can a rogue user masquerade as being from a different VO?
    • Can tagging coexist with the VRF/VXLAN or IP-range approaches?
  • Kuba agreed to review the different transfer use cases for the September meeting, together with the applicability and implications of the different options for identifying traffic per VO.
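
To make the server-pull case concrete, here is a minimal sketch assuming each collaboration is assigned a dedicated IPv6 prefix (the second option above): a shared storage node binds each outbound transfer to a VO-specific source address, so the network can classify the flow by source prefix alone, with no application tagging. The VO names and addresses are invented documentation values.

    import socket

    # Invented per-VO source addresses drawn from a documentation prefix.
    VO_SOURCE_ADDR = {
        "na62": "2001:db8:62::10",
        "ams": "2001:db8:a5::10",
    }

    def pull_from_remote(host: str, port: int, vo: str) -> socket.socket:
        """Open an outbound connection whose source address identifies the VO."""
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        # Binding to the VO's address identifies the flow by source prefix;
        # this can also coexist with DSCP tagging if both are wanted.
        sock.bind((VO_SOURCE_ADDR[vo], 0))
        sock.connect((host, port))
        return sock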