Resource Trust Evolution TF (Focus: Cloud)
Present:
Alessandra F, Andrei, Baptiste, David G, Doug, Enrico, Federica, Hannah, John, Julie, Linda, Maarten (notes), Mario, Matt D, Mine, Petr, Stephan, Tom
Notes:
Mario takes us through his presentation attached to the agenda, explaining the procedures followed by ATLAS to be able to use storage resources hosted by Google as well as Amazon. He adds that ATLAS can run like that for the time being, but feels the procedures are a bit of a hack and rely too much on individuals. A load-balancer service offered by the cloud provider, sitting in front of the actual storage, needs to be given a CERN DNS alias so that it can be equipped with an IGTF host certificate from the CERN Grid CA.
David G asks why this is considered improper. Mario answers that, as the storage is not at CERN, it is not obvious why it should fall under the CERN Grid CA. David replies that other cloud customers do things similarly and Mine adds that the cloud providers even encourage such procedures. Mario adds that such a load-balancer is not a free service. David confirms that one indeed has to pay for such an extra service.
Stephan adds that OSG are doing something similar for smaller sites that cannot afford to buy certificates: an OSG DNS domain is set up for such a site, for which an OSG certificate can then be used by the site.
Mario asks who should provide the certificate and the "fake" DNS entry. He adds that it should not depend on individual persons. David replies that such entries are not fake, that adding a CNAME to the DNS is a standard service and that the "branding" of the resources determines who should pay for the load-balancer.
Mine asks what the CERN computer security team thinks of this. Mario answers that the computer security officer found it interesting, but did not object to ATLAS going ahead. Mine adds she will check what the FNAL computer security team thinks. David replies that e.g. Cloudflare would anycast your web site using your certificate. He pastes an example into the chat:
your prototypical example …
davidg@u93xdavidg ~
$ probecert www.mit.edu|head -10
Hostname: www.mit.edu
Issuer: C = US, O = DigiCert Inc, OU = www.digicert.com, CN = GeoTrust RSA CA 2018
Not After : Aug 30 23:59:59 2023 GMT
Subject: C = US, ST = Massachusetts, L = Cambridge, O = Massachusetts Institute of Technology, CN = web.mit.edu
SubjectAltNames:
web.mit.edu
alum-dev.mit.edu
alum.mit.edu
betterworld.mit.edu
emergency-dev.mit.edu
davidg@u93xdavidg ~
$ host www.mit.edu
www.mit.edu is an alias for www.mit.edu.edgekey.net.
www.mit.edu.edgekey.net is an alias for e9566.dscb.akamaiedge.net.
e9566.dscb.akamaiedge.net has address 23.2.233.28
e9566.dscb.akamaiedge.net has IPv6 address 2a02:26f0:fe00:2a1::255e
e9566.dscb.akamaiedge.net has IPv6 address 2a02:26f0:fe00:290::255e
davidg@u93xdavidg ~
$ whois 23.2.233.28|grep -v \# |head -20
NetRange: 23.0.0.0 - 23.15.255.255
CIDR: 23.0.0.0/12
NetName: AKAMAI
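The point of the paste is that the name users connect to (www.mit.edu) is only an alias for a host operated by the provider (Akamai), while the certificate still belongs to the institute. As a rough illustration, the alias chain can be extracted from the `host` output with a few lines of Python. This is a minimal sketch, not a tool used by anyone in the meeting: the function names `parse_host_output` and `follow_aliases` are hypothetical, and the parsing assumes the exact line format shown in the paste.

```python
import re

# Sample input: the `host www.mit.edu` output copied from the chat paste above.
HOST_OUTPUT = """\
www.mit.edu is an alias for www.mit.edu.edgekey.net.
www.mit.edu.edgekey.net is an alias for e9566.dscb.akamaiedge.net.
e9566.dscb.akamaiedge.net has address 23.2.233.28
e9566.dscb.akamaiedge.net has IPv6 address 2a02:26f0:fe00:2a1::255e
e9566.dscb.akamaiedge.net has IPv6 address 2a02:26f0:fe00:290::255e
"""

# Assumed line formats, taken from the paste only.
ALIAS_RE = re.compile(r"^(\S+) is an alias for (\S+?)\.?$")
ADDR_RE = re.compile(r"^(\S+) has (?:IPv6 )?address (\S+)$")


def parse_host_output(text):
    """Return (aliases, addresses) parsed from `host` output.

    aliases maps a name to its CNAME target (trailing dot removed);
    addresses maps a name to the list of its IPv4/IPv6 addresses.
    """
    aliases = {}
    addresses = {}
    for line in text.splitlines():
        line = line.strip()
        match = ALIAS_RE.match(line)
        if match:
            aliases[match.group(1)] = match.group(2)
            continue
        match = ADDR_RE.match(line)
        if match:
            addresses.setdefault(match.group(1), []).append(match.group(2))
    return aliases, addresses


def follow_aliases(name, aliases):
    """Follow the CNAME chain from name, stopping at a loop or at the end."""
    chain = [name]
    while chain[-1] in aliases and aliases[chain[-1]] not in chain:
        chain.append(aliases[chain[-1]])
    return chain


aliases, addresses = parse_host_output(HOST_OUTPUT)
chain = follow_aliases("www.mit.edu", aliases)
print(" -> ".join(chain))
print(addresses[chain[-1]])
```

The last name in the chain is the provider's host; its addresses are the ones that `whois` attributes to Akamai in the paste.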
Mario then asks who should provide such services for us and suggests it might be shared between 2 or 3 larger institutes. For example, an OSG cloud extension should be handled by OSG, not CERN. He then points out the current ATLAS contract is with Google US, not EU, and asks if some guidelines should be provided. Mine asks how regionality may matter. Mario replies it might not be desirable for CERN to be responsible for US cloud provider host certificates, and wonders whether CERN host certificates should rather be restricted to member states.
David says that if FNAL were to contract Akamai in the EU, the corresponding hosts should still be equipped with FNAL certificates. Stephan concludes the CA services are extended just like the DNS is. He asks if the reverse DNS also works, which is confirmed. David adds it is your IP address because you rent it. Mario clarifies that you ask for a load-balancer, get the IP address and then there should be a service providing a hostname for it as well as a certificate.
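The reason the alias matters can be sketched in code: a client validates the certificate against the name it connected to, not against the provider's own name for the load-balancer. The example below is a simplified illustration under stated assumptions. The host names (`atlas-store.example.org` for the institute's alias, `lb-1234.cloud.example.net` for the provider's name) are hypothetical, the functions `san_matches` and `certificate_covers` are invented for this sketch, and the matching is a reduced form of the usual rules (exact match, or a wildcard as the entire leftmost label), not a full validation as done by a TLS library.

```python
def san_matches(hostname, san):
    """Return True if one dNSName SAN entry covers hostname.

    Comparison is case-insensitive and ignores a trailing dot. A wildcard
    is honoured only when it is the entire leftmost label and at least two
    further labels follow it (simplified rule, assumed for this sketch).
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    san_labels = san.lower().rstrip(".").split(".")
    if len(host_labels) != len(san_labels):
        return False
    if san_labels[0] == "*":
        return len(san_labels) > 2 and host_labels[1:] == san_labels[1:]
    return host_labels == san_labels


def certificate_covers(hostname, sans):
    """Return True if any SAN entry of the certificate covers hostname."""
    return any(san_matches(hostname, san) for san in sans)


# Hypothetical names: the institute's DNS alias and the provider's own name.
alias_name = "atlas-store.example.org"
provider_name = "lb-1234.cloud.example.net"

# A host certificate issued for the alias only.
cert_sans = ["atlas-store.example.org"]

print(certificate_covers(alias_name, cert_sans))     # client used the alias
print(certificate_covers(provider_name, cert_sans))  # client used the provider name
```

Under these assumptions, clients that connect via the alias see a certificate that matches, whereas the provider's own name for the same load-balancer is not covered, which is why the hostname and the certificate have to be provided together.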
Maarten concludes we appear to have the proper procedures already and just need to polish them. He adds that with the need for user certificates going away in a few years, we can revisit this matter at that time to see if by then we can just use the certificates provided by the cloud providers themselves. David agrees that by that time, domain-validated host certificates may be sufficient, as they would then only need to serve transport layer security.
Maarten asks how CMS have fared in these matters. Stephan answers that so far, CMS have concentrated on computing in the cloud and that cloud storage has not been integrated into Rucio. Mario adds that, compared to the large providers, Seal storage can be an order of magnitude cheaper and sites might want to consider such storage as part of their pledges. He points out that as Seal does not offer load-balancer services, the host certificates need to be handed over for installation by service admins at Seal instead. After a short discussion this method is considered acceptable. Mine adds she will discuss these matters with computer security and network people at FNAL.
Stephan says that if sites want to have part of their resources in the cloud, the experiments should not need to be involved. Maarten answers that the QoS of cloud storage may not be the same as that of on-premise storage: for example, it might become more expensive from one year to the next, forcing a site to reduce its capacity. Stephan replies that sites would normally be expected to give advance notice of disk space going down. David adds that cloud storage might also see funding cuts more easily. Stephan concludes we may need a new pledge category for volatile storage.
Maarten concludes this has been a very productive meeting and that, thanks to cloud R&D in ATLAS and the documentation provided through the links in the presentation, we may be able to have the technicalities of cloud storage solved in the near future, adding that these matters will have to be followed up in CERN IT. He reminds us of further cloud load-balancer documentation that was in fact already made available 3 years ago and could be updated with findings from today's meeting and beyond.