GridPP Cloud Meeting

Europe/London
Virtual Only

GridPP cloud meeting. 1 Feb 2013
 
 
Present: 
--------
Alessandra Forti
Andrew Washbrook
Gareth Roy
Christopher J. Walker (minutes)
Daniela Bauer
David Colling (Chair)
Simon Fayer
Raja Nandakumar
Matt Doidge
Mark Mitchell
Robert Frank
Andrew Lahiff
Wahid
John Green
Peter Gronbech
David Crooks
Peter Love
Linda Cornwall
Kashif Mohammad
Robin Long
Ian Collier
 
Atlas (Peter Love)
===================
 
Tools exist to run VMs on various cloud deployments, including
StratusLab, OpenStack and Nimbus.
 
Atlas is also looking at VMware ESX and the EGI federated cloud.
 
Atlas has a cloud scheduler - and needs to run one in the UK. It spins
up machines according to the requirements of the jobs submitted to the
Condor queue. Condor can also submit to EC2-compatible clouds, including
StratusLab and OpenStack. See: cloudscheduler.org
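
A minimal sketch of the idea described above (booting VMs to match the
jobs waiting in the queue). This is illustrative only and is not the
cloud scheduler's actual algorithm: the Job class, the vms_to_boot
function and all of its parameters are hypothetical names, and it
assumes every VM offers the same number of cores.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """An idle job waiting in the batch queue (hypothetical, simplified)."""
    vm_type: str  # the kind of VM image the job asks for
    cores: int    # number of cores the job needs


def vms_to_boot(idle_jobs, unclaimed_vms, cores_per_vm, spare_slots):
    """Return {vm_type: number of extra VMs to start}.

    idle_jobs     -- list of Job objects waiting in the queue
    unclaimed_vms -- {vm_type: VMs booted or booting but not yet running a job}
    cores_per_vm  -- cores offered by one VM (assumed uniform)
    spare_slots   -- how many more VMs the cloud will let us start
    """
    # Total cores requested by idle jobs, per VM type.
    demand = Counter()
    for job in idle_jobs:
        demand[job.vm_type] += job.cores

    plan = {}
    for vm_type, cores in sorted(demand.items()):
        needed = -(-cores // cores_per_vm)  # ceiling division
        extra = max(0, needed - unclaimed_vms.get(vm_type, 0))
        extra = min(extra, spare_slots)     # never exceed the remaining quota
        if extra:
            plan[vm_type] = extra
            spare_slots -= extra
    return plan
```

For example, with ten idle single-core "atlas" jobs, one unclaimed
"atlas" VM and 4-core VMs, the sketch asks for two more "atlas" VMs,
provided the quota allows it.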
 
DC: Is this known to the people in EIS? This ought to be generally
useful, not just for Atlas.
 
At the GDB, an EC2 interface seems to be the assumed default.
 
Plans for the next two weeks: Peter is getting familiar with the EC2 cloud
tools and their VMware instance. He has played with the OpenStack
interface at CERN.
 
DC: Imperial's resource is for general use, not just for CMS - and
Atlas is encouraged to use it.
 
ACTION: DC to get contact details for Imperial's cloud into the minutes. 
 
NOTE ** For now contact Adam Huffman (a.huffman@imperial.ac.uk), cc'ing me (d.colling@imperial.ac.uk) - a generic email address will be set up shortly **
 
CMS (Andrew Lahiff)
===================
 
Exactly the same workflow as on the general grid.
 
Data is read from EOS using xrootd, then written back to EOS on
HLT. They are working on scaling up - up to 4000 jobs. Jobs are starting
to fail because they cannot access the input data - 700 jobs
worked. Possibly a network issue - the plan is to scale up access.
 
A reprocessing job is also running at Imperial. A Glidein WMS factory has
been set up at RAL, and jobs are submitted to the GridPP cloud - looking
forward to the size of the cloud increasing. They had an issue
yesterday - probably a firewall issue at Imperial.
 
Glidein WMS at RAL could be used for any number of other VOs. 
 
LHCb (Raja Nandakumar)
======================
 
Nothing really to report. Progress is slow because they have had other
problems. In the next couple of weeks they plan to get a few jobs running
on the Imperial cloud.
 
Dirac has a backend that can talk to a cloud. Raja doesn't know
offhand which cloud. 
 
GridPP cloud status at Imperial (David Colling)
===============================================
 
Just enlarged to 200 cores. DC encouraging Adam Huffman (Imperial) to
post status to the wiki.
 
Brunel, Oxford, RAL have clouds. 
 
ACTION: Each site that has cloud resources to create a twiki page with
its status - linked from the GridPP cloud twiki page. Ian Collier to chase this up.
 
 
Connections to other cloud projects
===================================
 
EGI federated cloud project finishing in March. 
 
DC: We should interact as much as we can with them. 
 
ACTION: Dave Colling, Dave Wallom to interact with these other cloud
projects and report back in 2 weeks time.
 
General Discussion
==================
 
Mark Mitchell: IPv6 - what's the status of IPv6?
DC: GridPP cloud at Imperial has IPv6 connectivity.
ACTION: DC to contact Dave Kelsey to see how you can best contribute to
IPv6 work.
Mark Mitchell: Notes that the cloud scheduler presentation looks good. 
 
Ian Collier: CernVM has potentially lots and lots of mileage, but you
may not want to put all your eggs in that basket. There's even a micro
CernVM in which the OS itself is in CVMFS.
 
There was some discussion about the security requirements of running VMs.
 
Ian Collier: Security - EGI federated cloud - it isn't clear people are
thinking about this, but we need to.
 
DC: Do we know how the images are built for the HLT farm?
Andrew: No. 
DC: Doesn't think they are using CernVM, but not actually sure - need to 
Ian: There is a HEPiX working group on image sharing.
 
At the moment, we are grid sites with certain security requirements,
and it's not clear that all cloud sites are meeting them. At RAL, they
require that contextualisation sets up centralised logging.
 
The EGI federated cloud is distributing images with credentials baked in.
If a proxy is stolen, it's valid for 12 hours. Other credentials
potentially have a longer life. We need to exercise caution here.
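
The arithmetic behind that concern can be sketched as follows. This is
only an illustration: the 12-hour proxy lifetime comes from the
discussion above, while the function name exposure_window, the
timestamps and the one-year lifetime used for a longer-lived credential
are hypothetical.

```python
from datetime import datetime, timedelta


def exposure_window(issued_at, lifetime, stolen_at):
    """How long a stolen credential stays usable after the theft.

    issued_at -- when the credential was created
    lifetime  -- how long it is valid for
    stolen_at -- when it was copied by an attacker
    """
    expires_at = issued_at + lifetime
    # A credential stolen after it has expired gives no usable window.
    return max(timedelta(0), expires_at - stolen_at)


issued = datetime(2013, 2, 1, 9, 0)   # hypothetical issue time
stolen = datetime(2013, 2, 1, 14, 0)  # hypothetical theft, 5 hours later

# 12-hour proxy: usable for the remaining 7 hours.
proxy_window = exposure_window(issued, timedelta(hours=12), stolen)

# Hypothetical long-lived credential baked into an image (one year).
baked_in_window = exposure_window(issued, timedelta(days=365), stolen)
```

The shorter the lifetime, the smaller the window an attacker has, which
is why long-lived credentials baked into a shared image need more care.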
 
ACTION: Ian Collier to find the HEPiX report on VMs - specifically on how
we trust an image, and on security requirements. It was presented at a recent GDB.
 
 
CJW: Whilst sites can log the owner of a VM - and its IP address,
perhaps the responsibility for other logging should fall to the VM
owner (the VO). 
 
Ian pointed out that agreed grid security policies require logging at
the site.
 
Linda did put a proposal in for cloud security work. It was rejected, but
the suggestion was that we do it anyway.
 
ACTION: John Green to take an active role in security of the cloud work. 
 
AOB
===
Peter Love sends his apologies for the next meeting (in 2 weeks). 
 
 
ACTIONS
=======
 
ACTION 1: DC to get contact details for Imperial's cloud into the minutes.
ACTION 2: Each site that has cloud resources to create a twiki page with
its status - linked from the GridPP cloud twiki page. Ian Collier to chase this up.
ACTION 3: DC to contact Dave Kelsey to see how you can best contribute to
IPv6 work.
ACTION 4: Dave Colling, Dave Wallom to interact with these other cloud
projects and report back in 2 weeks time.
ACTION 5: Ian Collier to find the HEPiX report on VMs - specifically on how
we trust an image, and on security requirements. It was presented at a recent GDB.
ACTION 6: John Green to take an active role in security of the cloud work.
 
 
 
    • 14:00 14:10
      Atlas Status and Plans 10m
    • 14:10 14:20
      CMS Status and plans 10m
    • 14:20 14:30
      LHCb Status and plans 10m
    • 14:30 14:40
      GridPP Cloud Status at Imperial 10m
    • 14:40 14:50
      Other UK Cloud sites status and plans and how we can interact with them 10m
    • 14:50 15:00
      Connections to other projects 10m
    • 15:00 15:02
      AoB 2m