Speaker
Stefano Dal Pra
(INFN)
Description
The WLCG community and many groups in the HEP community have based
their computing strategy on the Grid paradigm, which has proved successful
and still meets its goals. However, Grid technology has not spread
much to other communities; in the commercial world, the cloud
paradigm is the emerging way to provide computing services.
WLCG experiments aim to integrate their existing
computing model with cloud deployments and to take advantage of
so-called opportunistic resources (including HPC facilities), which are
usually not Grid compliant. One feature missing from the most common
cloud frameworks is the concept of a job scheduler, which plays
a key role in a traditional computing centre by enabling fairshare
based access to the resources for the experiments in a scenario
where demand greatly outstrips availability.
At CNAF we have started, as a preproduction service, to offer
access to the Tier-1 computing resources as an OpenStack
based cloud service. The system exploits the dynamic partitioning
mechanism already used to enable multicore computing, which allowed us
to avoid a static split of the computing resources in the Tier-1 farm
while permitting a share friendly approach.
The hosts in a dynamically partitioned farm may be moved into or out of
the partition, according to suitable policies for the request and release
of computing resources. Nodes requested for the partition give up
their current role and become available to play a different one. In the cloud
use case, hosts may switch from acting as worker nodes in the batch
system farm to acting as cloud compute nodes, made available to tenants.
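The role switch described above can be illustrated with a minimal Python sketch. It is not the CNAF implementation: the `Role` states, the `Host` fields, the "least loaded first" request policy and the assumption that a requested host drains its running jobs before changing role are all hypothetical choices made for illustration, and no LSF or OpenStack call is involved.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    BATCH = "batch worker node"
    DRAINING = "requested, finishing jobs"   # hypothetical intermediate state
    CLOUD = "cloud compute node"


@dataclass
class Host:
    name: str
    role: Role = Role.BATCH
    running_jobs: int = 0


def request_hosts(farm, wanted):
    """Mark up to `wanted` batch hosts as requested, least loaded first."""
    candidates = sorted(
        (h for h in farm if h.role is Role.BATCH),
        key=lambda h: h.running_jobs,
    )
    selected = candidates[:wanted]
    for host in selected:
        host.role = Role.DRAINING
    return selected


def update_roles(farm):
    """Switch requested hosts to the cloud role once they are idle."""
    for host in farm:
        if host.role is Role.DRAINING and host.running_jobs == 0:
            host.role = Role.CLOUD


def release_hosts(farm, count):
    """Return up to `count` cloud hosts to the batch farm.

    Assumes, for simplicity, that no tenant workload is left on them.
    """
    released = [h for h in farm if h.role is Role.CLOUD][:count]
    for host in released:
        host.role = Role.BATCH
    return released


farm = [Host("wn01", running_jobs=2), Host("wn02"), Host("wn03", running_jobs=1)]
request_hosts(farm, 2)   # wn02 and wn03 are requested
update_roles(farm)       # only the idle wn02 switches role immediately
```

In this sketch no host is statically assigned to either side: the same machine can serve batch jobs or cloud tenants depending on the current request and release decisions.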
In this paper we describe the dynamic partitioning concept, its
implementation and integration with our current batch system, LSF. We
then present results for the dynamic cloud use case.
Authors
Stefano Dal Pra
(INFN)
Dr
Vincenzo Ciaschini
(Istituto Nazionale Fisica Nucleare (IT))
Co-author
Luca dell'Agnello
(INFN)