Speaker
Marko Slyz
(FNAL)
Description
The Dark Energy Survey (DES) uses a CCD camera installed on the Blanco
telescope at Cerro Tololo, Chile. The goal of the survey is to study
the effect known as Dark Energy.
DES uses Fermigrid for nightly processing, for quality assessment of
images, and for the detection of Type Ia supernovae. Nightly
processing needs to be carried out for each of the 105 nights in a
season on which DES acquires data, and must be completed before
observations begin on the following night. This was seen as feasible
on Fermigrid because the requirements for memory and CPU were similar
to those of HEP jobs. Fermigrid used some HEP computing techniques --
among them, the CernVM File System (CVMFS) for storing software, and
disks on the worker nodes for each job's scratch space -- that were
novel to cosmology experiments accustomed to using HPC machines.
At the same time, we learned of other compute requirements which were
not well served by the existing model, but were still well suited
to the basic approach of loosely coupled, high-throughput computing
used in HEP. We are working to support workflows with large memory
requirements, workflows that require multi-core CPUs, and workflows
that cache calibration data on the worker nodes.
DES started running production on Fermigrid in August of 2014. We
present how we are addressing some notable problems with the current
system, including the variable amount of time it takes for files to be
visible on worker nodes after they're first uploaded to CVMFS, low CPU
efficiency, and jobs hanging while finishing.