Speaker
Vanessa HAMAR
(CC-IN2P3)
Description
We have been using Univa Grid Engine as our batch scheduling system for four years, to our satisfaction. We focus on the latest major version, 8.2.1, which was deployed at IN2P3-CC four months ago and provides further scalability improvements.
We support about 200 groups and experiments running up to 17,000 jobs simultaneously. The requirements in terms of computing resources, storage, and network differ widely from one group to another, so the batch system must be particularly scalable and reliable, which is what we get with this release.
The new read-only threads work independently of the SGE qmaster, which relieves it of the load previously induced by serving status requests.
In addition, this version introduces a way to limit user requests, which avoids system overloads. It also comes with new information for job accounting and uses a 32-bit range for job IDs.
In this talk, we will present our assessment of this upgrade and give an overview of the issues resolved in the course of operating the service.
Finally, we will present the plans for deploying new features and the roadmap of Univa Grid Engine at IN2P3-CC for the next few months.
Length of presentation (max. 20 minutes)
15 minutes
Authors
Nadia Lajili
(Centre de Calcul de l'IN2P3 (CC-IN2P3))
Suzanne Poulat
(Centre de calcul IN2P3)