WLCG Information System Evolution Task Force

Europe/Zurich
513/R-068 (CERN)

Description
Meeting to discuss the evolution of the WLCG Information System
    • 15:00 15:30
      Proposed JSON format for CE info publishing 30m
      Speaker: Balazs Konya (Lund University (SE))

      Attendance:

      Adrian Coveney, Alessandro Paolini, Andrew McNab, Balazs Konya, Brian Bockelman, David Crooks, Enol Fernandez, Linda Ann Cornwall, Paolo Andreetto, Vincenzo Spinoso, Matthew James Viljoen, Laurence Field, Maarten Litmaath, Alessandro Di Girolamo, Julia Andreeva

      The main discussion concerned the document proposing a JSON format for the description of computing resources:

      https://docs.google.com/document/d/1pg_5Kibc_-Z4JF4_HJyW5xL6GVYKwXxOU7DXf2QP9Ag/edit

      Balazs gave an introduction describing the history of and motivation for this proposal.

      Discussion

      The unique CE-Name identifies a set of compute resources of a similar type. It was suggested to rename "CE Name" to "resource name" to avoid confusion with the Compute Element concept.

      A single block in the file describes a given compute resource along with the endpoint and queue where relevant. In this flat structure there may therefore be many repetitions when the same resource is exposed through several queues and/or endpoints. At the end of the meeting, Julia suggested looking at what has been done for the storage description, where there is also a concept of the resource (storage share or space quota) and of the way to access it (protocol), and where it is possible to define which protocols can access which storage shares.
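
      As an illustration of the repetition issue, the following minimal sketch contrasts a flat list of blocks with a layout where each resource is described once and access points refer to it by name. All field names and host names (resource, cores, endpoint, queue, ce01.example.org) are hypothetical and are not taken from the proposed document.

```python
import json

# Hypothetical flat blocks: one entry per (resource, endpoint, queue) combination,
# so the resource description (here only "cores") is repeated in every block.
flat_blocks = [
    {"resource": "cluster-a", "cores": 1000, "endpoint": "ce01.example.org", "queue": "short"},
    {"resource": "cluster-a", "cores": 1000, "endpoint": "ce01.example.org", "queue": "long"},
    {"resource": "cluster-a", "cores": 1000, "endpoint": "ce02.example.org", "queue": "short"},
]

ACCESS_KEYS = ("resource", "endpoint", "queue")


def normalize(blocks):
    """Describe each resource once and let access points reference it by name."""
    resources = {}
    access_points = []
    for block in blocks:
        name = block["resource"]
        # Everything except the access-related keys describes the resource itself.
        description = {k: v for k, v in block.items() if k not in ACCESS_KEYS}
        resources.setdefault(name, description)
        access_points.append(
            {"endpoint": block["endpoint"], "queue": block["queue"], "resources": [name]}
        )
    return {"resources": resources, "access_points": access_points}


normalized = normalize(flat_blocks)
print(json.dumps(normalized, indent=2))
```

      In the second layout the resource description appears once, in the same way that a storage share is described once and then linked to the protocols that can reach it.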

      Julia said that, apart from the information needed for job submission, which is the most important, we should also foresee other use cases such as benchmarking, accounting, service availability and downtime.

      Laurence said that all of this had already been implemented in GLUE 2.

      Maarten expressed that we may want to design a new information system,
      this time from the bottom up, and maybe even think of it as "GLUE 3",
      but we had better make sure that all use cases are covered,
      because otherwise sites will need to keep running the BDII in parallel.

      Brian described the OSG approach.

      There needs to be a way to locate and discover resources that places minimal requirements on site admins. This is a resource-centric model rather than a queue-centric one, similar to the Amazon approach: a web page where available resources are advertised.

      What has to be defined: the resource type, who is allowed to access it, and how to specify that one wants to use the resource.

      For example, the batch system should not be exposed: it is an internal implementation detail and is not required for job submission.
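
      A resource-centric record along these lines could look like the sketch below. This is only an illustration under assumptions: the field names (type, allowed_vos, request, batch_system) and all values are invented here, and the OSG implementation may differ.

```python
# Hypothetical resource-centric record; every field name and value is an assumption.
resource = {
    "name": "cluster-a",
    "type": "x86_64-batch",                 # resource type
    "allowed_vos": ["atlas", "cms"],        # who is allowed to access it
    "request": {"endpoint": "ce01.example.org", "queue": "short"},  # how to ask for it
    "batch_system": "internal-scheduler",   # internal detail, not meant to be published
}

# Keys considered internal to the site and therefore left out of the published view.
INTERNAL_KEYS = {"batch_system"}


def published_view(record):
    """Return the record without the site-internal implementation details."""
    return {key: value for key, value in record.items() if key not in INTERNAL_KEYS}


def can_access(record, vo):
    """Check whether a given VO is listed as allowed to use the resource."""
    return vo in record["allowed_vos"]


print(published_view(resource))
print(can_access(resource, "atlas"))
```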

      Not all of this information can be generated automatically. On the other hand, human input should be minimized, since it is not easy to keep the information up to date.

      Brian also noted that, according to feedback from site admins, it was easier for them to inject data into AGIS than to provide a correct configuration file to be consumed at the local site. The systems that consume this data should therefore be flexible and able to consume it from different sources.
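
      One way a consumer could stay flexible about where the data comes from is sketched below: descriptions from several sources are merged per resource, with earlier sources taking precedence. The source names, the precedence rule and the field names are assumptions made for illustration only.

```python
import json


def merge_sources(*sources):
    """Merge per-resource descriptions; values from earlier sources take precedence."""
    merged = {}
    for source in sources:
        for name, description in source.items():
            entry = merged.setdefault(name, {})
            for key, value in description.items():
                # Keep the value already provided by a higher-priority source.
                entry.setdefault(key, value)
    return merged


# Hypothetical input 1: a JSON file published by the site itself.
site_file = json.loads('{"cluster-a": {"cores": 1200}}')

# Hypothetical input 2: data entered by hand in a central catalogue.
central_catalogue = {
    "cluster-a": {"cores": 1000, "allowed_vos": ["atlas"]},
    "cluster-b": {"cores": 500},
}

merged = merge_sources(site_file, central_catalogue)
print(json.dumps(merged, indent=2, sort_keys=True))
```

      Here the site-published core count wins for cluster-a, while fields and resources known only to the central catalogue are still picked up.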

      Discussion

      Andrew McNab mentioned that it would be good to preserve the possibility of generating certain information from the service itself, for example benchmarking information. It was not clear whether there was consensus on this point. Maarten made a point about the fragility of scripts that generate data: the more complex the data we need to generate, the more things can go wrong.

      Regarding cluster capacity, it was agreed that this is a complex topic that might require a dedicated discussion.

      Julia asked Paolo what he thinks, from the CREAM point of view, about replacing the BDII with a JSON description. Paolo replied that no new features are possible in CREAM. CREAM itself does not use the BDII, but CREAM information is generated for the BDII, for example for accounting (benchmarking factor). Julia mentioned that for accounting we would need to ask for the APEL client to be changed so that it gets the time-to-work conversion factor from a source other than the BDII.

      Matthew said that EGI will follow up on this proposal and consider the possibility of using an information source alternative to the BDII. Naturally, EGI is not ready to switch immediately to a source other than the BDII. Julia said that there is no plan to drop the BDII now. This would be a gradual process of replacing the BDII with something else for sites that would prefer not to run it, and all use cases should be considered.

      At the end of the discussion it was decided to ask a couple of pioneer sites to provide information in the proposed structure manually. The first experience might come from the UK sites. Since Alessandra and Alastair could not attend today's meeting and present their results, it was proposed to hold the next meeting in two weeks, on the 4th of October (provided that people do not have major conflicts with this date), in order to discuss the UK experience.

    • 15:30 15:50
      Discussion 20m