EGEE Pre-Production sites meeting

Europe/Zurich
VRVS conference "Pisa" Virtual Room (access it through "All communities")

Description
Plenary meeting with PPS sites
    • 16:00 16:20
      SRM-2 testing in PPS 20m
      Although SRM-2 is not certified yet, experiments are asking the PPS to support their initial testing of SRM-2 capabilities.
      A number of new sites already running the new SRM-2 are going to join PPS.
      This will force a re-organization of the data management in PPS.
      Speaker: Nick Thackray
      • Introduction (new sites joining and why) 5m
        A number of new sites already running the new SRM-2 in the context of the SRM-2 "pilot" are going to join PPS.
        This will force a re-organization of the data management in PPS (e.g. end-points to be published and updated in FTS).
        Sites willing to volunteer for a pre-installation of their SEs with SRM-2 are welcome.
        Sites will also be asked to volunteer to declare SRM-2 SEs as their "Close SE".
      • HEP VO specific testing 5m
        The SRM-2 testing concerns for the time being only HEP VOs.
        Sites mainly dedicated to serving non-HEP VOs (e.g. Biomed, Diligent), although welcome to join the exercise, may find it useful to opt out in order to avoid conflicts.
        In that case they would need to stop supporting the HEP VOs.
      • Installation of 'uncertified' software 5m
        There is no guarantee, so far, that the SRM-2 will be certified before this test activity starts.
        As usual, we will not ask sites to install uncertified software.
        However, sites willing to do so in this case are welcome, provided it is compatible with any other current use of the PPS storage resources.
      • End-points to be reconfigured in FTS: Conflicts? 5m
      • Data in the catalogs to be modified: Conflicts? 5m
        The SRM-2 task force will provide scripts for the migration of the catalogs.
        The migration of the catalogs is not reversible. The migration scripts are meant for use in production.
        Experiments will be aware that data created in the PPS catalogs during the exercise are going to be "lost" afterwards.
        We have to check whether there is any showstopper to the migration of the existing catalogs in PPS.
      • Configure CE with SRM-2 SEs as 'close SE'. Volunteers? 5m
        The list of end-points, with their proposed association with sites, is available at https://twiki.cern.ch/twiki/bin/view/LCG/GSSDendpoints
    • 16:20 16:30
      Re-introduction of pre-deployment testing 10m
      • Overview 5m
        The release to PPS uses two apt/yum repositories. The release is first deployed at CERN, then at CNAF.
        The first repository was used as a buffer for a pre-installation test run at CERN_PPS, with the aim of shielding the PPS against the injection of patches with evident problems.
        This pre-deployment test has not been done during the last six months due to lack of time.
        In these six months we have observed several incidents in which RPMs were delivered to PPS and then quickly withdrawn from the repository due to evident bugs.
        So we would like to re-introduce this buffer, as already foreseen in point 4 (Monday) of the current release procedure (https://twiki.cern.ch/twiki/bin/view/LCG/PPSReleaseProcedures#Certification_PPS).
        We cannot run this test in CERN_PPS without becoming a bottleneck for the release. We therefore propose to create a small pool of sites, together covering almost all the services, in charge of deploying the new release on selected services in less than one day and giving the green light for deployment in the CNAF repository.
        Member sites of this pool will be asked to:
        1. Upgrade their priority services (see list below) as soon as possible (within 1 day at most)
        2. Report the result of the upgrade to the pre-deployment test coordinator, who will summarise the result of the whole test
      • Proposal for pre-deployment pool 5m
        Based on the current distribution of services among PPS sites and the observed reaction time to new releases, we propose the following list of sites to be discussed:
        • PPS-LIP (coordinator) (Priority: PROX, LFC)
        • prague_cesnet_pps (Priority: SE, MON)
        • Birmingham (Priority: CE, RB)
        • CESGA-PPS (Priority: IC, MON)
        • KIAM-PPS (Priority: gLite-UI, SE)
        • PPS-IFIC (Priority: WMS, gLite-CE)
        • CERN_PPS (Priority: BDII)
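        The coordinator's summary step could be sketched as below. This is only an illustration: the site-to-service table is copied from the proposal above, but the `Report` record, the `summarise` function and the green-light rule (every priority service reported as successfully upgraded within 24 hours) are assumptions, since the actual criteria are still to be discussed.

```python
from dataclasses import dataclass

# Priority services per pool site, copied from the proposed list.
POOL = {
    "PPS-LIP": ["PROX", "LFC"],
    "prague_cesnet_pps": ["SE", "MON"],
    "Birmingham": ["CE", "RB"],
    "CESGA-PPS": ["IC", "MON"],
    "KIAM-PPS": ["gLite-UI", "SE"],
    "PPS-IFIC": ["WMS", "gLite-CE"],
    "CERN_PPS": ["BDII"],
}


@dataclass
class Report:
    """Hypothetical upgrade report sent by a site to the coordinator."""
    site: str
    service: str
    ok: bool            # True if the upgrade succeeded
    hours_taken: float  # time between release and completed upgrade


def summarise(reports, pool=POOL, deadline_hours=24.0):
    """Return (green_light, problems) for one pre-deployment test.

    Assumed rule: green light only if every priority service of every
    pool site was upgraded successfully within the deadline.
    """
    by_key = {(r.site, r.service): r for r in reports}
    problems = []
    for site, services in pool.items():
        for service in services:
            report = by_key.get((site, service))
            if report is None:
                problems.append(f"{site}/{service}: no report")
            elif not report.ok:
                problems.append(f"{site}/{service}: upgrade failed")
            elif report.hours_taken > deadline_hours:
                problems.append(f"{site}/{service}: late")
    return (not problems, problems)
```

        A stricter or looser rule (for example, tolerating one late site) would only change the body of `summarise`; the table of priority services stays the same.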
    • 16:30 16:40
      Operation of SAM PPS Client 10m
      • Overview 5m
        In December 2006 we sent out a call for a volunteer site to run the SAM tests for PPS. The task is not very demanding in terms of HW and SW resources (only a UI is strictly needed), but requires a certain level of commitment in support.
        We received expressions of interest from some sites at that time, but after closer analysis the service was deemed not mature enough for a transfer.
        Now the deployment of the SAM client has improved and the application itself is more stable, so we think it is time to pick up this thread again.
      • Pre-requisites to run SAM service for PPS 5m
        Pre-requisites for the site:
        • 1 node with AFS to run the SAM client (Oracle is not needed; we interface to the SAM server in production)
        Pre-requisites for the administrator:
        • valid grid user certificate
        • member of the ops VO (membership to be requested if needed)

        Initial set-up steps (at startup and when new sensors are deployed; documentation available at https://uimon.cern.ch/twiki/bin/view/LCG/PPSSamInstallation):
        • install and configure SAM client
        • create a SAM instance directory on AFS (not much more than the copy of the existing AFS directory)
        • set-up cronjobs
        Service development:
        • deployment of new sensors according to the deliveries done by the PPS coordination team (very rare, on demand)
        Operational commitments:
        • GGUS tickets (on the order of 1-5 per week)
        • update of CAs (frequent)
        • update of SAM client (rare)
    • 16:40 16:55
      Case study for T1s in PPS: Pre-view of production CEs 15m
      We would like to draw the attention of the T1s in PPS to a deployment scenario we have recently implemented at CERN_PPS.
      Basically, on the occasion of major and 'painful' upgrades (e.g. the release of SLC4 WNs), the site managers of CERN-PROD deployed the new middleware, which was still due to stay in PPS for some time, in advance on a partition of the production system. The new production CEs were then published in CERN_PPS. This proved to have some minor side effects (e.g. some confusion in the information system with respect to the published site names) but a number of very valuable advantages:
      The PPS users (experiments) could run real production jobs on a batch system of consistent scale, accessing it through CERN_PPS, and so tune their applications in advance.
      The CERN-PROD administrators could start working on the new release before it actually happened, and benefited considerably from early feedback by users. All this testing was done under lighter pressure, since their service was still considered to be PPS.
      The CERN_PPS administrators could benefit from the experience of the production managers to improve their own knowledge.

      For anyone interested in this kind of deployment scenario, the corresponding layout of CERN_PPS can be found at:
      http://egee-pre-production-service.web.cern.ch/egee-pre-production-service/index.php?dir=./PPSsites/CERN_PPS/&