WLCG-OSG-EGEE Operations meeting

Europe/Zurich
28-R-15 (CERN conferencing service (joining details below))

Nick Thackray, Steve Traylen
Description
grid-operations-meeting@cern.ch
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure, based on weekly reports from the attendees. The reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for the sites to get a summary of the weekly WLCG activities and plans.
Attendees:
  • OSG operations team
  • EGEE operations team
  • EGEE ROC managers
  • WLCG coordination representatives
  • WLCG Tier-1 representatives
  • other site representatives (optional)
  • GGUS representatives
  • VO representatives
  • To dial in to the conference:
    a. Dial +41227676000
    b. Enter access code 0157610

    NB: Reports were not received in advance of the meeting from:

  • ROCs: France, Northern Europe, South-Western Europe
  • VOs: CMS, ALICE and ATLAS
  • Minutes
      • 16:00 - 16:05
        Feedback on last meeting's minutes 5m
      • 16:01 - 16:30
        EGEE Items 29m
        • Grid-Operator-on-Duty handover
          From: ROC Germany-Switzerland / ROC Central Europe
          To: ROC South-Western Europe / ROC France


          NB: Could the grid operator-on-duty teams please submit their reports no later than 12:00 UTC (14:00 Swiss local time).

          Issues:
          1. (CE ROC) Backup team
            Ticket summary:
            extended   : 41
            opened     : 55
            closed     : 12
            2nd mail   : 4
            total      : 112

            Problem with the Dashboard: the ticket list has not been refreshing itself since Wednesday. A notification was sent.
        • PPS Report & Issues
          PPS reports were not received from these ROCs:
          AP, FR, IT, NE, SWE

          Issues from EGEE ROCs:
          1. Nothing to report
          Release News:
          • Release of gLite 3.1.0 Update06 to PPS done
            • new voms certificate for US-ATLAS server (sync to production)
            • uberftp for the glite-UI node
            • lcg-tags added to glite 3.1 UI, WN
            • lcg-infosites added to the glite 3.1 WN
          • Javier Lopez and Esteban Freire from PPS-CESGA have joined the PPS Coordination team. Their main task is to follow up the roll-out of the gLite middleware updates from Certification to PPS.
        • EGEE issues coming from ROC reports
          None.
        • The lcg-RB to be moved to no-maintenance mode. 15m
          JRA1 and SA3 have now moved the old lcg-RB to zero-maintenance mode. The gLite 3.1 WMS, whether on SL3 or SL4, is the supported solution.
      • 16:30 - 17:00
        WLCG Items 30m
        • Tier 1 reports
          • GridKa
            SRM lockups, possibly caused by memory subsystem interaction with the Java VM. Severity: moderate
          • BNL-LCG
            Monday 17 - Thursday 20
            User jobs get stuck
            Cause: too few movers available on some pool nodes
            Severity: one user affected
            Remediation: increase the number of movers
            Problems: 
            dCache had bad local account mappings for some critical users, which caused USATLAS/ATLAS data transfer failures.
            
            Cause:
            We did an in-place GUMS (Grid User Management System) upgrade on Thursday afternoon. Some critical users were mapped to the wrong accounts due to a configuration file problem. This went undetected because GUMS still provided the mapping service, and the generated grid map appeared to be "OK" when it was not. The symptom did not show up until several hours later, at midnight, when dCache regenerated its map file from the GUMS server and thereafter experienced data transfer failures. Even though the GUMS update had been properly announced, we might not have been able to discover this type of problem.
            
            Impact: a fraction of USATLAS data transfers failed between midnight and 10:30 AM Friday morning.
            
            Solution:
            
            Short term, we corrected the configuration file errors and let dCache regenerate its map file, which recovered the data transfers. To fix this problem and prevent future occurrences, we will make two improvements:
            
            1) The dCache team might want to consider regenerating the grid map file during prime business hours, e.g. 9:00 AM and 3:00 PM.
            2) We will develop Nagios probes to validate the certificate mapping for the critical users: Nurcan's production certificate and Hiro's data transfer certificates. (Please let us know of any other critical certificates.) A sketch of such a probe is given after this list.
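            As an illustration of improvement 2), a minimal sketch of such a probe, assuming the conventional grid-mapfile format ("/DN" account, one entry per line); the file path, DNs and account names are hypothetical placeholders, not BNL's real configuration:
            
            #!/usr/bin/env python
            # Hypothetical Nagios probe: check that critical DNs map to the
            # expected local accounts in the grid map file used by dCache.
            import re
            import sys
            
            MAPFILE = "/etc/grid-security/grid-mapfile"   # assumed location
            CRITICAL = {                                  # DN -> expected account
                "/DC=org/DC=doegrids/OU=People/CN=Example Production": "usatlas1",
                "/DC=org/DC=doegrids/OU=People/CN=Example Transfer": "usatlas2",
            }
            
            def load_mappings(path):
                """Parse grid-mapfile lines of the form: "/DN" account"""
                mappings = {}
                for line in open(path):
                    m = re.match(r'^\s*"([^"]+)"\s+(\S+)', line)
                    if m:
                        mappings[m.group(1)] = m.group(2)
                return mappings
            
            def main():
                mappings = load_mappings(MAPFILE)
                bad = [dn for dn in CRITICAL if mappings.get(dn) != CRITICAL[dn]]
                if bad:
                    print("CRITICAL: wrong or missing mapping for: " + ", ".join(bad))
                    return 2   # Nagios convention: exit 2 == CRITICAL
                print("OK: all %d critical mappings correct" % len(CRITICAL))
                return 0
            
            if __name__ == "__main__":
                sys.exit(main())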
            
            
            Main symptom was a slower throughput to HPSS: files kept being
            flushed to HPSS but stayed precious.
            Cause: due to a Solaris/Linux difference, the script that stages files
            to HPSS was not recognizing flushed files correctly and was not working
            reliably on the Thumper.
            Severity: apart from a minor slowdown, a few files got corrupted. In the
            case that a file was written, deleted, and then rewritten with the same
            name, it could happen that the old copy was the one actually kept in
            HPSS.
            Remediation: the Thumper was reassigned as a read-only node.
            Long-term solution: more work is needed to guarantee that Thumpers in
            the write pool work reliably.
            
            Friday 14 - Saturday 15
            Production had problems reading files.
            Cause: no pool was actually assigned to production, because of the
            reconfiguration done prior to the HPSS upgrade
            Severity: USATLAS production affected
            Remediation: reset the configuration
            
            dCache went down
            Cause: unknown - still investigating
            Severity: site down
            Remediation: restart dCache core servers
            
            Saturday 15
            User clients hung because of suspended requests
            Cause: due to HPSS upgrade instability, many requests were suspended
            Severity: USATLAS production affected
            Remediation: retry the requests
            
            Ongoing
            Files disappear, more so during the HPSS upgrade.
            Cause: still unknown - user activity is the primary suspect
            Severity: some files are lost
            Remediation: manually retrieve the list of lost files, and clean up the
            data catalog so users do not request files that are not available
            
            Friday 21
            dCache was down
            Cause: a power failure in the facility brought down the rack, and the
            UPS servicing that rack was not working properly. One of the machines
            in the rack was the PNFS server.
            Severity: system down for less than an hour
            Remediation: system restarted
            Long-term solution: fix the UPS
            
          • USCMS-FNAL-WC1
            Tests show the SE and SRM as down, but this is not true; there are many ongoing transfers.
        • WLCG issues coming from ROC reports
          None.
        • WLCG Service Interventions (with dates / times where known)
          Link to CIC Portal (broadcasts/news), scheduled downtimes (GOCDB) and CERN IT Status Board

          Time at WLCG T0 and T1 sites.

        • FTS service review

          Please read the report linked to the agenda.

          Speakers: Gavin McCance (CERN), Steve Traylen
        • ATLAS service
          See also https://twiki.cern.ch/twiki/bin/view/Atlas/TierZero20071 and https://twiki.cern.ch/twiki/bin/view/Atlas/ComputingOperations for more information.

          • From Rod Walker
            A user with Email in the DN had problems accessing dCache at two sites, SARA and BNL. I think it's the old problem of Java GSI chewing up the Email part. The workaround is to put the various permutations in the dCache mapfile. The response from SARA indicates that they have fewer permutations than TRIUMF. TRIUMF has:
            # rpm -qf /opt/d-cache/bin/grid-mapfile2dcache-kpwd
            d-cache-lcg-6.2.0-1.noarch
            It is still not clear what is installed at BNL. At SARA maybe there is a log saying which DN was refused. This issue is reported to clarify what should be in the mapfile, and whether all sites provide this; a sketch of the permutations in question follows.
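            For illustration, a minimal sketch that emits the usual spellings of the e-mail component seen in mapfiles; the variant list (Email, E, emailAddress) is an assumption based on common practice, and the DN is hypothetical:
            
            #!/usr/bin/env python
            # Sketch: given one DN containing an e-mail component, print the
            # spelling permutations that different GSI stacks may produce.
            
            EMAIL_KEYS = ["Email", "E", "emailAddress"]   # assumed variants
            
            def permutations(dn):
                """Return the DN rewritten with each known e-mail attribute name."""
                variants = set()
                for src in EMAIL_KEYS:
                    token = "/%s=" % src
                    if token in dn:
                        for dst in EMAIL_KEYS:
                            variants.add(dn.replace(token, "/%s=" % dst))
                return sorted(variants) or [dn]
            
            if __name__ == "__main__":
                dn = "/C=XX/O=Example/CN=Some User/Email=some.user@example.org"
                for variant in permutations(dn):
                    print('"%s" someaccount' % variant)   # mapfile-style line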
          • From Campana et al.
            In many sites there is a mismatch between the all-inclusive information published for a CE and the information published in the VOViews. As an example, the queue shown below supports only ATLAS; therefore the number of waiting jobs in the inclusive view should be the same as the one in the ATLAS VOView. But it is not: the VOView publishes all zeroes. Moreover, there are some queues where the numbers of waiting jobs for all views do not add up to the total published in the inclusive view. In total more than 130 ATLAS queues are affected, among them almost all T1s. Since the WMS uses the information in the VOView, and the latter is generally the one published wrongly, ATLAS is submitting jobs almost randomly, with jobs accumulating at small sites. The issue is extremely severe. We would like the operations team to investigate the reason for so many mismatches and to chase site by site until the problem is cured. Below is a list of currently problematic CEs as of today; a sketch of the consistency check follows the list.
          • The average ATLAS job requires 1.1 GB of memory per core. CERN publishes 1 GB of RAM; therefore CERN is empty of ATLAS production jobs. We would like to ask CERN to evaluate whether 1 GB of RAM is realistic. In case it is, we should start thinking about publishing different subclusters for different types of machines.
          • The requirements for the size of the ATLAS SW installation area were discussed 4 years ago and are surely obsolete. The ATLAS SW manager would like to ask for 100 GB of shared space in which to install ATLAS software at every ATLAS T1 and T2 site.
            # tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas, RO-02-NIPNE, grid
            dn: GlueCEUniqueID=tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas,mds-vo-name=RO-02-NIPNE,o=grid
            objectClass: GlueCETop
            objectClass: GlueCE
            objectClass: GlueSchemaVersion
            objectClass: GlueCEAccessControlBase
            objectClass: GlueCEInfo
            objectClass: GlueCEPolicy
            objectClass: GlueCEState
            objectClass: GlueInformationService
            objectClass: GlueKey
            GlueCEHostingCluster: tbat01.nipne.ro
            GlueCEName: atlas
            GlueCEUniqueID: tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas
            GlueCEInfoGatekeeperPort: 2119
            GlueCEInfoHostName: tbat01.nipne.ro
            GlueCEInfoLRMSType: torque
            GlueCEInfoLRMSVersion: 2.1.6
            GlueCEInfoTotalCPUs: 44
            GlueCEInfoJobManager: lcgpbs
            GlueCEInfoContactString: tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas
            GlueCEInfoApplicationDir: /opt/exp_soft
            GlueCEInfoDataDir: unset
            GlueCEInfoDefaultSE: tbat05.nipne.ro
            GlueCEStateEstimatedResponseTime: 390175
            GlueCEStateFreeCPUs: 3
            GlueCEStateRunningJobs: 29
            GlueCEStateStatus: Production
            GlueCEStateTotalJobs: 750
            GlueCEStateWaitingJobs: 721
            GlueCEStateWorstResponseTime: 186883200
            GlueCEStateFreeJobSlots: 0
            GlueCEPolicyMaxCPUTime: 2880
            GlueCEPolicyMaxRunningJobs: 93
            GlueCEPolicyMaxTotalJobs: 0
            GlueCEPolicyMaxWallClockTime: 4320
            GlueCEPolicyPriority: 1
            GlueCEPolicyAssignedJobSlots: 0
            GlueCEAccessControlBaseRule: VO:atlas
            GlueForeignKey: GlueClusterUniqueID=tbat01.nipne.ro
            GlueInformationServiceURL: ldap://tbat01.nipne.ro:2135/mds-vo-name=local,o=grid
            GlueSchemaVersionMajor: 1
            GlueSchemaVersionMinor: 2
            
            # atlas, tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas, RO-02-NIPNE, grid
            dn: GlueVOViewLocalID=atlas,GlueCEUniqueID=tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas,mds-vo-name=RO-02-NIPNE,o=grid
            objectClass: GlueCETop
            objectClass: GlueVOView
            objectClass: GlueCEInfo
            objectClass: GlueCEState
            objectClass: GlueCEAccessControlBase
            objectClass: GlueCEPolicy
            objectClass: GlueKey
            objectClass: GlueSchemaVersion
            GlueVOViewLocalID: atlas
            GlueCEAccessControlBaseRule: VO:atlas
            GlueCEStateRunningJobs: 0
            GlueCEStateWaitingJobs: 0
            GlueCEStateTotalJobs: 0
            GlueCEStateFreeJobSlots: 15
            GlueCEStateEstimatedResponseTime: 0
            GlueCEStateWorstResponseTime: 0
            GlueCEInfoDefaultSE: tbat05.nipne.ro
            GlueCEInfoApplicationDir: /opt/exp_soft/atlas
            GlueCEInfoDataDir: unset
            GlueChunkKey: GlueCEUniqueID=tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas
            GlueSchemaVersionMajor: 1
            GlueSchemaVersionMinor: 2
            

            [campanas@lxb0709 BDII]$ python VOViewsConsist.py | grep '==>'
            ===> CE:atlasce.lnf.infn.it:2119/jobmanager-lcgpbs-atlas TOTrun:27     TOTVOrun:0    TOTwait:531     TOTVOwait:0
            ===> CE:atlasce.phys.sinica.edu.tw:2119/jobmanager-lcgcondor-atlas TOTrun:1     TOTVOrun:0    TOTwait:20     TOTVOwait:0
            ===> CE:atlasce01.na.infn.it:2119/jobmanager-lcgpbs-atlas TOTrun:18     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:atlasce01.na.infn.it:2119/jobmanager-lcgpbs-atlas_short TOTrun:6     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:bigmac-lcg-ce.physics.utoronto.ca:2119/jobmanager-lcgcondor-atlas TOTrun:12     TOTVOrun:0    TOTwait:3     TOTVOwait:4444
            ===> CE:cclcgceli02.in2p3.fr:2119/jobmanager-bqs-atlas_long TOTrun:4     TOTVOrun:2    TOTwait:0     TOTVOwait:0
            ===> CE:cclcgceli04.in2p3.fr:2119/jobmanager-bqs-atlas_long TOTrun:6     TOTVOrun:3    TOTwait:0     TOTVOwait:0
            ===> CE:cclcgceli05.in2p3.fr:2119/jobmanager-bqs-atlas_long TOTrun:14     TOTVOrun:7    TOTwait:2     TOTVOwait:1
            ===> CE:ce.bfg.uni-freiburg.de:2119/jobmanager-pbs-atlas TOTrun:18     TOTVOrun:16    TOTwait:0     TOTVOwait:0
            ===> CE:ce.epcc.ed.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:3     TOTVOrun:0    TOTwait:29     TOTVOwait:0
            ===> CE:ce.gina.sara.nl:2119/jobmanager-pbs-medium TOTrun:141     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:ce.gina.sara.nl:2119/jobmanager-pbs-short TOTrun:14     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce.hpc.csie.thu.edu.tw:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:3     TOTVOwait:2
            ===> CE:ce.keldysh.ru:2119/jobmanager-lcgpbs-atlas TOTrun:6     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce.phy.bg.ac.yu:2119/jobmanager-pbs-atlas TOTrun:14     TOTVOrun:0    TOTwait:16     TOTVOwait:2
            ===> CE:ce.ulakbim.gov.tr:2119/jobmanager-lcgpbs-atlas TOTrun:11     TOTVOrun:9    TOTwait:0     TOTVOwait:0
            ===> CE:ce00.hep.ph.ic.ac.uk:2119/jobmanager-sge-72hr TOTrun:177     TOTVOrun:3717    TOTwait:0     TOTVOwait:0
            ===> CE:ce001.grid.uni-sofia.bg:2119/jobmanager-lcgpbs-atlas TOTrun:6     TOTVOrun:5    TOTwait:0     TOTVOwait:0
            ===> CE:ce01-lcg.projects.cscs.ch:2119/jobmanager-lcgpbs-atlas TOTrun:8     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce01.afroditi.hellasgrid.gr:2119/jobmanager-pbs-atlas TOTrun:5     TOTVOrun:2    TOTwait:2     TOTVOwait:1
            ===> CE:ce01.ariagni.hellasgrid.gr:2119/jobmanager-lcgpbs-atlas TOTrun:6     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce01.athena.hellasgrid.gr:2119/jobmanager-pbs-atlas TOTrun:1     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce01.ific.uv.es:2119/jobmanager-pbs-atlas TOTrun:29     TOTVOrun:0    TOTwait:299     TOTVOwait:1
            ===> CE:ce01.ific.uv.es:2119/jobmanager-pbs-atlasL TOTrun:28     TOTVOrun:0    TOTwait:295     TOTVOwait:1
            ===> CE:ce01.kallisto.hellasgrid.gr:2119/jobmanager-pbs-atlas TOTrun:17     TOTVOrun:16    TOTwait:0     TOTVOwait:0
            ===> CE:ce01.marie.hellasgrid.gr:2119/jobmanager-pbs-atlas TOTrun:10     TOTVOrun:6    TOTwait:2     TOTVOwait:0
            ===> CE:ce02.athena.hellasgrid.gr:2119/blah-pbs-atlas TOTrun:1     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce02.lip.pt:2119/jobmanager-lcgsge-atlasgrid TOTrun:8     TOTVOrun:6    TOTwait:0     TOTVOwait:0
            ===> CE:ce02.marie.hellasgrid.gr:2119/jobmanager-pbs-atlas TOTrun:9     TOTVOrun:8    TOTwait:4     TOTVOwait:3
            ===> CE:ce03-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-atlas TOTrun:29     TOTVOrun:6    TOTwait:0     TOTVOwait:0
            ===> CE:ce04-lcg.cr.cnaf.infn.it:2119/blah-lsf-atlas TOTrun:10     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce05-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-slc4_debug TOTrun:881     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce05.pic.es:2119/jobmanager-lcgpbs-atlastest TOTrun:4     TOTVOrun:0    TOTwait:163     TOTVOwait:0
            ===> CE:ce05.pic.es:2119/jobmanager-lcgpbs-glong TOTrun:19     TOTVOrun:3    TOTwait:0     TOTVOwait:0
            ===> CE:ce05.pic.es:2119/jobmanager-lcgpbs-gshort TOTrun:9     TOTVOrun:8    TOTwait:4     TOTVOwait:4
            ===> CE:ce06-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-atlas TOTrun:9     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce06-lcg.cr.cnaf.infn.it:2119/jobmanager-lcglsf-debug TOTrun:9     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ce06.pic.es:2119/jobmanager-lcgpbs-glong TOTrun:19     TOTVOrun:3    TOTwait:0     TOTVOwait:0
            ===> CE:ce06.pic.es:2119/jobmanager-lcgpbs-gshort TOTrun:9     TOTVOrun:8    TOTwait:4     TOTVOwait:4
            ===> CE:ce07.pic.es:2119/jobmanager-lcgpbs-glong TOTrun:19     TOTVOrun:3    TOTwait:0     TOTVOwait:0
            ===> CE:ce07.pic.es:2119/jobmanager-lcgpbs-gshort TOTrun:9     TOTVOrun:8    TOTwait:4     TOTVOwait:4
            ===> CE:ce1-egee.srce.hr:2119/jobmanager-sge-dteam TOTrun:12     TOTVOrun:12    TOTwait:0     TOTVOwait:4444
            ===> CE:ce1.egee.fr.cgg.com:2119/jobmanager-lcgpbs-atlas TOTrun:10     TOTVOrun:0    TOTwait:2     TOTVOwait:0
            ===> CE:ce1.triumf.ca:2119/jobmanager-lcgpbs-atlas TOTrun:18     TOTVOrun:9    TOTwait:0     TOTVOwait:0
            ===> CE:ce101.cern.ch:2119/jobmanager-lcglsf-grid_atlas TOTrun:10     TOTVOrun:7    TOTwait:2     TOTVOwait:2
            ===> CE:ce102.cern.ch:2119/jobmanager-lcglsf-grid_atlas TOTrun:10     TOTVOrun:7    TOTwait:2     TOTVOwait:2
            ===> CE:ce106.cern.ch:2119/jobmanager-lcglsf-grid_atlas TOTrun:10     TOTVOrun:7    TOTwait:0     TOTVOwait:0
            ===> CE:ce107.cern.ch:2119/jobmanager-lcglsf-grid_2nh_atlas TOTrun:100     TOTVOrun:99    TOTwait:0     TOTVOwait:0
            ===> CE:ce107.cern.ch:2119/jobmanager-lcglsf-grid_atlas TOTrun:6     TOTVOrun:3    TOTwait:11     TOTVOwait:5
            ===> CE:ce108.cern.ch:2119/jobmanager-lcglsf-grid_atlas TOTrun:10     TOTVOrun:7    TOTwait:0     TOTVOwait:0
            ===> CE:ce123.cern.ch:2119/jobmanager-lcglsf-grid_atlas TOTrun:10     TOTVOrun:7    TOTwait:0     TOTVOwait:0
            ===> CE:ce2.triumf.ca:2119/jobmanager-lcgpbs-atlas TOTrun:18     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ceitep.itep.ru:2119/jobmanager-lcgpbs-atlas TOTrun:2     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:clrlcgce02.in2p3.fr:2119/jobmanager-lcgpbs-atlas TOTrun:12     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:cs-grid0.bgu.ac.il:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:31     TOTVOwait:0
            ===> CE:cs-grid1.bgu.ac.il:2119/blah-pbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:31     TOTVOwait:0
            ===> CE:dgce0.icepp.jp:2119/jobmanager-lcgpbs-atlas TOTrun:16     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:egee.irb.hr:2119/jobmanager-lcgpbs-grid TOTrun:16     TOTVOrun:15    TOTwait:0     TOTVOwait:0
            ===> CE:epgce1.ph.bham.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:6     TOTVOrun:0    TOTwait:457     TOTVOwait:0
            ===> CE:epgce1.ph.bham.ac.uk:2119/jobmanager-lcgpbs-short TOTrun:3     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:fal-pygrid-18.lancs.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:13     TOTVOrun:7    TOTwait:0     TOTVOwait:0
            ===> CE:fornax-ce.itwm.fhg.de:2119/jobmanager-lcgpbs-atlas TOTrun:8     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:g03n02.pdc.kth.se:2119/jobmanager-pbs-atlas TOTrun:1     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:gcn54.hep.physik.uni-siegen.de:2119/jobmanager-lcgpbs-atlas TOTrun:1     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:glite-ce-01.cnaf.infn.it:2119/blah-pbs-lcg TOTrun:3     TOTVOrun:3    TOTwait:3     TOTVOwait:0
            ===> CE:glite-ce01.marie.hellasgrid.gr:2119/blah-pbs-atlas TOTrun:10     TOTVOrun:6    TOTwait:2     TOTVOwait:0
            ===> CE:golias25.farm.particle.cz:2119/jobmanager-lcgpbs-lcgatlas TOTrun:2     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:grid-ce.physik.uni-wuppertal.de:2119/jobmanager-lcgpbs-dg_long TOTrun:0     TOTVOrun:0    TOTwait:2     TOTVOwait:0
            ===> CE:grid-ce.rzg.mpg.de:2119/jobmanager-sge-long TOTrun:1     TOTVOrun:0    TOTwait:0     TOTVOwait:26664
            ===> CE:grid-ce3.desy.de:2119/jobmanager-lcgpbs-default TOTrun:143     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:grid-ce3.desy.de:2119/jobmanager-lcgpbs-testing TOTrun:2     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:grid.uibk.ac.at:2119/jobmanager-lcgpbs-atlas TOTrun:6     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:grid0.fe.infn.it:2119/jobmanager-lcgpbs-lcg TOTrun:7     TOTVOrun:4    TOTwait:0     TOTVOwait:0
            ===> CE:grid001.fi.infn.it:2119/jobmanager-lcgpbs-atlas TOTrun:13     TOTVOrun:12    TOTwait:0     TOTVOwait:0
            ===> CE:grid002.ca.infn.it:2119/jobmanager-lcglsf-atlas TOTrun:0     TOTVOrun:0    TOTwait:36     TOTVOwait:10
            ===> CE:grid002.jet.efda.org:2119/jobmanager-lcgpbs-atlas TOTrun:3     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:grid003.roma2.infn.it:2119/jobmanager-lcgpbs-atlas TOTrun:30     TOTVOrun:29    TOTwait:0     TOTVOwait:0
            ===> CE:grid01.cu.edu.tr:2119/jobmanager-lcgpbs-atlas TOTrun:6     TOTVOrun:5    TOTwait:0     TOTVOwait:0
            ===> CE:grid109.kfki.hu:2119/jobmanager-lcgpbs-atlas TOTrun:4     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:gridba2.ba.infn.it:2119/jobmanager-lcgpbs-infinite TOTrun:51     TOTVOrun:0    TOTwait:287     TOTVOwait:0
            ===> CE:gridba2.ba.infn.it:2119/jobmanager-lcgpbs-long TOTrun:13     TOTVOrun:0    TOTwait:84     TOTVOwait:0
            ===> CE:gridba2.ba.infn.it:2119/jobmanager-lcgpbs-short TOTrun:0     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:gridce.ilc.cnr.it:2119/jobmanager-lcgpbs-atlas TOTrun:2     TOTVOrun:1    TOTwait:3     TOTVOwait:1
            ===> CE:gridce.pi.infn.it:2119/jobmanager-lcglsf-atlas TOTrun:3     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:gridit-ce-001.cnaf.infn.it:2119/jobmanager-lcgpbs-lcg TOTrun:3     TOTVOrun:3    TOTwait:3     TOTVOwait:0
            ===> CE:grim-ce.iucc.ac.il:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:hep-ce.cx1.hpc.ic.ac.uk:2119/jobmanager-pbs-heplt2 TOTrun:274     TOTVOrun:5206    TOTwait:353     TOTVOwait:6707
            ===> CE:heplnx206.pp.rl.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:19     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:heplnx206.pp.rl.ac.uk:2119/jobmanager-lcgpbs-short TOTrun:8     TOTVOrun:0    TOTwait:2     TOTVOwait:0
            ===> CE:heplnx207.pp.rl.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:19     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:heplnx207.pp.rl.ac.uk:2119/jobmanager-lcgpbs-short TOTrun:8     TOTVOrun:0    TOTwait:2     TOTVOwait:0
            ===> CE:i101.hpc2n.umu.se:2119/jobmanager-lcgpbs-ngrid TOTrun:20     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:ifaece01.pic.es:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:1304     TOTVOwait:0
            ===> CE:ifaece01.pic.es:2119/jobmanager-lcgpbs-atlas2 TOTrun:14     TOTVOrun:0    TOTwait:162     TOTVOwait:0
            ===> CE:ituce.grid.itu.edu.tr:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:lapp-ce01.in2p3.fr:2119/jobmanager-pbs-atlas TOTrun:42     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:lcg-ce.lps.umontreal.ca:2119/jobmanager-lcgpbs-atlas TOTrun:10     TOTVOrun:0    TOTwait:171     TOTVOwait:0
            ===> CE:lcg-ce.rcf.uvic.ca:2119/jobmanager-lcgpbs-general TOTrun:14     TOTVOrun:3    TOTwait:0     TOTVOwait:0
            ===> CE:lcg-ce0.ifh.de:2119/jobmanager-lcgpbs-atlas TOTrun:12     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:lcg-ce01.icepp.jp:2119/jobmanager-lcgpbs-atlas TOTrun:10     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:lcg-ce1.ifh.de:2119/jobmanager-lcgpbs-atlas_blade TOTrun:80     TOTVOrun:0    TOTwait:108     TOTVOwait:0
            ===> CE:lcg-lrz-ce.lrz-muenchen.de:2119/jobmanager-sge-atlas TOTrun:4     TOTVOrun:0    TOTwait:0     TOTVOwait:4444
            ===> CE:lcgce0.shef.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:1     TOTVOrun:0    TOTwait:2     TOTVOwait:0
            ===> CE:lcgce01.jinr.ru:2119/jobmanager-lcgpbs-atlas TOTrun:8     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:lcgce01.phy.bris.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:3     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:lcgce01.phy.bris.ac.uk:2119/jobmanager-lcgpbs-short TOTrun:13     TOTVOrun:0    TOTwait:24     TOTVOwait:0
            ===> CE:lcgrid.dnp.fmph.uniba.sk:2119/jobmanager-lcgpbs-atlas TOTrun:8     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:lgdce01.jinr.ru:2119/jobmanager-lcgpbs-atlas TOTrun:2     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-10min TOTrun:1     TOTVOrun:18    TOTwait:0     TOTVOwait:0
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-12hr TOTrun:40     TOTVOrun:720    TOTwait:1     TOTVOwait:18
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-1hr TOTrun:6     TOTVOrun:108    TOTwait:0     TOTVOwait:0
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-24hr TOTrun:70     TOTVOrun:1260    TOTwait:33     TOTVOwait:594
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-30min TOTrun:3     TOTVOrun:54    TOTwait:0     TOTVOwait:0
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-3hr TOTrun:14     TOTVOrun:252    TOTwait:1     TOTVOwait:18
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-6hr TOTrun:25     TOTVOrun:450    TOTwait:1     TOTVOwait:18
            ===> CE:mars-ce2.mars.lesc.doc.ic.ac.uk:2119/jobmanager-sge-72hr TOTrun:99     TOTVOrun:1782    TOTwait:33     TOTVOwait:594
            ===> CE:mu6.matrix.sara.nl:2119/jobmanager-pbs-medium TOTrun:0     TOTVOrun:0    TOTwait:117     TOTVOwait:0
            ===> CE:mu9.matrix.sara.nl:2119/jobmanager-pbs-batch TOTrun:354     TOTVOrun:0    TOTwait:529     TOTVOwait:0
            ===> CE:node001.grid.auth.gr:2119/jobmanager-pbs-atlas TOTrun:3     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:paugrid1.pamukkale.edu.tr:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:pc90.hep.ucl.ac.uk:2119/jobmanager-lcgpbs-lcgatlas TOTrun:13     TOTVOrun:0    TOTwait:32     TOTVOwait:0
            ===> CE:serv03.hep.phy.cam.ac.uk:2119/jobmanager-lcgcondor-atlas TOTrun:3     TOTVOrun:0    TOTwait:4     TOTVOwait:4444
            ===> CE:skurut17.cesnet.cz:2119/jobmanager-lcgpbs-atlas TOTrun:8     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:snowpatch.hpc.sfu.ca:2119/jobmanager-lcgpbs-atlas TOTrun:8     TOTVOrun:0    TOTwait:274     TOTVOwait:0
            ===> CE:spacin-ce1.dma.unina.it:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:2     TOTVOwait:1
            ===> CE:svr016.gla.scotgrid.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:13     TOTVOrun:0    TOTwait:0     TOTVOwait:0
            ===> CE:t2-ce-01.mi.infn.it:2119/jobmanager-lcgpbs-atlas TOTrun:13     TOTVOrun:8    TOTwait:0     TOTVOwait:0
            ===> CE:t2-ce-02.lnl.infn.it:2119/jobmanager-lcglsf-atlas TOTrun:3     TOTVOrun:0    TOTwait:7     TOTVOwait:1
            ===> CE:t2ce02.physics.ox.ac.uk:2119/jobmanager-lcgpbs-atlas TOTrun:7     TOTVOrun:0    TOTwait:1     TOTVOwait:0
            ===> CE:t2ce02.physics.ox.ac.uk:2119/jobmanager-lcgpbs-short TOTrun:0     TOTVOrun:0    TOTwait:2     TOTVOwait:0
            ===> CE:tbat01.nipne.ro:2119/jobmanager-lcgpbs-atlas TOTrun:30     TOTVOrun:0    TOTwait:659     TOTVOwait:0
            ===> CE:tbit01.nipne.ro:2119/jobmanager-lcgpbs-atlas TOTrun:20     TOTVOrun:19    TOTwait:0     TOTVOwait:0
            ===> CE:tbn20.nikhef.nl:2119/jobmanager-pbs-atlas TOTrun:5     TOTVOrun:3    TOTwait:0     TOTVOwait:0
            ===> CE:tbn20.nikhef.nl:2119/jobmanager-pbs-qlong TOTrun:43     TOTVOrun:41    TOTwait:0     TOTVOwait:0
            ===> CE:tbn20.nikhef.nl:2119/jobmanager-pbs-qshort TOTrun:7     TOTVOrun:6    TOTwait:0     TOTVOwait:0
            ===> CE:yildirim.grid.boun.edu.tr:2119/jobmanager-lcgpbs-atlas TOTrun:0     TOTVOrun:0    TOTwait:1     TOTVOwait:0
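            
            For reference, a minimal sketch of this kind of consistency check, in the spirit of VOViewsConsist.py above; it assumes the python-ldap module and the Glue 1.2 attributes shown in the LDIF extract, and the BDII endpoint is illustrative:
            
            #!/usr/bin/env python
            # Compare inclusive GlueCE numbers with the ATLAS VOView numbers
            # published in a top-level BDII and flag disagreements.
            import ldap
            
            BDII = "ldap://lcg-bdii.cern.ch:2170"   # assumed top-level BDII
            BASE = "mds-vo-name=local,o=grid"
            
            def as_int(entry, attr):
                return int(entry.get(attr, ["0"])[0])
            
            def main():
                conn = ldap.initialize(BDII)
                attrs = ["GlueCEStateRunningJobs", "GlueCEStateWaitingJobs"]
                # Inclusive (whole-CE) numbers, keyed by GlueCEUniqueID.
                ce = {}
                for dn, e in conn.search_s(BASE, ldap.SCOPE_SUBTREE,
                                           "(objectClass=GlueCE)",
                                           ["GlueCEUniqueID"] + attrs):
                    ce[e["GlueCEUniqueID"][0]] = (as_int(e, attrs[0]),
                                                  as_int(e, attrs[1]))
                # ATLAS VOViews, linked to their CE through GlueChunkKey.
                for dn, e in conn.search_s(BASE, ldap.SCOPE_SUBTREE,
                                           "(&(objectClass=GlueVOView)"
                                           "(GlueVOViewLocalID=atlas))",
                                           ["GlueChunkKey"] + attrs):
                    uid = e["GlueChunkKey"][0].split("=", 1)[1]
                    if uid not in ce:
                        continue
                    run, wait = ce[uid]
                    vrun, vwait = as_int(e, attrs[0]), as_int(e, attrs[1])
                    # On an ATLAS-only queue these should agree; print mismatches.
                    if (vrun, vwait) != (run, wait):
                        print("===> CE:%s TOTrun:%d TOTVOrun:%d TOTwait:%d TOTVOwait:%d"
                              % (uid, run, vrun, wait, vwait))
            
            if __name__ == "__main__":
                main()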
            
        • CMS service
          • No report.
          Speaker: Mr Daniele Bonacorsi (CNAF-INFN BOLOGNA, ITALY)
        • LHCb service
          • Issue at CERN still waiting to be answered. (Remedy ticket from Philippe)

            When we run jobs reading files that are on lhcbdata (SRM endpoint srm-durable-lhcb.cern.ch), we expect the files to actually be on the lhcbdata pool and thus immediately available for opening. However, it seems that when querying the stager for one of these files its status is STAGEIN. We would like to know whether this is the expected behaviour of the CERN durable SE, in which case we shall pass all our jobs through the DIRAC stager in order to cope with it. Our assumption was that most of the analysis jobs accessing TxD1 data would not need this and would not unduly overload the service. A sketch of such a stager query appears below.
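            For illustration, a minimal sketch of checking a file's stager status before opening it, assuming the CASTOR stager_qry command-line tool is available on the node; the output parsing and the file path are assumptions:
            
            #!/usr/bin/env python
            # Sketch: ask the CASTOR stager for a file's status (e.g. STAGED,
            # STAGEIN) before opening it. Output format handling is simplified.
            import subprocess
            
            def stager_status(castor_path):
                out = subprocess.Popen(["stager_qry", "-M", castor_path],
                                       stdout=subprocess.PIPE).communicate()[0]
                for line in out.decode("ascii", "replace").splitlines():
                    parts = line.split()
                    # Assumed format: <path> <request id> ... <STATUS>
                    if parts and parts[0] == castor_path:
                        return parts[-1]
                return None
            
            if __name__ == "__main__":
                path = "/castor/cern.ch/grid/lhcb/EXAMPLE.dst"   # hypothetical
                status = stager_status(path)
                if status != "STAGED":
                    print("not on disk (status %s); pre-stage via DIRAC first" % status)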

          Speaker: Dr Roberto Santinelli (CERN/IT/GD)
        • ALICE service
          • No report.
          Speaker: Dr Patricia Mendez Lorenzo (CERN IT/GD)
        • Service Coordination
          The CMS CSA07 service challenge Tier 0 reconstruction and Tier 1 data export phase should now start on Tuesday 25 September and run for 30 days. See https://twiki.cern.ch/twiki/bin/view/CMS/CSA07Plan
          Speaker: Harry Renshall / Jamie Shiers
      • 16:55 - 17:00
        OSG Items 5m
      • 17:00 - 17:05
        Review of action items 5m
        list of actions
      • 17:10 - 17:15
        AOB 5m