announcements

--------------

 

* Improving CVMFS support

   * all software deployment team members are now watching the CVMFS Savannah squad: cmscompinfrasup-cvmfs

   * questions and support requests about CVMFS should go through this squad

   * documentation is currently maintained in two places, [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CernVMFS4cms]] and [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsCVMFS]]; the two pages are to be consolidated

   

* Improving Savannah ticket support

   * merged categories: "Data Operations" and "Facilities Operations" into "Facilities"

   * created a catch-all squad, cmscompinfrasup-comp_ops: sites are asked to reassign a ticket to this squad if they need a reply from central operations and do not have a specific squad they would like to communicate with

 

* Support for /store/himc and /store/hidata

   * All T1 sites except FNAL, and all T2 sites, are asked to support /store/hidata and /store/himc for production use

   * These top level directories are used to cleanly separate Heavy Ion collision files from proton-proton collision files

   * Not all sites allow directory creation in /store, which is why this is announced explicitly

 

* T2s using DPM: workarounds are needed to run CMSSW

   * Reminder that workarounds have to be re-applied and updated after upgrading middleware or OS

   * Documentation is kept up to date on TWiki: https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsT2DPMInstructions

   

developments

-------------

 

* ARCHIVE Castor Service Class at CERN ready: 10 TB per user

   * Documentation: [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsCERNArchive]]

   * users can store and recall files from tape but cannot run CMSSW jobs against the files

   

issues

-------

 

* IN2P3 and other dCache T2 sites are suffering from an SRM problem:

   * Known problem: the jGlobus issue with long proxies is fixed in jGlobus 2, but dCache uses jGlobus 1, so the frozen SRM has to be restarted frequently

   * IN2P3 currently uses a cron job that checks the SRM for responsiveness and restarts it when needed, and is waiting for a fix from the dCache developers (IN2P3 has not received any reply in a long time)

   * The best approach is to identify the problematic DN and stop the user from refreshing the proxy in a way that creates longer and longer proxies. This is difficult because the DN does not appear in the log files: the freeze happens before the log files are written
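
The cron-based workaround described above could be sketched roughly as follows. This is only an illustration, not the script IN2P3 runs: the host name, the port, and the restart command are hypothetical placeholders. A plain TCP connection test is also a simplification, since a frozen SRM may still accept connections, so a real check would probably need an SRM-level request.

```python
import socket
import subprocess


def is_responsive(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def watchdog_step(check, restart, retries=3):
    """Call check() up to `retries` times; call restart() only if every attempt fails.

    Returns True if a restart was triggered, False otherwise.
    """
    for _ in range(retries):
        if check():
            return False
    restart()
    return True


if __name__ == "__main__":
    # Hypothetical values: replace with the site's SRM endpoint and restart command.
    SRM_HOST = "srm.example.org"
    SRM_PORT = 8443
    RESTART_CMD = ["/usr/local/bin/restart-srm"]  # placeholder, not a real dCache command

    watchdog_step(
        check=lambda: is_responsive(SRM_HOST, SRM_PORT),
        restart=lambda: subprocess.run(RESTART_CMD, check=False),
    )
```

Run from cron every few minutes, a script of this kind restarts the service only after several consecutive failed checks, which avoids restarting on a single slow response.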

 

* Not all downtimes declared in OIM for OSG sites are propagated to the Dashboard, because of a change in the OIM feed

   * OSG sites can be incorrectly marked as "error" during a downtime

   * Follow up in SAV:134221

 

* Some instabilities with CERN CREAM CEs:

   * CEs: HammerCloud test jobs aborting on CERN CREAM CEs ce206 and ce208 with reason "the endpoint is blacklisted", IN PROGRESS GGUS:89124

   * CEs: low level job submission problem (< 5%), IN PROGRESS GGUS:88573


question

------------

 

   * UI recommendation: the gLite UI is still the recommended one but is no longer supported

      * How do we proceed?


   * Also related to the UI: to make progress with the UI migration towards EMI, we _urgently_ need a relocatable "TAR distribution". EMI has repeatedly lowered its priority because of limited person power. CERN IT and GridPP did, however, start an effort to provide one. Can we have an update on the status of a tar distribution?