EGEE-WLCG ops
*******************
From last week:
- Revisited questions about the availability calculation (e.g. would it be possible to automatically exclude periods in which sites failed because of monitoring-related problems?).
- gLite 3.1 Update 16 was released to production today.
The update contains:
* A new index on the attribute GlueServiceEndpoint, used by lcg-utils
* UI: bug fixes to the JDL API (bulk submission) and the GFAL clients
* dCache SE: Glue 1.3 clean-ups and bug fixes
* DPM SE: version 1.6.7 (32-bit and 64-bit), fixing various configuration bugs, introducing new front-ends for Xroot and HTTP/HTTPS, and upgrading gSOAP from 2.6.2 to 2.7.6b
* GFAL version 1.10.8-1: creation of subdirectories with lcg-utils
* lcgCE: bug fixes
- From France: "A lesson learnt from CCRC08 is that some VOs don't mind the status published by a CE queue, so that they can wrongly submit on queue with a non-Production status."
- Detailed report from CMS on: Data certification, T0 status and reprocessing; Re-processing (jobs take too long at FNAL due to a dCache issue); MC production (for details see http://khomich.web.cern.ch/khomich/csa07Signal.html); Data Transfers and Integrity, DDT-2/LT status (55/56 T[01]-T1 crosslinks; only ASGC->RAL is missing). Documentation associated with these activities is improving (e.g. https://twiki.cern.ch/twiki/bin/view/CMS/DDTLinkExercising).
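The point raised by France can be illustrated with a minimal sketch of the check a VO submission tool might make before using a queue. It assumes the queue status is available as the GLUE 1.x attribute GlueCEStateStatus (with values such as Production, Draining, Queueing and Closed). The queue records, host names and the function name usable_queues are hypothetical and are not taken from any gLite tool.

```python
# Hypothetical queue records; in practice they would come from the information system.
QUEUES = [
    {"id": "ce01.example.org:2119/jobmanager-lcgpbs-cms", "GlueCEStateStatus": "Production"},
    {"id": "ce02.example.org:2119/jobmanager-lcgpbs-cms", "GlueCEStateStatus": "Draining"},
    {"id": "ce03.example.org:2119/jobmanager-lcgpbs-cms", "GlueCEStateStatus": "Closed"},
]


def usable_queues(queues):
    """Return only the queues that publish a Production status."""
    return [
        q for q in queues
        # A queue with no published status is treated as not usable.
        if q.get("GlueCEStateStatus", "").strip().lower() == "production"
    ]


if __name__ == "__main__":
    for queue in usable_queues(QUEUES):
        print(queue["id"])
```

Treating a missing status as "not usable" is a conservative choice made for this sketch; a real tool might prefer to warn instead.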
From this week:
- gLite 3.1.0 PPS Update 21 was released to PPS last Friday. No major issues found so far. The update contains:
* New VOMS-Admin server (2.0.13-1) and client (2.0.6-1): ACL support added to the command-line client; 9 bugs fixed (find yours at https://savannah.cern.ch/patch/index.php?1629)
* New vdt_globus_essentials to fix Globus bug 5771: mainly of interest for CERN-PROD, fixing hanging processes on submission of SAM RB and WMS tests
* New version of lcg-tags: warning messages suppressed
* DPM 1.6.7-4 (32-bit and 64-bit): new (fixed) behaviour in SRM v2 and SRM v2.2 when creating subdirectories with srmMkdir
* New glite-AMGA_oracle metapackage
The release notes are here:
https://twiki.cern.ch/twiki/bin/view/EGEE/PPSReleaseNotes_310_PPS_Update21
- Another discussion on the accuracy of SAM test results and how it affects the availability calculated for a site.
- Italy reported that SAM was not functioning correctly from 14th to 16th March.
- Russia reported a "Critical issue with unauthorized access to disk space via xrootd service. It does not depend on either DPM or dCache. Any person in the world who has an xrootd client can read and write everything. The single action which cannot be done is deleting files. This point completely violates 'The Grid Traceability and Logging Policy' (https://edms.cern.ch/document/428037/). ... this bug is absolutely critical from the security point of view of the WLCG/EGEE infrastructure and the xrootd service must be stopped until the bug is fixed.
See more: https://twiki.cern.ch/twiki/bin/view/LCG/DpmXrootAccess".
- The FTS transfer-url-copy update for space tokens will be in gLite 3.0 Update 41, due out shortly.
- CMS (again a long report submitted) are starting the discussion about T2 analysis associations.
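The availability questions raised in both weeks (excluding periods where a site "failed" only because monitoring itself was broken, such as the SAM problem Italy reported) can be sketched as follows. This is an illustration under stated assumptions, not the calculation actually used for SAM-based availability: availability is taken to be the fraction of test samples that passed, and samples inside an excluded interval are dropped from both numerator and denominator. The function name, the data layout, the one-sample-per-day data and the year 2008 are all hypothetical.

```python
from datetime import datetime


def availability(samples, excluded=()):
    """Fraction of test samples that passed, ignoring excluded intervals.

    samples  -- iterable of (timestamp, passed) pairs
    excluded -- iterable of (start, end) pairs; start inclusive, end exclusive
    Returns None when no samples remain after exclusion.
    """
    kept = [
        passed for timestamp, passed in samples
        if not any(start <= timestamp < end for start, end in excluded)
    ]
    if not kept:
        return None
    return sum(kept) / len(kept)


# Illustrative data only: one sample per day, failing during a monitoring outage.
samples = [
    (datetime(2008, 3, 12), True),
    (datetime(2008, 3, 13), True),
    (datetime(2008, 3, 14), False),
    (datetime(2008, 3, 15), False),
    (datetime(2008, 3, 16), False),
    (datetime(2008, 3, 17), True),
]
sam_outage = [(datetime(2008, 3, 14), datetime(2008, 3, 17))]

if __name__ == "__main__":
    print(availability(samples))              # all samples counted
    print(availability(samples, sam_outage))  # outage period excluded
```

Returning None rather than 0 or 1 when every sample is excluded keeps "no data" distinguishable from a genuine result, which matters if exclusions are applied automatically.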
ROC manager update
*************************
- An OLA between GGUS and TPMs has been proposed: http://edms.cern.ch/document/888089 . We need to comment!
- There is now a mandate for the EGEE-III Operations Automation Team: https://edms.cern.ch/document/901705 .
- A GGUS site support survey is about to start.
- Steve T. is proposing to use GlueSiteObject to increase the useful information published by sites. See http://goc.grid.sinica.edu.tw/gocwiki/How_to_publish_my_site_information
Ticket status
***************
See linked document.
Other
*******
- TPM commitments and how we should organise shifts
- Rolling COD activities out beyond the T1