EGEE-SA1-SWE Meeting (postponed)

Name: EGEE-SA1-SWE Meeting (postponed)
Start: 2005-05-02T12:00:00+02:00
End: 2005-05-02T14:00:00+02:00
Location: Fog VRVS Virtual Room (http://www.vrvs.org)

Monday 2 May 2005, 12:00 → 14:00 Europe/Zurich


Pacheco, Andreu

Description

*** This meeting has been postponed due to a Spanish holiday, the extension of the May 1st Labour Day ***

- 12:00 → 12:15
  
  Briefing of activities 15m VRVS
  
  Speaker: Pacheco, Andreu (IFAE/PIC)
  
- 12:05 → 12:15
  Operational status from sites 10m VRVS
  
  * Since 1 February 2005, every site must report its operational status by mail each Friday so that Gonzalo Merino can report it accurately to the EGEE Operations Meeting.
  * Since the people at CERN are asking the ROCs to write Monday's 11 report following a sort of template, it would be useful for the SWE site reports that site managers send to this list on Fridays to follow a similar template as well. I therefore suggest producing these reports from the template available at: http://services2.pic.org.es/devel/download/txt/SR-template.txt
  * Reports should be sent to: egee.swe.roc.contact@pic.es

  The reports from SWE are appended to the agenda page of the weekly operations meetings: http://agenda.cern.ch/displayLevel.php?fid=258
  
  Speaker: Merino, Gonzalo (PIC)
  
  - CESGA-EGEE 15m
    
    Speaker: Fontan, Javier
  - CIEMAT-LCG2 15m
    
    Weekly Operations Status Report: CIEMAT-LCG2
    Reporting Period: 25/04/05 to 29/04/05

    Availability of Core Services Hosted in the Site
    ================================================
    There are no core services running at CIEMAT at this moment.

    Non-Functional Time
    ===================
    None

    Scheduled Down Time
    ===================
    * Scheduled Maintenance During the Reporting Period: None
    * Scheduled Maintenance During the Next Reporting Period: None

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    We have corrected a misconfiguration in Apel (our CE was not publishing user name information).

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded? Completed, now Scientific Linux 3.0.4
    By when will the WNs in the Site be upgraded? Completed, now Scientific Linux 3.0.4
    
    Speaker: Calonge, Javier
  - CNB-LCG2 15m
    
    Speaker: Merino, Angel
  - IFCA-LCG2 15m
    
    Speaker: Cano, Dani
  - IFIC-LCG2 15m
    
    Speaker: Sanchez, Javier
  - INTA-CAB 15m
    
    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    Weekly Operations Status Report: INTA-CAB
    Reporting Period: 25/04/05 to 29/04/05

    Availability of Core Services Hosted in the Site
    ================================================
    No core services hosted in this site.

    Non-Functional Time
    ===================
    0 days.

    Scheduled Down Time
    ===================
    * Scheduled Maintenance During the Reporting Period: 0 days.
    * Scheduled Maintenance During the Next Reporting Period: 0 days.

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    We have resolved problems with R-GMA and Apel.

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    Already done.
    
    Speaker: Huedo, Eduardo
  - LIP-LCG2 15m
    
    Weekly Operations Status Report: LIP
    Reporting Period: 23/04/05 to 29/04/05

    Availability of Core Services Hosted in the Site
    ================================================
    RB: rb02.lip.pt
    LCG-BDII: ii02.lip.pt
    RLS: se01.lip.pt
    LFC: se01.lip.pt

    Non-Functional Time
    ===================
    [CT] [JS] [JL] Failed 2005-04-26/28 [OK] 5.5 days

    Scheduled Down Time
    ===================

    Major Operational Issues Encountered During the Reporting Period
    ================================================================

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    All nodes in SLC304
    
    Speaker: David, Mario
  - IFAE 15m
    
    Speaker: Borrego, Carlos
  - PIC 15m
    
    Speaker: Borrego, Carlos
  - UAM-LCG2 15m
    
    Speaker: Pardo Navarro, Juan Jose
  - UB-LCG2 15m
    
    Hello Gonzalo,

    Weekly Operations Status Report: UB
    Reporting Period: 25/04/2005 to 01/05/2005

    Availability of Core Services Hosted in the Site
    ================================================
    No core services in the site.

    Non-Functional Time
    ===================

    Scheduled Down Time
    ===================
    None

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    None

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded?
    By when will the WNs in the Site be upgraded?
    (together with update to 2.4.0) Start during this week.
    
    Speaker: Graciani, Ricardo
  - UPV-GRyCAP 15m
    
    Weekly Operations Status Report: [ UPV-GRyCAP ]
    Reporting Period: [ 25/04/05 ] to [ 29/04/05 ]

    Availability of Core Services Hosted in the Site
    ================================================
    RB, BDII: apis.dsic.upv.es [OK]

    Non-Functional Time
    ===================
    - Total days unavailable: 5 days
      25-04-2005 & 27-04-2005 -> Network Problems.
      28-04-2005 -> Problem with GFAL infosys
      29-04-2005 -> Test not made (at 14:00).

    Scheduled Down Time
    ===================
    * Scheduled Maintenance During the Reporting Period
      - Total days unavailable 0
    * Scheduled Maintenance During the Next Reporting Period
      - Estimated number of days for which the site will be unavailable 0 days

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    Last week we had this problem: problems with the MDS service. The information published by the GRIS of the CE in the GIIS service disappeared intermittently. We have now discovered that the real problem was in the network traffic. The computing center of our university detected excessive traffic on our CE machine and automatically reduced its traffic (50% of the transmitted packets were discarded), from 20-04-2005 to 27-04-2005.

    On the night of 29-04-2005 we had a problem with the air conditioning system, and the CE machine was automatically switched off.

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded? Already done (SL 3.0.4 & FC2)
    By when will the WNs in the Site be upgraded? Already done (FC2)
    
    Speaker: Caballer, Miguel
  - USC-LCG2 15m
    
    Hi Gonzalo,

    Weekly Operations Status Report: USC
    Reporting Period: 25/04/2005 to 02/05/2005

    Availability of Core Services Hosted in the Site
    ================================================
    None Deployed

    Non-Functional Time
    ===================
    None

    Scheduled Down Time
    ===================
    From Wed 20 Apr 2005 20:00 to Fri 22 Apr 2005 09:00

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    User accounting information was not being published due to a misconfiguration of Apel. This problem seems solved.

    SFT tests on 02/05/2005 report Critical Status for the site. The worker node where the test job ran does not belong to the USC cluster. A notification has been sent to Piotr so he can check if there is a configuration problem with the SFT tests.

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded? Already there
    By when will the WNs in the Site be upgraded? Already there
    
    Speaker: Sanchez, Manuel
  - BIFI-LCG 15m
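
The weekly site reports above share one layout: a short header followed by sections whose titles are underlined with a row of "=" characters. As a minimal sketch, assuming that layout holds for every report (the template file itself is not reproduced here), a report could be split into its sections as follows. The function name `parse_site_report` and the `header` key are illustrative choices, not part of any SWE tool; the sample text is an excerpt of the INTA-CAB report.

```python
# Sketch only: assumes section titles are underlined with a row of '='.
SAMPLE_REPORT = """\
Weekly Operations Status Report: INTA-CAB
Reporting Period: 25/04/05 to 29/04/05

Non-Functional Time
===================
0 days.

Major Operational Issues Encountered During the Reporting Period
================================================================
We have resolved problems with R-GMA and Apel.
"""


def parse_site_report(text: str) -> dict[str, str]:
    """Return {section title: section body}; text before the first title goes under 'header'."""
    lines = text.splitlines()
    sections: dict[str, str] = {}
    current = "header"  # hypothetical key for the lines before the first section
    body: list[str] = []
    i = 0
    while i < len(lines):
        title = lines[i].strip()
        underline = lines[i + 1].strip() if i + 1 < len(lines) else ""
        # A title is a non-empty line followed by a line made only of '='.
        if title and underline and set(underline) == {"="}:
            sections[current] = "\n".join(body).strip()
            current, body = title, []
            i += 2
            continue
        body.append(lines[i])
        i += 1
    sections[current] = "\n".join(body).strip()
    return sections


if __name__ == "__main__":
    for title, content in parse_site_report(SAMPLE_REPORT).items():
        print(f"[{title}] {content!r}")
```

Keeping the section titles as dictionary keys makes it easy to check that a report contains every section the template asks for before it is forwarded.
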
- 12:10 → 12:15
  
  APEL Status Report 5m VRVS
  
  APEL STATUS REPORT

  == Period ==
  From April 23 to April 29

  == Summary ==
  We want to emphasize that several sites that have updated to LCG2.4.0 are having problems with their accounting data: they are not publishing it. These sites have to update Apel to version 3.4.43 after the installation of LCG2.4.0 (we sent an e-mail about this issue during the last weeks) and modify the Apel startup script following the instructions in the e-mail that we sent this week. They also have to check the Apel configuration file (/opt/edg/etc/edg-rgma/apel.xml) because this file will be rewritten by the installation procedure.

  == Status of each SWE site ==

  Site: CESGA
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: None
  Publishing User Name Information: Yes
  Actions: None.
  Other Comments: They have updated to LCG2.4.0, and they have also updated to Apel 3.4.43 and configured it properly.

  Site: CIEMAT
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: Since 16/04/05
  Publishing User Name Information: NO. April 15 have not published the user name.
  Actions: Check the configuration of Apel: they are reprocessing their data and they are not publishing the user name.
  Other Comments: None.

  Site: ifae
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 09/04/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess
  Other Comments: None

  Site: IFCA
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 15/01/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. Reprocess
  Other Comments: None

  Site: IFIC
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 21/04/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. They have installed the new version of Apel, so they should publish the old data the next time that Apel runs.
  Other Comments: They have updated to LCG2.4.0, so they should update to Apel 3.4.43 and check the configuration of Apel before running it.

  Site: INTA
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: None
  Publishing User Name Information: Yes
  Actions: None.
  Other Comments: They have updated to LCG2.4.0, and they have also updated to Apel 3.4.43 and configured it properly.

  Site: LIP
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 09/04/05
  Publishing User Name Information: Yes
  Actions: None. They have installed the new version of Apel, so they should publish the old data the next time that Apel runs.
  Other Comments: They have updated to LCG2.4.0, and they have also updated to Apel 3.4.43. They should check the configuration of Apel before running it.

  Site: pic
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 13/04/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess.
  Other Comments: They have updated to LCG2.4.0, so they should update to Apel 3.4.43 and check the configuration of Apel before running it.

  Site: PIC-LCG2
  Solved Problems of Last Report: Since 02/03/05 they have been publishing user information, but they have not published the lost user information. They don't reprocess.
  Missing Accounting Entries: Since 13/03/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess
  Other Comments: At this moment there are three site names for PIC: this, pic and ifae. This site name was changed by ifae in the monitoring tools, so will you publish more data with this name? Could we stop the checking of this site name?

  Site: UAM
  Solved Problems of Last Report: NO
  Missing Accounting Entries: 29
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess
  Other Comments: None

  Site: UB
  Solved Problems of Last Report: NO
  Missing Accounting Entries: None
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report.
  Other Comments: None

  Site: UPV
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: 29
  Publishing User Name Information: Yes
  Actions: None. They have installed the new version of Apel, so they should publish the old data the next time that Apel runs.
  Other Comments: None

  Site: USC
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 21/04/05
  Publishing User Name Information: Yes
  Actions: None. They have installed the new version of Apel, so they should publish the old data the next time that Apel runs.
  Other Comments: They have updated to LCG2.4.0, so they should update to Apel 3.4.43 and check the configuration of Apel before running it.

  == Accounting Support ==
  Accounting Web Page: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/
  Accounting FAQ: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/faq.html
  Accounting Statistics: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/reports/
  CPU Normalization: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/normalization.html
  Accounting Guides: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/guides/
  Contact e-mail: egee-admin@cesga.es
  
  Speaker: Rey Mayo, Pablo (CESGA)
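
The per-site entries of the APEL status report are "Field: value" records that always start with a "Site:" field. As a rough sketch, assuming one field per line (an assumption about how the report is laid out, not something CESGA documents here), the entries can be collected into dictionaries and filtered for sites that still have missing accounting entries. The function names and the abbreviated sample, which reuses the CESGA, IFCA and USC values from the report, are illustrative only.

```python
# Sketch only: field names are taken from the report; everything else is hypothetical.
FIELDS = (
    "Site",
    "Solved Problems of Last Report",
    "Missing Accounting Entries",
    "Publishing User Name Information",
    "Actions",
    "Other Comments",
)

SAMPLE_STATUS = """\
Site: CESGA
Solved Problems of Last Report: Yes
Missing Accounting Entries: None
Publishing User Name Information: Yes
Actions: None.

Site: IFCA
Solved Problems of Last Report: NO
Missing Accounting Entries: Since 15/01/05
Publishing User Name Information: Yes
Actions: Solved Problems of Last Report. Reprocess

Site: USC
Solved Problems of Last Report: NO
Missing Accounting Entries: Since 21/04/05
Publishing User Name Information: Yes
Actions: None.
"""


def parse_site_entries(text: str) -> list[dict[str, str]]:
    """Collect one dict per 'Site:' record; unknown lines extend the previous field."""
    entries: list[dict[str, str]] = []
    current = None
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:
            if key == "Site":
                current = {}
                entries.append(current)
            if current is not None:
                current[key] = value.strip()
                last_key = key
        elif current is not None and last_key is not None:
            current[last_key] += " " + line  # continuation of a wrapped value
    return entries


def sites_missing_entries(entries: list[dict[str, str]]) -> list[str]:
    """Names of sites whose 'Missing Accounting Entries' field is not 'None'."""
    return [
        entry["Site"]
        for entry in entries
        if entry.get("Missing Accounting Entries", "None").rstrip(".").lower() != "none"
    ]


if __name__ == "__main__":
    print(sites_missing_entries(parse_site_entries(SAMPLE_STATUS)))
```

A list like this could serve as a starting point for the follow-up actions discussed at the meeting, although deciding what each site has to do still requires reading the "Actions" and "Other Comments" fields.
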
- 12:15 → 12:30
  Action lists 15m VRVS
  
  Speaker: Pacheco, Andreu (IFAE)
  - 5.1.17.4 Andreu Pacheco to update the SWE Execution Plan (PENDING) 15m
    
    Speaker: Pacheco, Andres (PIC)
  - 5.1.17.7 CIEMAT, CNB, IFIC, IFAE, UAM and USC to upgrade worker nodes to Scientific Linux 3.0.2 (PENDING) 15m
    
    Speaker: Merino, Gonzalo (PIC)
  - 5.1.17.8 CNB to upgrade Apel version 3.4.39-1 (PENDING) 15m
    
    Speaker: Lopez Cacheiro, Javier (CESGA)
  - 5.3.7.1 Miguel Cardenas to check if forensic analysis after intrusion is mandatory and who can do it (NEW) 15m
    
    Speaker: Cardenas, Miguel (CIEMAT)
  - 5.3.7.2 Farida Fassi and Mohammed Kaci to setup the initial user support web site (OPEN) 15m
    
    Speaker: Fassi, Farida (IFIC)
  - 5.3.7.4 Mario to deploy gmdat monitoring tools in SWE (OPEN) 15m
    
    Speaker: David, Mario (LIP)
- 12:20 → 12:35
  GGUS Open Tickets for SWE Federation 15m VRVS
  
  Speaker: David, Mario (PIC)
  - 1909 - CNB - GlueSARoot lost for biomed VO since March 21st 15m
    
    Read detailed description of the problem:

    21.Mar. 2005, 16:25(UTC) by wsuser

    I have had to reinstall the CE (using LCFGng and RH7.3) because I had a hard disk problem last week, and now I'm having problems with GlueSARoot:

    [root@mallarme root]# ldapsearch -x -H ldap://mallarme.cnb.uam.es:2135 -b mds-vo-name=cnblcg2,o=grid | grep GlueSARoot
    dn: GlueSARoot=lhcb:lhcb,GlueSEUniqueID=baudelaire.cnb.uam.es,Mds-Vo-name=cnbl
    GlueSARoot: lhcb:lhcb
    dn: GlueSARoot=dteam:dteam,GlueSEUniqueID=baudelaire.cnb.uam.es,Mds-Vo-name=cn
    GlueSARoot: dteam:dteam

    Why does the CE lose the GlueSARoot for biomed?!?! I have created (as the manual says) the biomed_cfg.conf and added the .h file in vos-cfg.h:

    (...)
    /* Add the include file for the new vo here like the one commented out,
       which is created by the tools "addvo.py" with the example of input
       file esr.conf like:
       ./addvo.py -i esr.conf > vo-esr-cfg.h
       Modify esr.conf to configure your new VO */
    /* #include "vo-esr-cfg.h" */
    #include "biomed_cfg.h"
    (...)

    Read latest entry in diary of steps

    It is an installation problem and not a VO-specific one. Therefore, I reassign it to the corresponding ROC, who should check with the relevant CIC if necessary. Anyway, here are some hints on the problem: the GlueSARoot is probably being rejected because this entry contains invalid LDIF. If the entry is there when the information provider is run and not in the GRIS, then this is usually the problem. For troubleshooting see http://lfield.home.cern.ch/lfield/trouble.html
  - 1948 - CNB - lack of closeSE for biomed for 2 nodes since March 24th 15m
    
    Read detailed description of the problem:

    24.Mar. 2005, 08:35(UTC) by wsuser

    The node001.grid.auth.gr and mallarme.cnb CEs don't have a closeSE for biomed. Do I need to contact the node administrators directly? I contacted other node administrators directly about the same problem before (cgg, sinica...).

    Read latest entry in diary of steps

    We assign this ticket to the corresponding ROC for the CE mallarme.cnb.uam.es. Nevertheless, we must say that the site administrator has already been informed about this by the Biomed VO manager.

    =====================================
    From: Yannick Legre
    To: Angel Merino
    Cc: Biomed-grid-support@cern.ch
    Subject: lack of close SE in configuration file
    ----------------------------------------
    Dear Angel,

    We received a ticket from GGUS which says that you have a misconfiguration in your site. No close SE seems to be defined in your CE configuration... Could you please provide one and let me know asap when it is done? :-)

    Thank you in advance,
    Best Regards,
    Yannick ^_^
  - 1979 - IFCA - No published end point for egeerb but it is present in GOCDB since March 29th 15m
    
    Read detailed description of the problem:

    29.Mar. 2005, 10:53(UTC) by cicuser
    ----------Affected site: IFCA-LCG2----------
    No published end point for egeerb but it is present in GOCDB.

    Read latest entry in diary of steps

    [29/03/2005 13:39] - steve traylen
    [07/04/2005 16:52] - philippa strange extended expiry date due to schedule downtime
    [13/04/2005 16:44] - Frederic Schaer Downtime until 14
    [15/04/2005 11:11] - Frederic Schaer giis seems down - can not check
    [26/04/2005 17:19] - Alexander Berezhnoy (dteam)
  - 1981 - IFAE/PIC - closeSE for ifae and pic since March 29th 15m
    
    Read detailed description of the problem:

    29.Mar. 2005, 11:09(UTC) by wsuser

    The biomed closeSEs for these 2 nodes are castor SEs. Is it possible to have a classical closeSE to improve the performance of data access? Thank you

    ce01.pic.es:2119/jobmanager-torque-biomed        castorgrid.pic.es
    lcgce02.ifae.es:2119/jobmanager-lcgpbs-biomed    castorgrid.ifae.es

    Read latest entry in diary of steps

    Those sites don't advertise any classical SE, and so they don't have any to be set as close SE. They could add some classical SEs supporting the Biomed VO, but this should be negotiated by the Biomed VO and the sites directly. For this reason I reassign this to the responsible ROC, which should pass the request to the sites (and maybe get in touch with the user and any Biomed representative).

    Nevertheless, I must say that classical SEs are meant to disappear in the not too distant future (and be replaced by DPMs). Therefore, it is questionable whether the installation of new classical SEs makes sense, even as a temporary solution. In any case, this must be discussed with the sites.
  - 2037 - USC - Unspecified gridmanager error at USC-LCG2 (SouthWesternEurope) since April 5th 15m
    
    Read detailed description of the problem:

    05.Apr. 2005, 09:06(UTC) by wsuser

    Too many jobs failed today at lcg-ce.usc.cesga.es with the error:

    Got a job held event, reason: Unspecified gridmanager error

    The error means that there is some problem with the submission request in the LBS. Many things could be behind this message:
    - misleading information is published in the Information System with respect to the real content of the local batch system (for instance, all VOs are allowed to run in some queue which is dedicated to just one well-defined VO)
    - some service is down on the CE or wrongly configured

    Here are the culprit job IDs:

    https://lxn1182.cern.ch:9000/HGfR_cK2Qr7kcbcU9oehtA
    JM-Contact: https://lcg-ce.usc.cesga.es:20013/8267/1112653091/ (submitted to Globus on 04/05 00:18:18)

    https://lxn1182.cern.ch:9000/oRqMb6JsPOTp_5H1zDmlwQ
    JM-Contact: https://lcg-ce.usc.cesga.es:20004/12255/1112654112/ (submitted to Globus on 04/05 00:35:21)

    Cheers, R.
  - 2176 - Several SWE Sites / BIOMED - problem with copy on SE since April 18th 15m
    
    Read detailed description of the problem:

    18.Apr. 2005, 08:10(UTC) by wsuser

    Hi,

    I'm a biomgrid user for biomed. I have a problem with the replication of a tarball on SEs. First, I copy it to the closest SE (which is not a problem) and then I replicate it to each SE. But for the following SEs, I get this result:

    File replication on SE castorgrid.ifae.es:
    the server sent an error response: 553 553 /castor/ifae.es/lcg/biomed/generated/2005-04-18/filec82e8eff-96eb-48d8-abdc-b34424079627: No space left on device.

    File replication on SE castorsrm.ifae.es:
    SOAP FAULT: SOAP-ENV:Client "initFileStatuses: cannot create path /castor/ifae.es/lcg/biomed/generated/2005-04-18/file6b475501-013a-4b43-9909-32ef273bbc7e: Permission denied"

    File replication on SE castorgrid.ifae.es:
    the server sent an error response: 553 553 /castor/ifae.es/lcg/biomed/generated/2005-04-18/file643cac9e-393d-4b3e-b194-aec94cd97368: No space left on device.

    File replication on SE egeese.ifca.org.es:
    lcg_rep: Invalid argument

    File replication on SE scaise.scai.fraunhofer.de:
    the server sent an error response: 535 535-FTPD GSSAPI error: GSS Major Status: Authentication Failed
    535-FTPD GSSAPI error: GSS Minor Status Error Chain:
    535-FTPD GSSAPI error:
    535-FTPD GSSAPI error: accept_sec_context.c:170: gss_accept_sec_context: SSLv3 handshake problems
    535-FTPD GSSAPI error: globus_i_gsi_gss_utils.c:881: globus_i_gsi_gss_handshake: Unable to verify remote side's credentials
    535-FTPD GSSAPI error: globus_i_gsi_gss_utils.c:854: globus_i_gsi_gss_handshake: SSLv3 handshake problems: Couldn't do ssl handshake
    535-FTPD GSSAPI error: OpenSSL Error: s3_srvr.c:1816: in library: SSL routines, function SSL3_GET_CLIENT_CERTIFICATE: no certificate returned
    535-FTPD GSSAPI error: globus_gsi_callback.c:351: globus_i_gsi_callback_handshake_callback: Could not verify credential
    535-FTPD GSSAPI error: globus_gsi_callback.c:490: globus_i_gsi_callback_cred_verify: Could not verify credential
    535-FTPD GSSAPI error: globus_gsi_callback.c:850: globus_i_gsi_callback_check_signing_policy: Error with signing policy
    535-FTPD GSSAPI error: globus_gsi_callback.c:1058: globus_i_gsi_callback_check_gaa_auth: Error in OLD GAA code: CA policy violation:
    535 FTPD GSSAPI error: accepting context

    File replication on SE testbed002.grid.ici.ro:
    the server sent an error response: 550 550 /storage/biomed/generated/2005-04-18: Permission denied.
    the server sent an error response: 553 553 /storage/biomed/generated/2005-04-18/file851156d0-04bd-4ee9-a804-e80c8a903b1f: No such file or directory.

    File replication on SE zeus03.cyf-kr.edu.pl:
    the server sent an error response: 530 530 Login incorrect.

    File replication on SE lcg00123.grid.sinica.edu.tw:
    the server sent an error response: 550 550 /flatfiles/SE00/biomed: Permission denied.
    the server sent an error response: 553 553 Could not determine cwdir: No such file or directory.

    File replication on SE castorsrm.ifae.es:
    SOAP FAULT: SOAP-ENV:Client "initFileStatuses: cannot create path /castor/ifae.es/lcg/biomed/generated/2005-04-18/file5e1f0fc3-2d05-4db0-8241-1d7461279c92: Permission denied"

    Could you please explain what I did wrong? I need to replicate this tarball on the maximum number of SEs.

    Thank you very much,
    Matthieu Reichstadt

    22.Apr. 2005, 09:06(UTC) by wsuser

    I used the following: lcg-rep --vo lfn: -d

    Read latest entry in diary of steps

    I used the following: lcg-rep --vo lfn: -d
  - 2238 - UAM - lcg-cr to grid002.ft.uam.es hangs since April 21st 15m
    
    Read detailed description of the problem:

    21.Apr. 2005, 07:50(UTC) by wsuser

    The lcg-cr command does not return and blocks many WNs used for atlas production:

    /opt/lcg/bin/lcg-cr --verbose --vo atlas file:/tmp/rome.test -l lfn:chudoba.test.100 -d grid002.ft.uam.es
    Using grid catalog type: edg

    The same command to some other SE works well. grid002.ft.uam.es is in the Atlas BDII.
  - 2242 - UPV-GRyCAP - bad BDII configuration since April 21st 15m
    
    Read detailed description of the problem:

    21.Apr. 2005, 14:56(UTC) by cicuser ----------Affected site: UPV-GRyCAP----------
    21.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------
    22.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------
    23.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------
    24.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------
    25.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------
    26.Apr. 2005, 16:08(UTC) by cicuser ----------Affected site: UPV-GRyCAP----------
    26.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------
    27.Apr. 2005, 08:44(UTC) by cicuser ----------Affected site: UPV-GRyCAP----------
    27.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------
    28.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: UPV-GRyCAP----------

    Read latest entry in diary of steps

    [26/04/2005 18:06] - Gregory Shpiz (dteam) GIIS problems not fixed. CT failed
    [27/04/2005 10:42] - Victor Edneral (dteam) Replication problems
  - 2250 - UPV-GRyCAP - GIIS Problem since April 22nd 15m
    
    Read detailed description of the problem:

    22.Apr. 2005, 09:12(UTC) by wsuser

    The information published by the GRIS of our CE (ramses.dsic.upv.es) in the GIIS service (on the same machine) disappears intermittently, although the GRIS appears to work correctly.

    Read latest entry in diary of steps

    The problem has been identified. It is a network traffic problem with this machine in the University network.
  - 2258 - PIC - Got a job held event, reason: Unspecified gridmanager error since April 22nd 15m
    
    Read detailed description of the problem:

    22.Apr. 2005, 13:59(UTC) by wsuser

    Hi,

    Affected CE: ce01.pic.es:2119/jobmanager-lcgpbs-biomed

    My jobs cannot be executed on this CE. I get the following error message before the resubmission of the job:

    Got a job held event, reason: Unspecified gridmanager error

    Perhaps it is a problem with the CA RPMs not being up to date, but this is not the usual message for that problem.

    Best regards.
  - 2273 - UPV-GRyCAP - Bad configuration of the /opt/edg/var/info/ area in the CE since April 25th 15m
    
    Read detailed description of the problem:

    25.Apr. 2005, 09:07(UTC) by wsuser

    Running

    edg-gridftp-ls -v gsiftp://ramses.dsic.upv.es/opt/edg/var/info/dteam

    shows:

    rw-r--r-- 1 root 0 Mar 14 11:33 dteam.list

    Exactly the same for the rest of the VOs. This is badly configured. The first time the tool lcg-ManageVOTag is used, it creates this file with write permissions for the sgm persons. So after the configuration of the SE, these *.list files should not exist. As a result, experiments that want to install software in this site will not be able to publish the corresponding tag in the Information System, and the subsequent production will not arrive at this site.
  - 2279 - UB - Bad configuration of the /opt/edg/var/info/ area in the CE since April 25th 15m
    
    Read detailed description of the problem:

    25.Apr. 2005, 09:46(UTC) by wsuser

    Running

    edg-gridftp-ls -v gsiftp://lcg-ce.ecm.ub.es/opt/edg/var/info/dteam

    shows:

    -rw-r--r-- 1 root root 0 Mar 7 19:25 dteam.list

    Exactly the same for the rest of the VOs. This is badly configured. The first time the tool lcg-ManageVOTag is used, it creates this file with write permissions for the sgm persons. So after the configuration of the SE, these *.list files should not exist. As a result, experiments that want to install software in this site will not be able to publish the corresponding tag in the Information System, and the subsequent production will not arrive at this site.
  - 2280 - CESGA - Bad configuration of the /opt/edg/var/info/ area in the CE 15m
    
    Read detailed description of the problem:

    25.Apr. 2005, 09:49(UTC) by wsuser

    Running

    edg-gridftp-ls -v gsiftp://ce2.egee.cesga.es/opt/edg/var/info/dteam

    shows:

    -rw-r--r-- 1 dteam 0 Apr 21 11:36 dteam.list

    Exactly the same for the rest of the VOs. This is badly configured. The first time the tool lcg-ManageVOTag is used, it creates this file with write permissions for the sgm persons. So after the configuration of the SE, these *.list files should not exist. As a result, experiments that want to install software in this site will not be able to publish the corresponding tag in the Information System, and the subsequent production will not arrive at this site.
  - 2292 - IFCA - GIIS is down since 25 April 2005 15m
    
    25.Apr. 2005, 13:34(UTC) by cicuser ----------Affected site: IFCA-LCG2----------
    25.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: IFCA-LCG2----------
    26.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: IFCA-LCG2----------
    27.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: IFCA-LCG2----------
    28.Apr. 2005, 10:40(UTC) by cicuser ----------Affected site: IFCA-LCG2----------
    28.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: IFCA-LCG2----------
    29.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: IFCA-LCG2----------
    30.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: IFCA-LCG2----------
    01.May. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: IFCA-LCG2----------
  - 2318 - CNB - problems upgrading SE using RH7.3 and LCFGng since 27 April 15m
    
    Read detailed description of the problem:

    27.Apr. 2005, 10:36(UTC) by wsuser

    This is the failure that I get when I try to upgrade my SE (LCG2-4-0):

    [FAIL] updaterpms: rpmRunTransactions failed
    execution of %preun scriptlet from tomcat4-4.1.18-full.1jpp failed, exit status 6
    [WARNING] updaterpms: updaterpms failed

    Any idea?

    Read latest entry in diary of steps

    This bug is not application specific and should be reassigned to SA1 support... probably ROC_SW
  - 2325 - IFAE/PIC - NFS mount exp software area not visible at IFAE/PIC since April 28th 15m
    
    Read detailed description of the problem:

    28.Apr. 2005, 01:14(UTC) by wsuser

    Misconfigured WN: td115.ifae.es
    Problem: ATLAS s/w not found at /nfs/sw/atlas
    S/W version: 10.0.1
    Setup script: /nfs/sw/atlas/software/10.0.1/setup.sh

    The same applies to several other WNs at pic/ifae.
  - 2332 - PIC - Job list match fails since April 28th 15m
    
    Read detailed description of the problem:

    28.Apr. 2005, 11:00(UTC) by cicuser ----------Affected site: pic----------
    28.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: pic----------
    29.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: pic----------
    30.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: pic----------
    01.May. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: pic----------
  - 2361 - LIP - Job submission failed since April 30th 15m
    
    Read detailed description of the problem:

    30.Apr. 2005, 14:06(UTC) by cicuser ----------Affected site: LIP-LCG2----------
    30.Apr. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: LIP-LCG2----------
    01.May. 2005, 22:15(UTC) by AR_ESCALATOR ----------Affected site: LIP-LCG2----------
    02.May. 2005, 06:46(UTC) by cicuser ----------Affected site: LIP-LCG2----------
    02.May. 2005, 10:48(UTC) by cicuser ----------Affected site: LIP-LCG2----------
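
Tickets 2273, 2279 and 2280 all rest on the same observation: the listing returned for a VO's *.list file shows no write permission beyond the owner, so the sgm persons cannot update it. A minimal sketch of that check, assuming the listing lines follow the usual `ls -l` permission-string layout, could look like this. The helper name `group_writable` and the `dteamsgm` owner in the last example are hypothetical; the first two listing lines are the ones quoted in tickets 2279 and 2273.

```python
import re

# Optional file-type character followed by the nine permission characters.
# Some quoted listings lack the leading type character, so it is optional here.
MODE_RE = re.compile(
    r"^\s*([-dlbcps]?)([r-][w-][xsS-][r-][w-][xsS-][r-][w-][xtT-])\s"
)


def group_writable(listing_line: str) -> bool:
    """Return True if the permission string grants write access to the group."""
    match = MODE_RE.match(listing_line)
    if match is None:
        raise ValueError(f"no permission string found in: {listing_line!r}")
    perms = match.group(2)  # e.g. 'rw-r--r--'
    return perms[4] == "w"  # index 4 is the group write bit


if __name__ == "__main__":
    lines = [
        "-rw-r--r-- 1 root root 0 Mar 7 19:25 dteam.list",  # from ticket 2279
        "rw-r--r-- 1 root 0 Mar 14 11:33 dteam.list",  # from ticket 2273
        "-rw-rw-r-- 1 dteamsgm dteam 0 Apr 21 11:36 dteam.list",  # hypothetical
    ]
    for line in lines:
        print(group_writable(line), line)
```

Whether group write access is the right criterion depends on how the sgm accounts are mapped at each site, so the result should be read as a hint for the site administrator and not as a verdict on the configuration.
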
