EGEE-SA1-SWE Meeting

Name: EGEE-SA1-SWE Meeting
Start: 2005-05-09T12:00:00+02:00
End: 2005-05-09T14:00:00+02:00
Location: Sky VRVS Virtual Room (http://www.vrvs.org)

Monday 9 May 2005, 12:00 → 14:00 Europe/Zurich

Pacheco, Andreu

Description

Present: Andres Pacheco (PIC), Adria Casajus (UB), Gonzalo Merino (PIC), Javier Fontan (CESGA), Guillermo Losilla (BIFI), Manuel Sanchez (USC), Dani Cano (IFCA), Kai Neuffer (INTA-CAB), Juan Jose Saborido (USC), Alvaro Fernandez (IFIC), Pepe Salt (IFIC), Carlos Fernandez (CESGA), Javier Lopez (CESGA), Mohammed Kaci (IFIC)

- 12:00 → 12:15
  
  Briefing of activities 15m VRVS
  
  VRVS
  
  Speaker: Pacheco, Andreu (IFAE/PIC)
  
- 12:05 → 12:15
  Operational status from sites 10m VRVS
  
  VRVS
  
  * Since 1 February 2005, every site must report its operational status by mail each Friday if it wants Gonzalo Merino to report its status accurately to the EGEE Operations Meeting.
  * Since the people from CERN are asking the ROCs to write Monday's 11 report following a sort of template, I think it will be useful for the SWE Site Reports that site managers send to this list on Fridays to follow a similar template as well. I therefore suggest that these reports be produced following the template available at: http://services2.pic.org.es/devel/download/txt/SR-template.txt
  * Reports should be sent to: egee.swe.roc.contact@pic.es

  The reports from SWE are appended to the agenda page of the weekly operations meetings: http://agenda.cern.ch/displayLevel.php?fid=258
  
  Speaker: Merino, Gonzalo (PIC)
  - CESGA-EGEE 15m
    
    * Javier Fontan reporting. * All OK.
    
    Speaker: Fontan, Javier
  - CIEMAT-LCG2 15m
    
    * Reported by mail.

    Weekly Operations Status Report: CIEMAT-LCG2
    Reporting Period: 02/05/05 to 06/05/05

    Availability of Core Services Hosted in the Site
    ================================================
    There are no core services running at CIEMAT at this moment

    Non-Functional Time
    ===================
    None

    Scheduled Down Time
    ===================
    * Scheduled Maintenance During the Reporting Period: None
    * Scheduled Maintenance During the Next Reporting Period: None

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    None

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded?
    Completed, now Scientific Linux 3.0.4 in all our service nodes
    By when will the WNs in the Site be upgraded?
    Completed, now Scientific Linux 3.0.4 in all our WN

    Points to Raise at the Operations Meeting
    =========================================
    None
    
    Speaker: Calonge, Javier
  - CNB-LCG2 15m
    
    * NO report
    
    Speaker: Merino, Angel
  - IFCA-LCG2 15m
    
    * No report. * Dani had problems connecting to VRVS.
    
    Speaker: Cano, Dani
  - IFIC-LCG2 15m
    
    * Javi Sanchez reporting.
    * Accounting is being reported well.
    * Small problem with a tape unit blocking the castorsrm service.
    * Installing 6 machines with SLC3 in preparation for gLite deployment.
    
    Speaker: Sanchez, Javier
  - INTA-CAB 15m
    
    * Kai Neuffer reporting.
    * Everything OK. Pending upgrade of APEL.

    Weekly Operations Status Report: INTA-CAB
    Reporting Period: 02/05/05 to 06/05/05

    Availability of Core Services Hosted in the Site
    ================================================
    No core services hosted in this site.

    Non-Functional Time
    ===================
    0 days.

    Scheduled Down Time
    ===================
    * Scheduled Maintenance During the Reporting Period: 0 days.
    * Scheduled Maintenance During the Next Reporting Period: 0 days.

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    None.

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    Already done.

    Points to Raise at the Operations Meeting
    =========================================
    None.

    GridWay, The Way to Grid! http://www.gridway.org
    
    Speaker: Huedo, Eduardo
  - LIP-LCG2 15m
    
    Speaker: David, Mario
  - IFAE 15m
    
    * Carlos Borrego reporting. * OK.
    
    Speaker: Borrego, Carlos
  - PIC 15m
    
    * Carlos Borrego reporting.
    * In green status since last week.
    * Installed 9 machines for all gLite services.
    * Latest status at http://egee-sa1-swe.web.cern.ch/egee-sa1-swe/status-reports/Status_egee_pic_glite_preprod.htm
    * The procedure for gLite installation is to run a script that downloads the rpms and configures the machine.
    * The gLite system is currently not operational. Several problems have been reported and are awaiting an answer.
    * Thanks to Javi from CESGA for his unconditional support.
    
    Speaker: Borrego, Carlos
  - UAM-LCG2 15m
    
    * NO reporting
    
    Speaker: Pardo Navarro, Juan Jose
  - UB-LCG2 15m
    
    * Adria Casajus reporting.
    * In downtime now.

    Weekly Operations Status Report: UB
    Reporting Period: 02/05/2005 to 08/05/2005

    Availability of Core Services Hosted in the Site
    ================================================
    No core services in the site

    Non-Functional Time
    ===================
    We are under upgrade to ELC 3 (SLC3.0.4) and LCG-2_4_0

    Scheduled Down Time
    ===================
    Full week

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    None

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded?
    By when will the WNs in the Site be upgraded?
    In progress

    Points to Raise at the Operations Meeting
    =========================================
    None
    
    Speaker: Casajus, Adria
  - UPV-GRyCAP 15m
    
    * Reported by mail.

    Weekly Operations Status Report: [ UPV-GRyCAP ]
    Reporting Period: [ 02/05/05 ] to [ 06/05/05 ]

    Availability of Core Services Hosted in the Site
    ================================================
    RB, BDII: apis.dsic.upv.es [OK]

    Non-Functional Time
    ===================
    - Total days unavailable: 5 days
      02-05-2005 & 06-05-2005 -> Problem with GFAL infosys

    Scheduled Down Time
    ===================
    * Scheduled Maintenance During the Reporting Period
      - Total days unavailable: 0.5
        03-05-2005 -> Install new LCG version 2.4.0 in RB
    * Scheduled Maintenance During the Next Reporting Period
      - Estimated number of days for which the site will be unavailable: 0 days

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    We are obtaining a strange result in the SFT tests. The latest SFT tests run against our site report "GFAL infosys FAILED". If I try to see the error in the Site test report, it does not appear; the report seems to be incomplete. I sent a mail to Piotr Nyczyk to inform him about this error and he is trying to find out what is happening.

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded?
    Already done (SL 3.0.4 & FC2)
    By when will the WNs in the Site be upgraded?
    Already done (FC2)

    Points to Raise at the Operations Meeting
    =========================================
    None
    
    Speaker: Caballer, Miguel
  - USC-LCG2 15m
    
    * Manuel Sanchez reporting.

    Reporting Period: 02/05/2005 to 09/05/2005

    Availability of Core Services Hosted in the Site
    ================================================
    None deployed

    Non-Functional Time
    ===================
    None

    Scheduled Down Time
    ===================
    From Wed 20 Apr 2005 20:00 to Fri 22 Apr 2005 09:00

    Major Operational Issues Encountered During the Reporting Period
    ================================================================
    None

    Upgrade to Scientific Linux 3 (or equivalent)
    =============================================
    By when will all service nodes in the Site be upgraded?
    Already there
    By when will the WNs in the Site be upgraded?
    Already there

    Points to Raise at the Operations Meeting
    =========================================
    None
    
    Speaker: Sanchez, Manuel
  - BIFI-LCG 15m
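
The weekly site reports above share a small set of headings marked with runs of `=`. As a minimal sketch of how such reports could be collected automatically, the following splits one report into sections. It assumes that, as in the mailed originals, each heading sits on its own line and is followed by a line made only of `=`; the function name `parse_report` and the `header` key are hypothetical, and the sample text is an excerpt of the INTA-CAB report.

```python
SAMPLE = """\
Weekly Operations Status Report: INTA-CAB
Reporting Period: 02/05/05 to 06/05/05

Availability of Core Services Hosted in the Site
================================================
No core services hosted in this site.

Non-Functional Time
===================
0 days.

Points to Raise at the Operations Meeting
=========================================
None.
"""


def parse_report(text):
    """Split a report into {heading: [body lines]}.

    Lines that appear before the first heading are stored under 'header'.
    """
    lines = [line.strip() for line in text.splitlines()]
    sections = {"header": []}
    current = "header"
    i = 0
    while i < len(lines):
        underline = lines[i + 1] if i + 1 < len(lines) else ""
        # A heading is a non-empty line followed by a line made only of '='.
        if lines[i] and underline and set(underline) == {"="}:
            current = lines[i]
            sections[current] = []
            i += 2  # skip the heading and its underline
            continue
        if lines[i]:
            sections[current].append(lines[i])
        i += 1
    return sections


if __name__ == "__main__":
    for heading, body in parse_report(SAMPLE).items():
        print(f"{heading}: {' '.join(body)}")
```

A report that omits the underline, or places heading and body on the same line, would end up entirely under `header`, which is one way to spot mails that do not follow the template.
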
- 12:10 → 12:15
  
  APEL Status Report 5m VRVS
  
  VRVS
  
  == Period ==
  From April 30 to May 6

  == Summary ==
  As in previous weeks, few sites have published their accounting data properly. The remaining sites have either published no data or are missing some data. In the last few days a new version of Apel (3.4.44) has been released. This version does not solve the problem with RGMA_HOME. We are now testing a new version (3.4.45), which should solve this problem.

  == Status of each SWE site ==

  Site: CESGA
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: None
  Publishing User Name Information: Yes
  Actions: None
  Other Comments: None

  Site: CIEMAT
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: None
  Publishing User Name Information: Yes. The user name was not published for April 15.
  Actions: None
  Other Comments: They are reprocessing their data every day. They have probably set the <republish> option of <JoinProcessor> to "all". Check the Apel configuration and set this option to "missing" if necessary. Otherwise, check the Apel logs.

  Site: ifae
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 09/04/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess
  Other Comments: None

  Site: IFCA
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 15/01/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. Reprocess
  Other Comments: None

  Site: IFIC
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 21/04/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. They have installed the new version of Apel, so they should publish the old data the next time Apel works.
  Other Comments: They have updated to LCG2.4.0, so they should update to Apel 3.4.44 (or 3.4.45) and check the Apel configuration before running it.

  Site: INTA
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: None
  Publishing User Name Information: Yes
  Actions: None.
  Other Comments: None.

  Site: LIP
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 09/04/05
  Publishing User Name Information: Yes
  Actions: None. They have installed the new version of Apel, so they should publish the old data the next time Apel works.
  Other Comments: They have updated to LCG2.4.0, so they should update to Apel 3.4.44 (or 3.4.45) and check the Apel configuration before running it.

  Site: pic
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 13/04/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess.
  Other Comments: They have updated to LCG2.4.0, so they should update to Apel 3.4.44 (or 3.4.45) and check the Apel configuration before running it.

  Site: PIC-LCG2
  Solved Problems of Last Report: Since 02/03/05 they are publishing user information, but they have not published the lost user information. They don't reprocess.
  Missing Accounting Entries: Since 13/03/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess
  Other Comments: At this moment there are three site names for PIC: this one, pic and ifae. This site name was replaced by ifae in the monitoring tools, so will you publish more data under this name? Could we stop checking this site name?

  Site: UAM
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 29/04/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess
  Other Comments: None

  Site: UB
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 03/05/05
  Publishing User Name Information: Yes
  Actions: Solved Problems of Last Report. JoinProcessor Reprocess
  Other Comments: They have been in maintenance since May 2

  Site: UPV
  Solved Problems of Last Report: NO
  Missing Accounting Entries: Since 01/05/05
  Publishing User Name Information: Yes
  Actions: None
  Other Comments: None

  Site: USC
  Solved Problems of Last Report: Yes
  Missing Accounting Entries: None
  Publishing User Name Information: Yes
  Actions: None
  Other Comments: None

  == Accounting Support ==
  Accounting Web Page: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/
  Accounting FAQ: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/faq.html
  Accounting Statistics: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/reports/
  CPU Normalization: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/normalization.html
  Accounting Guides: http://www.egee.cesga.es/EGEE-SA1-SWE/accounting/guides/
  Contact e-mail: egee-admin@cesga.es
  
  Speaker: Rey Mayo, Pablo (CESGA)
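
To make the "Missing Accounting Entries" dates in the APEL report easier to compare, the gap for each site can be turned into a number of days. The sketch below is illustrative only: it assumes the dates are written day/month/year (so 09/04/05 is 9 April 2005) and measures the gap up to 6 May 2005, the end of the reporting period; the names `MISSING_SINCE` and `days_missing` are hypothetical.

```python
from datetime import date

# End of the APEL reporting period ("From April 30 to May 6").
REPORT_END = date(2005, 5, 6)

# "Missing Accounting Entries: Since ..." values from the report,
# read as day/month/year (an assumption about the date format).
MISSING_SINCE = {
    "ifae": date(2005, 4, 9),
    "IFCA": date(2005, 1, 15),
    "IFIC": date(2005, 4, 21),
    "LIP": date(2005, 4, 9),
    "pic": date(2005, 4, 13),
    "PIC-LCG2": date(2005, 3, 13),
    "UAM": date(2005, 4, 29),
    "UB": date(2005, 5, 3),
    "UPV": date(2005, 5, 1),
}


def days_missing(since, until=REPORT_END):
    """Number of days between the first missing entry and the end of the period."""
    return (until - since).days


if __name__ == "__main__":
    # List the sites with the oldest gaps first.
    for site, since in sorted(MISSING_SINCE.items(), key=lambda item: item[1]):
        print(f"{site:10s} missing since {since.isoformat()} ({days_missing(since)} days)")
```

Sorting by the start date puts the sites with the longest backlog to reprocess at the top of the list.
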
- 12:15 → 12:30
  Action lists 15m VRVS
  
  VRVS
  
  Speaker: Pacheco, Andreu (IFAE)
  - 5.1.17.4 Andreu Pacheco to update the SWE Execution Plan (PENDING) 15m
    
    Speaker: Pacheco, Andres (PIC)
  - 5.1.17.7 CIEMAT, CNB, IFIC, IFAE, UAM and USC to upgrade worker nodes to Scientific Linux 3.0.2 (PENDING) 15m
    
    Speaker: Merino, Gonzalo (PIC)
  - 5.1.17.8 CNB to upgrade Apel version 3.4.39-1 (PENDING) 15m
    
    Speaker: Lopez Cacheiro, Javier (CESGA)
  - 5.3.7.1 Miguel Cardenas to check if forensic analysis after intrusion is mandatory and who can do it (NEW) 15m
    
    Speaker: Cardenas, Miguel (CIEMAT)
  - 5.3.7.2 Farida Fassi and Mohammed Kaci to setup the initial user support web site (OPEN) 15m
    
    * Mohammed Kaci reporting.
    * Forwarding tickets from GGUS to local support works without problems.
    * Updates from local support to GGUS also work.
    * Creating new tickets from local support to GGUS does not work.
    * The new system does not have the same structure as the Italian one.
    
    Speaker: Fassi, Farida (IFIC)
  - 5.3.7.4 Mario to deploy gmdat monitoring tools in SWE (OPEN) 15m
    
    Speaker: David, Mario (LIP)
- 12:20 → 12:35
  Preproduction Activity Status 15m VRVS
  
  VRVS
  
  This section will contain an update about the preproduction activity in SWE where more and more sites will be involved.
  
  Speaker: Pacheco, Andreu (PIC)
  - PIC Report 15m
    
    Speaker: Andres Pacheco (PIC)
    
- 12:25 → 12:40
  GGUS Open Tickets for SWE Federation 15m VRVS
  
  VRVS
  
  Speaker: David, Mario (PIC)
  - 1909 - CNB - GlueSARoot lost for biomed VO since March 21st 15m
    
    Read detailed description of the problem:

    21.Mar. 2005, 16:25(UTC) by wsuser

    I have had to reinstall the CE (using LCFGng and RH7.3) because I had a hard disk problem last week, and now I'm having problems with GlueSARoot:

    [root@mallarme root]# ldapsearch -x -H ldap://mallarme.cnb.uam.es:2135 -b mds-vo-name=cnblcg2,o=grid | grep GlueSARoot
    dn: GlueSARoot=lhcb:lhcb,GlueSEUniqueID=baudelaire.cnb.uam.es,Mds-Vo-name=cnbl
    GlueSARoot: lhcb:lhcb
    dn: GlueSARoot=dteam:dteam,GlueSEUniqueID=baudelaire.cnb.uam.es,Mds-Vo-name=cn
    GlueSARoot: dteam:dteam

    Why does the CE lose the GlueSARoot for biomed?!?! I have created (as the manual says) the biomed_cfg.conf and added the .h file in vos-cfg.h:

    (...)
    /* Add the include file for the new vo here like the one commented out,
       which is created by the tools "addvo.py" with the example of input
       file esr.conf like:
           ./addvo.py -i esr.conf > vo-esr-cfg.h
       Modify esr.conf to configure your new VO */
    /* #include "vo-esr-cfg.h" */
    #include "biomed_cfg.h"
    (...)

    Read latest entry in diary of steps:

    It is an installation problem and not a VO-specific one. Therefore, I reassign it to the corresponding ROC, who should check with the relevant CIC if necessary. Anyway, here are some hints on the problem: the GlueSARoot is probably being rejected because this entry contains invalid LDIF. If the entry is there when the information provider is run but not in the GRIS, then this is usually the problem. For troubleshooting see http://lfield.home.cern.ch/lfield/trouble.html
  - 1948 - CNB - lack of closeSE for biomed for 2 nodes since March 24th 15m
    
    Read detailed description of the problem:

    24.Mar. 2005, 08:35(UTC) by wsuser

    The node001.grid.auth.gr and mallarme.cnb CEs don't have a closeSE for biomed. Do I need to contact the node administrators directly? I contacted other node administrators directly about the same problem before (cgg, sinica...)

    Read latest entry in diary of steps:

    We assign this ticket to the corresponding ROC for the CE mallarme.cnb.uam.es. Nevertheless, we must say that the site administrator has already been informed about this by the Biomed VO manager.

    =====================================
    From: Yannick Legre
    To: Angel Merino
    Cc: Biomed-grid-support@cern.ch
    Subject: lack of close SE in configuration file
    ----------------------------------------
    Dear Angel,

    We received a ticket from GGUS which says that you have a misconfiguration in your site. No close SE seems to be defined in your CE configuration... Could you please provide one and let me know asap when it is done? :-)

    Thank you in advance,
    Best Regards,
    Yannick ^_^
  - 1979 - IFCA - No published end point for egeerb but it is present in GOCDB since March 29th 15m
    
    Read detailed description of the problem:

    29.Mar. 2005, 10:53(UTC) by cicuser
    ----------Affected site: IFCA-LCG2----------
    No published end point for egeerb but it is present in GOCDB.

    Read latest entry in diary of steps:

    [29/03/2005 13:39] - steve traylen
    [07/04/2005 16:52] - philippa strange: extended expiry date due to schedule downtime
    [13/04/2005 16:44] - Frederic Schaer: Downtime until 14
    [15/04/2005 11:11] - Frederic Schaer: giis seems down - can not check
    [26/04/2005 17:19] - Alexander Berezhnoy (dteam)
  - 2037 - USC - Unspecified gridmanager error at USC-LCG2 (SouthWesternEurope) since April 5th 15m
    
    Read detailed description of the problem:

    05.Apr. 2005, 09:06(UTC) by wsuser

    Too many jobs failed today at lcg-ce.usc.cesga.es with the error:

    Got a job held event, reason: Unspecified gridmanager error

    The error means that there is some problem with the submission request in the LBS. Many things could lie behind this message:
    - misleading information is published in the Information System with respect to the real content of the local batch system (for instance, all VOs are allowed to run in some queue which is dedicated to a single well-defined VO)
    - some service on the CE is down or wrongly configured

    Here are the culprit job IDs:

    https://lxn1182.cern.ch:9000/HGfR_cK2Qr7kcbcU9oehtA
    JM-Contact: https://lcg-ce.usc.cesga.es:20013/8267/1112653091/ (submitted to Globus on 04/05 00:18:18)

    https://lxn1182.cern.ch:9000/oRqMb6JsPOTp_5H1zDmlwQ
    JM-Contact: https://lcg-ce.usc.cesga.es:20004/12255/1112654112/ (submitted to Globus on 04/05 00:35:21)

    Cheers, R.
  - 2176 - Several SWE Sites / BIOMED - problem with copy on SE since April 18th 15m
    
    Read detailed description of the problem:

    18.Apr. 2005, 08:10(UTC) by wsuser

    Hi, I'm a biomgrid user for biomed. I have a problem with the replication of a tarball on SEs. First, I copy it to the closest SE (which is not a problem) and then I replicate it to each SE. But for the following SEs, I get this result:

    File replication on SE castorgrid.ifae.es:
    the server sent an error response: 553
    553 /castor/ifae.es/lcg/biomed/generated/2005-04-18/filec82e8eff-96eb-48d8-abdc-b34424079627: No space left on device.

    File replication on SE castorsrm.ifae.es:
    SOAP FAULT: SOAP-ENV:Client "initFileStatuses: cannot create path /castor/ifae.es/lcg/biomed/generated/2005-04-18/file6b475501-013a-4b43-9909-32ef273bbc7e: Permission denied"

    File replication on SE castorgrid.ifae.es:
    the server sent an error response: 553
    553 /castor/ifae.es/lcg/biomed/generated/2005-04-18/file643cac9e-393d-4b3e-b194-aec94cd97368: No space left on device.

    File replication on SE egeese.ifca.org.es:
    lcg_rep: Invalid argument

    File replication on SE scaise.scai.fraunhofer.de:
    the server sent an error response: 535
    535-FTPD GSSAPI error: GSS Major Status: Authentication Failed
    535-FTPD GSSAPI error: GSS Minor Status Error Chain:
    535-FTPD GSSAPI error:
    535-FTPD GSSAPI error: accept_sec_context.c:170: gss_accept_sec_context: SSLv3 handshake problems
    535-FTPD GSSAPI error: globus_i_gsi_gss_utils.c:881: globus_i_gsi_gss_handshake: Unable to verify remote side's credentials
    535-FTPD GSSAPI error: globus_i_gsi_gss_utils.c:854: globus_i_gsi_gss_handshake: SSLv3 handshake problems: Couldn't do ssl handshake
    535-FTPD GSSAPI error: OpenSSL Error: s3_srvr.c:1816: in library: SSL routines, function SSL3_GET_CLIENT_CERTIFICATE: no certificate returned
    535-FTPD GSSAPI error: globus_gsi_callback.c:351: globus_i_gsi_callback_handshake_callback: Could not verify credential
    535-FTPD GSSAPI error: globus_gsi_callback.c:490: globus_i_gsi_callback_cred_verify: Could not verify credential
    535-FTPD GSSAPI error: globus_gsi_callback.c:850: globus_i_gsi_callback_check_signing_policy: Error with signing policy
    535-FTPD GSSAPI error: globus_gsi_callback.c:1058: globus_i_gsi_callback_check_gaa_auth: Error in OLD GAA code: CA policy violation:
    535 FTPD GSSAPI error: accepting context

    File replication on SE testbed002.grid.ici.ro:
    the server sent an error response: 550
    550 /storage/biomed/generated/2005-04-18: Permission denied.
    the server sent an error response: 553
    553 /storage/biomed/generated/2005-04-18/file851156d0-04bd-4ee9-a804-e80c8a903b1f: No such file or directory.

    File replication on SE zeus03.cyf-kr.edu.pl:
    the server sent an error response: 530
    530 Login incorrect.

    File replication on SE lcg00123.grid.sinica.edu.tw:
    the server sent an error response: 550
    550 /flatfiles/SE00/biomed: Permission denied.
    the server sent an error response: 553
    553 Could not determine cwdir: No such file or directory.

    File replication on SE castorsrm.ifae.es:
    SOAP FAULT: SOAP-ENV:Client "initFileStatuses: cannot create path /castor/ifae.es/lcg/biomed/generated/2005-04-18/file5e1f0fc3-2d05-4db0-8241-1d7461279c92: Permission denied"

    Could you please explain what I did wrong, because I need to replicate this tarball on as many SEs as possible.

    Thank you very much,
    Matthieu Reichstadt

    22.Apr. 2005, 09:06(UTC) by wsuser

    I used the following: lcg-rep --vo lfn: -d

    Read latest entry in diary of steps:

    I used the following: lcg-rep --vo lfn: -d
  - 2238 - UAM - lcg-cr to grid002.ft.uam.es hangs since April 21st 15m
    
    Read detailed description of the problem:

    21.Apr. 2005, 07:50(UTC) by wsuser

    The lcg-cr command does not return and blocks many WNs used for atlas production:

    /opt/lcg/bin/lcg-cr --verbose --vo atlas file:/tmp/rome.test -l lfn:chudoba.test.100 -d grid002.ft.uam.es
    Using grid catalog type: edg

    The same command to some other SE works well. grid002.ft.uam.es is in the Atlas BDII.
  - 2242 - UPV-GRyCAP - bad BDII configuration since April 21st 15m
    
    Read detailed description of the problem:

    Every entry below carries the same text: ----------Affected site: UPV-GRyCAP----------

    21.Apr. 2005, 14:56(UTC) by cicuser
    21.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    22.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    23.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    24.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    25.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    26.Apr. 2005, 16:08(UTC) by cicuser
    26.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    27.Apr. 2005, 08:44(UTC) by cicuser
    27.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    28.Apr. 2005, 22:15(UTC) by AR_ESCALATOR

    Read latest entry in diary of steps:

    [26/04/2005 18:06] - Gregory Shpiz (dteam): GIIS problems not fixed. CT failed
    [27/04/2005 10:42] - Victor Edneral (dteam): Replication problems
  - 2250 - UPV-GRyCAP - GIIS Problem since April 22nd 15m
    
    Read detailed description of the problem:

    22.Apr. 2005, 09:12(UTC) by wsuser

    The information published by the GRIS of our CE (ramses.dsic.upv.es) in the GIIS service (on the same machine) disappears intermittently, although the GRIS appears to work correctly.

    Read latest entry in diary of steps:

    The problem has been identified. It is a network traffic problem affecting this machine in the University network.
  - 2258 - PIC - Got a job held event, reason: Unspecified gridmanager error since April 22nd 15m
    
    Read detailed description of the problem:

    22.Apr. 2005, 13:59(UTC) by wsuser

    Hi,

    Affected CE: ce01.pic.es:2119/jobmanager-lcgpbs-biomed

    My jobs cannot be executed on this CE. I get the following error message before the job is resubmitted:

    Got a job held event, reason: Unspecified gridmanager error

    Perhaps it is a problem with the CA RPMs not being up to date, but this is not the usual message for that problem.

    Best regards.
  - 2273 - UPV-GRyCAP - Bad configuration of the /opt/edg/var/info/ area in the CE since April 25th 15m
    
    Read detailed description of the problem:

    25.Apr. 2005, 09:07(UTC) by wsuser

    Running

    edg-gridftp-ls -v gsiftp://ramses.dsic.upv.es/opt/edg/var/info/dteam

    shows:

    rw-r--r-- 1 root 0 Mar 14 11:33 dteam.list

    Exactly the same for the rest of the VOs. This is badly configured. The first time the tool lcg-ManageVOTag is used, it creates this file with write permissions for the sgm persons. So after the configuration of the SE, these *.list files should not exist. As a result, experiments that want to install software at this site will not be able to publish the corresponding tag in the Information System, and subsequent production will not arrive at this site.
  - 2279 - UB - Bad configuration of the /opt/edg/var/info/ area in the CE since April 25th 15m
    
    Read detailed description of the problem:

    25.Apr. 2005, 09:46(UTC) by wsuser

    Running

    edg-gridftp-ls -v gsiftp://lcg-ce.ecm.ub.es/opt/edg/var/info/dteam

    shows:

    -rw-r--r-- 1 root root 0 Mar 7 19:25 dteam.list

    Exactly the same for the rest of the VOs. This is badly configured. The first time the tool lcg-ManageVOTag is used, it creates this file with write permissions for the sgm persons. So after the configuration of the SE, these *.list files should not exist. As a result, experiments that want to install software at this site will not be able to publish the corresponding tag in the Information System, and subsequent production will not arrive at this site.
  - 2292 - IFCA - GIIS is down since 25 April 2005 15m
    
    Every entry below carries the same text: ----------Affected site: IFCA-LCG2----------

    25.Apr. 2005, 13:34(UTC) by cicuser
    25.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    26.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    27.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    28.Apr. 2005, 10:40(UTC) by cicuser
    28.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    29.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    30.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    01.May. 2005, 22:15(UTC) by AR_ESCALATOR
  - 2318 - CNB - problems upgrading SE using RH7.3 and LCFGng since 27 April 15m
    
    Read detailed description of the problem:

    27.Apr. 2005, 10:36(UTC) by wsuser

    This is the failure that I get when I try to upgrade my SE (LCG2-4-0):

    [FAIL] updaterpms: rpmRunTransactions failed
    execution of %preun scriptlet from tomcat4-4.1.18-full.1jpp failed, exit status 6
    [WARNING] updaterpms: updaterpms failed

    Any idea?

    Read latest entry in diary of steps:

    This bug is not application specific and should be reassigned to SA1 support... probably ROC_SW
  - 2332 - PIC - Job list match fails since April 28th 15m
    
    Read detailed description of the problem:

    Every entry below carries the same text: ----------Affected site: pic----------

    28.Apr. 2005, 11:00(UTC) by cicuser
    28.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    29.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    30.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    01.May. 2005, 22:15(UTC) by AR_ESCALATOR
  - 2361 - LIP - Job submission failed since April 30th 15m
    
    Read detailed description of the problem:

    Every entry below carries the same text: ----------Affected site: LIP-LCG2----------

    30.Apr. 2005, 14:06(UTC) by cicuser
    30.Apr. 2005, 22:15(UTC) by AR_ESCALATOR
    01.May. 2005, 22:15(UTC) by AR_ESCALATOR
    02.May. 2005, 06:46(UTC) by cicuser
    02.May. 2005, 10:48(UTC) by cicuser
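
Tickets 2273 and 2279 both turn on the permissions of a `dteam.list` file shown in a directory listing. As a rough sketch of how such a listing line could be checked, the following parses the line quoted in ticket 2279 and flags files that are owned by root and not writable by anyone else. The helper names are hypothetical, and the check rests on an assumption not stated in the tickets: that the sgm accounts are not root and therefore need group or other write permission to update the file.

```python
# Listing line quoted in ticket 2279 (UB).
LISTING = "-rw-r--r-- 1 root root 0 Mar 7 19:25 dteam.list"


def parse_listing(line):
    """Extract mode, owner, group and file name from an 'ls -l' style line."""
    fields = line.split()
    return {
        "mode": fields[0],
        "owner": fields[2],
        "group": fields[3],
        "name": fields[-1],
    }


def writable_by_non_owner(mode):
    """True if the group or other write bit is set in a 10-character mode string."""
    # Positions: 0 type, 1-3 owner, 4-6 group, 7-9 other.
    return mode[5] == "w" or mode[8] == "w"


def needs_attention(entry):
    """Flag root-owned files that no other account can write to.

    Assumption: software managers (sgm) are non-root accounts that must be
    able to write the file to publish software tags.
    """
    return entry["owner"] == "root" and not writable_by_non_owner(entry["mode"])


if __name__ == "__main__":
    entry = parse_listing(LISTING)
    status = "needs attention" if needs_attention(entry) else "looks writable"
    print(f"{entry['name']}: mode {entry['mode']}, owner {entry['owner']} -> {status}")
```

The listing quoted in ticket 2273 lacks the leading type character and the group column, so it would have to be normalised before this parser could be applied to it.
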
