1. Experiment Requirements

Short description of the behavior that the experiments expect:

1.1. All software for the experiment is installed relative to a path. The path is accessible through an environment variable. Example: $VO_ALICE_SW_DIR
1.2. No root access is required to install experiment software.
1.3. A subset of the experiment's users can write into this area. This can be done by creating special users (we refer to these users as the Experiment Software Managers (ESM)). An ESM should be able to add/remove software at any time without communicating with the site managers.
1.4. The software has to be accessible on the WN through POSIX calls.
1.5. The ESM should be able to install software on a per-site basis.
1.6. The ESM should be able to verify the installation in separate steps. Different kinds of validation procedures can be run by the ESM at different moments.
1.7. The experiment software manager must have the possibility to publish all installed versions as TAGs in order to direct jobs to the sites. It is the ESM's responsibility to run certification tests before publishing the availability of the software on a site. Publishing the TAG for a given release of the installed software can happen in a separate step with respect to the installation step.
1.8. It is understood that the experiment software depends on other software. The experiments agree that they will manage these dependencies. In this context this means that they have to install missing packages together with their software.

2. Comments from the CERN Site Managers

The requirements from the experiments seem to be easy to implement assuming a shared file system. However, there are strong doubts that this is an acceptable solution for all sites. In addition, it is not seen as an optimal solution in terms of efficiency and duplication of data. There are hidden performance, reliability and scalability requirements on the shared file system used. Installation on the worker nodes should be an alternative solution.

3. Comments from the CERN Site Security Managers

This user-driven installation may include only user-level code. No network services should be installed without informing the site manager.

4. Comments from the LCG Deployment Group

There are several reasons why the positions of FIO and of the experiments are very hard to satisfy at the same time. A short list of issues is given in Appendix A. We propose a different solution for sites that do not provide a shared file system for software distribution. The experiments have to be aware that if they put software in a shared location, the removal of the software has to be done at a time when the production manager is sure that this version is no longer in use. Managing the dependencies can be very hard for the experiments, especially if the system is not installed using RPM.

5. LCG/GD Proposal

For the beginning (until the end of 2003) we suggest the following way to distribute the experiments' software. We point out where we have to do some development.

Sites can choose between providing space in a shared file system for the VOs or having no shared file system available. This is communicated to the jobs by the local environment variable VO_<VO>_SW_DIR. The variable VO_<VO>_SW_DIR is set to "." in case no shared file system is available. In case there is a shared file system, the variable contains the path to the experiment's area (for instance "/opt/ALICE"). The software has to use the environment variable directly. There is no guarantee that the path is the same on all nodes.
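As an illustration only (a minimal sketch, not part of the proposal itself), a job wrapper for an ALICE job could use the variable in the following way; the helper scripts install_locally.sh and run_job.sh are assumed names, not existing tools:

    #!/bin/sh
    # Minimal sketch: choose between shared and local installation based on
    # the variable set by the site (here VO_ALICE_SW_DIR).
    if [ "$VO_ALICE_SW_DIR" = "." ]; then
        # No shared file system: fetch the wanted release from the SE and
        # install it in the local scratch area (hypothetical helper script).
        ./install_locally.sh || exit 1
        SW_AREA=$PWD
    else
        # Shared file system: use the software already installed under the
        # path provided by the site.
        SW_AREA=$VO_ALICE_SW_DIR
    fi
    # Start the experiment's program against the selected software area
    # (hypothetical helper script).
    ./run_job.sh "$SW_AREA"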
Currently there is no CE attribute in the GLUE schema that allows publishing information about the existence of a shared file system for software installation.

We describe below the steps that an ESM has to go through to distribute, certify and announce the installation of the experiment's software. LCG will provide template scripts that illustrate the procedure. The scripts can be found in Appendix B.

Step 1) The ESM moves the packed software to an SE on each site where the software is to be installed. The ESM uses the replica manager commands for this.

Step 2) The ESM directs an installation and certification job to the sites where the software has been copied. Depending on VO_<VO>_SW_DIR, the job installs the software at the indicated location and runs the certification suite. This results in a report on which the ESM bases the next actions.

Step 3) For sites on which the installation/certification has run successfully, the ESM adds the tag that indicates the version to the GlueHostApplicationSoftwareRunTimeEnvironment attribute in the Information System, using the GRIS running on the CE. Later we describe how the ESMs can manipulate this entry.

5.1 User Jobs

Users specify in the JDL the tag corresponding to the software release they need. This ensures that their jobs run only on those sites where the software is either installed and certified, or where the software is on the site's SE and has been certified to install and run correctly on the site's WNs.

The user's job needs a script, run at the beginning, that tests the value of the environment variable VO_<VO>_SW_DIR. In case this is ".", a script provided by the experiment retrieves the wanted version from the SE and installs it locally. There are several ways the script can be told what the wanted release is. It is up to the experiments to define conventions to steer this. The experiment's install script should test, in the case of a local installation, that the provided scratch space is sufficient for installing the software and running the job. At the end of the job the software is removed by the batch system.

In case the environment variable defines a path different from ".", the script does nothing but start the user's program, assuming the wanted version is available under that path. For the sites that use the non-shared-file-system approach the performance might suffer. If a cache management system for local software is put in place, the changes to the experiments' scripts should be small.

5.2 Managing the contents of GlueHostApplicationSoftwareRunTimeEnvironment

A tool for the ESMs will be provided to carry out basic operations on the list of values in the GlueHostApplicationSoftwareRunTimeEnvironment entry. The tool can be run on the UI or on a WN via the standard grid job submission commands. Appendix B will give examples of this.

The following naming schema has to be used for experiment TAGs published in the Information System:

    VO-<VO name>-<text>

It is up to the VO to structure the information provided in the <text> part. The strings are case sensitive and not all special characters are supported. For example, if the alice VO publishes the installation of version 1.3 of their simulation software, the tag could look like the following:

    VO-alice-AliceSim-v1.3

5.3 Commands provided

1) For listing the tags published by your VO:

    lcg-ManageVOTag -host <CE hostname> -vo <VO name> -list

This will return all tags published by the VO, in a comma-separated list, to stdout.

2) For adding a tag:

    lcg-ManageVOTag -host <CE hostname> -vo <VO name> \
        -add -tag <tag name> [-tag <tag name> ...]
In case of a successful operation the following string will be returned to stdout:

    lcg-ManageVOTag: <tag name> [...] submitted for addition by <VO name> to GlueHostApplicationSoftwareRunTimeEnvironment

3) For removing a tag:

    lcg-ManageVOTag -host <CE hostname> -vo <VO name> \
        -remove -tag <tag name> [-tag <tag name> ...]

In case of a successful operation the following string will be returned to stdout:

    lcg-ManageVOTag: <tag name> [...] removed by <VO name> from GlueHostApplicationSoftwareRunTimeEnvironment

The ESMs are advised to keep track of the tags published on the various CEs and to add them to a CE if requested. The site managers are expected to do their best to keep backups of the files containing the tags. However, since the ESMs can add and remove tags at any time, there is no easy way to ensure that the correct state has been saved, and at a re-installation information might get lost.

6. Technical aspects

The ESM account: In case a shared file system is present, the ESMs of a VO have to be the only ones with write access to the experiment's space. This is done by using VO groups. A VO manager creates the software administration group. The people belonging to this group will be mapped, in the local grid-mapfile on all sites, to a special local VO software administration account (for instance, cmsswadm). This account belongs to the local group associated with this VO (for instance, cms). This is required to realize the desired file access authorization. The shared space will be owned by the local VO software administration account and will be group readable by the VO group (for instance: cmsswadm:cms). The mapping of multiple ESMs to one local account does not create a traceability problem for auditing, since the gatekeeper logs the mapping between the ESM's subject and the job id, which guarantees that the actions of different ESMs of the same VO can be separated.

Space management in the shared area: There is none. We assume that the sites provide sufficient space for the experiments to have several versions available (50-100 GB). The ESM should check whether there is sufficient space and remove old versions if needed. For local installation (the "." case) the install scripts have to verify that the scratch space is sufficiently large for the software to be installed AND the data the job is expected to produce.

7. Different usages of the proposed software distribution scheme and the effect on efficiency

The efficiency of the proposed system depends heavily on the level of cooperation between the local site managers and the ESMs. The ESMs especially can, to some degree, control how efficient the process is. We would like to illustrate this using example usage patterns that cover the extremes of the spectrum.

7.1 The self-sufficient usage pattern

The ESM analyzes the dependencies of the experiment's software and collects every software component it relies on. Then the software is packed including all these pieces (libs, compilers etc.). The installation script checks only whether the version of the operating system is compliant with the software to be installed, and everything is installed at the path given to the experiment. In the case of a shared file system this means that all libs, even those already present on the local node, are moved over the network. If no shared file system is provided, the space requirements are much larger than needed due to possible duplication of software. The advantage of this usage pattern is that there are no unknowns and the installation script has a minimal, simple discovery phase. This is probably acceptable for initial trials.
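As a rough sketch of this pattern (not part of the LCG tools), a self-sufficient install_sw script could be as simple as the following; the target operating system string and the tarball name are assumptions made for the example:

    #!/bin/sh
    # Minimal, self-sufficient installation: the only discovery step is a
    # check of the operating system release; everything else (libs,
    # compilers etc.) is contained in the tarball.
    TARGET_DIR=${VO_ALICE_SW_DIR:-.}           # path given to the experiment
    TARBALL=alice-AliceSim-v1.3-1.tar.gz       # hypothetical tarball name
    if ! grep -q "release 7.3" /etc/redhat-release 2>/dev/null; then
        echo "install_sw: unsupported operating system" >&2
        exit 1
    fi
    mkdir -p "$TARGET_DIR" && tar -xzf "$TARBALL" -C "$TARGET_DIR"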
7.2 The opportunistic usage pattern

As before, the ESM analyzes the dependencies and builds a packed version that contains all the software needed. The packaging can be modular, i.e. each needed package can be packaged independently, so that the installation script can download from the software server (the SE in this proposal) only those packages which are missing. During the installation a more sophisticated discovery phase is added that tries to locate commonly used components. These could be general packages and libs, but can include HEP-specific software. In the subsequent installation process only the missing parts are installed, and the environment variables are set accordingly for the experiment's software to run.

The efficiency of this pattern can vary depending on the effort put by the ESM into the discovery scripts and on the amount of software installed on the site. The chance that software can be located correctly is increased by adhering to standard locations for packages (like /opt/). In the case of a shared file system, the load on the shared file system and the network will be reduced and the efficiency improved, since more software is accessed locally. In the case of a local installation, the number of packages that need to be installed is smaller, which reduces the time spent on this task as well as the network and local disk capacity needed.

The "opportunistic usage pattern" really works only if there is some kind of agreement between the ESM and the site manager. If this is not the case, a site manager could decide to remove/update/modify any of the packages needed by a given version of the experiment software AFTER the exploration/installation phase is over and WITHOUT notifying the ESM. This would result in sites that publish a tag but where the corresponding software version does not work.

7.3 The coordinated usage pattern

The ESM builds, as before, a complete distribution of the software needed and supplies a well-crafted discovery script. In addition, the ESMs communicate to the site managers the packages that they need. The site managers will then install these packages locally. Especially for sites that support only a small number of VOs, this can result in a situation where almost the complete software is locally available, which for either mode of operation comes close to the optimum in terms of efficiency.

7.4 Summary

Each experiment is free to choose the operation point that is optimal for them at any time. It has to be mentioned that this can change over time, even for the same version of the software.

Appendix A

List of some issues with local installation triggered by experiment activity. This refers to the issues mentioned in Section 4. It applies to the case where LCG is requested by the experiments to develop a tool that allows automatic software installation triggered by the experiments and supported at the farm level.

- An automatic mechanism is needed to trigger distribution to the WNs
- A signal is needed to tell the production manager that the nodes are ready
- The ESM then has to run a certification job and set the variable in the information system
- The experiment has to wait during the distribution before testing can take place
- There can be only one or two versions installed at the same time because the scratch space on the node is limited

A system with this kind of complexity, even if desirable from several points of view, would require substantial development and organization and cannot be foreseen as a time-0 solution.
Appendix B

B.1 Sample scripts for experiment software distribution

We describe here a procedure to install the experiment software on the Grid using generic sample scripts.

B.1.1 How to run the script for the Experiment and Application Software Installation, Configuration and Validation in LCG1

1. The first thing to do is to prepare the tarball(s) with all the software you want to install. The content of this (these) tarball(s) must include at least the following scripts/tools:

- install_tool : an experiment-dependent tool to install the experiment software.
- install_sw : the script for the software installation. It is invoked only if install_tool completed without errors.
- validation_sw : the script for the software validation. It is the Software Manager's responsibility to provide/decide the way an installation is validated.
- uninstall_sw : the script for software removal.
- run_sw : a script to run over a sample of data. To use it, the software should already be installed and validated.

Together with these scripts, the tarball(s) will contain all the files needed to make them function properly, for example the input data used by a validation script and, of course, the experiment software distribution itself.

Special care should be taken with the name of the tarball. We adopted the following convention:

    nameOfTarball = experimentName-SWname-Version-Release.tar.gz

We distinguish between the single-tarball and the multiple-tarball case. The former requires exactly the above convention. If you have more than one tarball, the naming convention is different:

    nameOfTarball = experimentName-SWname-Version-Release_index.tar.gz

where "_index" is a counter running from 1 to the total number of tarballs.

2. Once you have made your tarball(s), you have to use the copyAndRegisterFile command of edg-rm to upload each tarball to an SE. In this step you have to set a logical file name with the -l option. The logical file name of every tarball is the physical name of the file. Example:

    edg-rm --vo=<VO> copyAndRegisterFile file://<local path>/<physicalname> \
        -d srm://<SE hostname>/<path>/<physicalname> -l lfn:<physicalname>

3. Build your JDL file. Here is an example of a working JDL:

    Executable = "lcgSwExp.sh";
    InputSandbox = {"lcgSwExp.sh", "lcgCheck.sh", "compare.pl", "parse.pl", "lcgCopy.sh", "lcgTar.sh"};
    OutputSandbox = {"stdout", "stderror", "ntuple.hbook"};
    StdOutput = "stdout";
    StdError = "stderror";
    Arguments = "-v -E dteam -V 6.5.0 -H adc0033.cern.ch -S athena -R 2 -N 1";
    InputData = {"lfn:dteam-athena-6.5.0-1_1.tar.gz", "lfn:dteam-athena-6.5.0-1_2.tar.gz"};
    DataAccessProtocol = {"gridftp", "gridftp"};

The Arguments field must contain 7 entries corresponding to the 7 calling options:

- the action you want to perform (-i, -v, -r or -u; see the help with the -h option)
- the VO you belong to (-E option)
- the software version you want to install (-V option)
- the SE where you want to put the tarball(s) (-H option)
- the SW name (-S option)
- the number of tarballs your installation needs (1 is the default) (-N option)
- the release number (1 is the default) (-R option)

All these parameters are used by the steering script (written in bash) described below (lcgSwExp.sh).

4. Submit the JDL file install.jdl with the command:

    edg-job-submit -o myJob install.jdl

B.1.2 What does the script do for you?

Depending on which of the [ivru] options you choose, the script follows different behaviors. A general, common part of the script does the following:

1. It checks the validity of the options you supplied.
2. It checks for an environment variable on the WN called VO_<VO>_SW_DIR.
3. It creates (if not yet present) a subdirectory called experimentName-SWname-Version-Release_install. This is the installation area or, in general, the working area for that version/release of the experiment software. It then copies into this area all the scripts that you sent through the input sandbox and that are needed in the process.

After these general actions, the behavior differs according to the chosen option. If you are in installation/validation/running mode on a WN without a shared file system:

1. It checks, using the RLS server, that the whole set of tarballs your installation needs is present on the Grid; otherwise it stops the process.
2. It replicates the tarballs to the SE closest to the WN whenever they are not already present there.
3. It copies the tarballs from the SE to the WN.
4. It calls the experiment installation script discussed above (provided by the experiment).
5. It calls a validation or a running job once the software has been installed successfully.
6. You will get back, in the stdout and stderr files, a report of everything that happened on the WN.
7. Depending on your settings in the JDL file you can ask for additional output files (an hbook file or similar). You must add the names of the files you want to the "OutputSandbox" field.

If the action option is set to "u", the script will uninstall a pre-existing version of the software by calling the uninstall_sw script the experiment has provided.

If you need to do a run and VO_<VO>_SW_DIR is not "." but an existing path in the shared file system of the local farm, you do not need to install again. You simply have to launch your command (usually a job file, with the wildcard set properly in the run_sw file).

To do:

1. Differentiate the behavior for the two different topologies of CE (with or without a shared file system) through the use of the VO_<VO>_SW_DIR variable.
2. Think about an automatic integration of the validation process with the experiment TAG publication in the Information System.

Appendix C

C.1 Implementation of the modified information provider

The information provider that fills the GlueHostApplicationSoftwareRunTimeEnvironment entry will work by regularly scanning the contents of a directory on the CE. The location of this directory is:

    $EDG_LOCATION_VAR/info/

In this directory there is, for each VO, a directory named after the VO, containing a file with the information that the VO needs to publish. The names of these files are:

    <vo name>.ldif

The VO name is in lowercase. The access privileges to the VO subdirectories are:

- world read access
- write access only for the accounts to which the ESMs are mapped

The file containing the static information will be written by the LCFGng object. The VO-specific files will not be (re)created by it, to avoid losing information. The format of the files should be very simple (comma-separated entries). Checking of the naming convention should be done in the tools that are used by the end users. For the list of commands, please refer to section 5.3.

Since these files contain important status information that cannot easily be regenerated by the local administrator at a reinstallation, these files have to be saved before reinstalling a node.
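As an illustration of the scanning step only (a sketch under the assumptions above; the real information provider and the exact LDIF it emits may differ), such a provider could collect the tags as follows; the default location /opt/edg/var for $EDG_LOCATION_VAR is an assumption:

    #!/bin/sh
    # Sketch: read the comma-separated tag files of every VO and print one
    # GlueHostApplicationSoftwareRunTimeEnvironment value per tag.
    INFO_DIR=${EDG_LOCATION_VAR:-/opt/edg/var}/info
    for tagfile in "$INFO_DIR"/*/*.ldif; do
        [ -r "$tagfile" ] || continue
        tr ',' '\n' < "$tagfile" | while read -r tag; do
            [ -n "$tag" ] && echo "GlueHostApplicationSoftwareRunTimeEnvironment: $tag"
        done
    done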