Deployment and Management of Grid Services for a Data-Intensive Complex Infrastructure
Presented by Alvaro FERNANDEZ CASANI on 15 Apr 2010 from 09:20 to 09:40
Session: Data Management
Track: Software services exploiting and/or extending grid middleware (gLite, ARC, UNICORE etc)
The services needed to access scientific data and computational resources in a distributed computing infrastructure are becoming increasingly complex. Our ATLAS Tier-2/Tier-3 users, and researchers from the National e-Science and GRID-CSIC initiatives, have different requirements, but all need ubiquitous and efficient access to these resources. We will provide details on how we use WLCG and gLite middleware, StoRM, the Lustre filesystem, and our own developments within this software to meet our specific needs, to give users direct access to the data, and to achieve high availability in the services we provide.
We are currently providing 590 TB of online data and a potential capacity of 560 TB in offline data tapes, available to several communities whose users in some cases overlap. For example, some of our ATLAS Tier-2 users are also members of our Tier-3, and we need to apply group and user quotas to the different storage spaces. The National Grid Initiative and GRID-CSIC communities also share members and applications, in some cases parallel computational jobs that likewise need direct access to data. Authorization for these data is enforced at the filesystem level using ACLs; for higher middleware layers to respect it, we had to develop custom plugins, for example for StoRM. Lustre is mounted on the worker nodes for direct access through POSIX file calls, and local users have direct read-only access from the UI. We also provide X.509-secured web access for convenience. We are analyzing the data access patterns of newer applications outside the HEP scope, as well as tuning parameters such as read-ahead for different types of workloads. As the volume of served data grows, we are scaling the number of Lustre OSTs and distributing the grid middleware services among several machines.
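As an illustration of the filesystem-level authorization and direct POSIX access described above, the following is a minimal admin-command sketch. The mount point, server name, and group names are hypothetical, not our actual layout, and exact options may differ by Lustre version:

```shell
# Hypothetical MDS address and mount point, for illustration only.
# Mount Lustre on a worker node so jobs access data with plain POSIX calls:
mount -t lustre mds01@tcp:/lustre /lustre

# Enforce per-group authorization on a space with POSIX ACLs
# (group names 'atlas-t3' and 'localusers' are assumptions):
setfacl -R -m g:atlas-t3:rwx /lustre/atlas/t3     # Tier-3 members: read/write
setfacl -R -m g:localusers:r-x /lustre/atlas/t3   # local UI users: read-only

# Higher middleware layers (e.g. an ACL-aware StoRM plugin) consult the
# same ACLs, keeping authorization consistent at every level:
getfacl /lustre/atlas/t3
```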
Recent tests such as STEP09 and the UAT successfully exercised our ATLAS Tier-2 infrastructure in conjunction with other LHC experiments. Lustre as the backend cluster filesystem has proven to deliver excellent performance, aided by the configuration that allows jobs on the WNs to access data on the servers directly. These tests also revealed some problems, mainly related to an underestimation of the amount of stored data, which are easily being solved by adding more storage servers thanks to the scalability of the services. The inclusion of several new e-science VOs in the context of the NGI and GRID-CSIC projects has reinforced the need for group and user quotas, and we have adopted the new pool paradigm of Lustre 1.8 to group several OSTs, which has proven to work well. Our new StoRM plugin, which applies local authorization policies based on the ACLs of the different spaces, was also adopted into the main source code, making it available to other grid sites and serving as a proof of concept. Grid services are also improving, and the StoRM team is aware of the need to support group and user quotas at the SRM level.
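The Lustre 1.8 pool paradigm mentioned above can be sketched with the standard pool commands. The filesystem name, pool name, OST indices, and quota values below are illustrative assumptions, and quota option syntax varies between Lustre versions:

```shell
# Illustrative names only: filesystem 'lfs01', pool 'ngi'.
# Create a pool and assign a range of OSTs to it:
lctl pool_new lfs01.ngi
lctl pool_add lfs01.ngi lfs01-OST[0004-0007]

# Direct a VO's space to stripe only over that pool:
lfs setstripe -p ngi /lustre/ngi

# Group quotas complement the pools (limit value is an example;
# check the syntax for your Lustre release):
lfs setquota -g ngi-users -B 10T /lustre
```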
The presented approach has proven to work well and meets the quality levels required for the operation of the different supported virtual organizations. To improve further, we are analyzing the use of Lustre capabilities to access the geographically distributed data stored at the three sites of our distributed Tier-2. We also plan to support a third-level data tape storage tier, using the future Lustre/CEA HSM developments once they are mature, and we intend to continue supporting our developments and making them available to the community.
Grid, Lustre, Data Access, SRM, Middleware development
Location: Uppsala University
Room: Room X
- Javier SANCHEZ MARTINEZ Instituto de Fisica Corpuscular (IFIC) UV-CSIC
- Alejandro LAMAS DAVIÑA Instituto de Fisica Corpuscular (IFIC) UV-CSIC
- Gabriel AMOROS VICENTE Instituto de Fisica Corpuscular (IFIC) UV-CSIC
- Javier NADAL DURA Instituto de Fisica Corpuscular (IFIC) UV-CSIC