21-25 May 2012
New York City, NY, USA
US/Eastern timezone

Supporting Shared Resource Usage for a Diverse User Community: the OSG experience and lessons learned

22 May 2012, 13:30
4h 45m
Rosenthal Pavilion (10th floor) (Kimmel Center)


Poster
Track: Distributed Processing and Analysis on Grids and Clouds (track 3)
Session: Poster Session




The Open Science Grid (OSG) helps a diverse community of new and existing users adopt and make effective use of the Distributed High Throughput Computing (DHTC) model. The LHC user community has deep local support within the experiments. For smaller communities and individual users, the OSG provides a suite of consulting and technical services through the User Support organization. We describe our experiences with these services, some successful and some less so, and analyze the lessons learned that are helping us improve them. The services offered include forums to enable shared learning and mutual support, tutorials and documentation for new technology, and troubleshooting of problematic or systemic failure modes. For new communities and users, we bootstrap their use of the DHTC technologies and resources available on the OSG by following a phased approach. We first adapt the application and run a small production campaign on a subset of "friendly" sites. Only then do we move the user to full production campaigns across the many remote sites on the OSG, where they face new hindrances, including non-deterministic job completion times, diverse errors caused by the heterogeneity of site configurations and environments, and the lack of direct login access for troubleshooting application crashes. We cover recent experiences with image simulation for the Large Synoptic Survey Telescope (LSST), small-file, large-volume data movement for the Dark Energy Survey (DES), civil engineering simulation with the Network for Earthquake Engineering Simulation (NEES), and accelerator modeling with the Electron Ion Collider group at BNL. We categorize and analyze the use cases and describe how our processes are evolving based on lessons learned.

Primary authors

Mr Chander Sehgal (Fermi National Accelerator Laboratory)
Dr Gabriele Garzoglio (Fermi National Accelerator Laboratory)
Dr Marko Slyz (Fermi National Accelerator Laboratory)
Mats Rynge (Information Sciences Institute (ISI))
Mrs Tanya Levshina (Fermi National Accelerator Laboratory)
