27 September 2004 to 1 October 2004
Interlaken, Switzerland
Europe/Zurich timezone

Gfarm v2: A Grid file system that supports high-performance distributed and parallel data computing

27 Sep 2004, 17:10
20m
Harder (Interlaken, Switzerland)

Oral presentation, Track 6 - Computer Fabrics

Speaker

O. Tatebe (Grid Technology Research Center, AIST)

Description

Gfarm v2 provides a Grid file system designed to facilitate reliable file sharing and high-performance distributed and parallel data computing in a Grid across administrative domains. A Grid file system is a virtual file system that federates multiple file systems; files and data are shared by mounting the virtual file system. This paper discusses the design and implementation of a secure, robust, scalable, and high-performance Grid file system.

The most time-consuming, and also the most typical, task in data computing for fields such as high energy physics, astronomy, space exploration, and human genome analysis is to process a set of files in the same way. Such processing can typically be performed on every file independently and in parallel, or at least has good locality. Gfarm v2 supports high-performance distributed and parallel computing for this kind of workload by introducing the "Gfarm file", a new "file-affinity" process scheduling scheme based on file locations, and new parallel file access semantics. An arbitrary group of files, possibly dispersed across administrative domains, can be managed as a single Gfarm file. Each member file is accessed in parallel through a new file view, called the "local file view", by a parallel process that may be allocated by file-affinity scheduling based on the replica locations of the member files. File-affinity scheduling and the new file view enable the "owner computes" strategy, or "move the computation to data" approach, for parallel and distributed data computing on the member files of a Gfarm file in a single system image.
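The idea of scheduling by file affinity can be illustrated with a small sketch. This is not the Gfarm scheduler or its API: the function names, the host names, and the least-loaded tie-breaking rule are all assumptions made for illustration. It only shows the general principle of placing each process on a host that already holds a replica of the member file it will read.

```python
from collections import defaultdict


def schedule_by_file_affinity(replicas):
    """Assign each member file to one host that holds a replica of it.

    `replicas` maps a member file name to the list of hosts holding a
    replica. Returns a dict mapping member file name -> chosen host.
    Hypothetical helper, not part of Gfarm.
    """
    load = defaultdict(int)  # number of files already assigned to each host
    assignment = {}
    # Place files with the fewest replicas first: they have the least choice.
    for name in sorted(replicas, key=lambda n: (len(replicas[n]), n)):
        hosts = replicas[name]
        if not hosts:
            raise ValueError(f"no replica available for {name}")
        # Among hosts holding a replica, pick the least loaded one
        # (ties broken by host name so the result is deterministic).
        host = min(hosts, key=lambda h: (load[h], h))
        assignment[name] = host
        load[host] += 1
    return assignment


def local_file_view(assignment, host):
    """Return the member files that the process on `host` would work on."""
    return sorted(name for name, h in assignment.items() if h == host)


# Hypothetical replica catalogue for a three-member file group.
replicas = {
    "run.0": ["hostA", "hostB"],
    "run.1": ["hostB"],
    "run.2": ["hostA", "hostC"],
}
assignment = schedule_by_file_affinity(replicas)
for name in sorted(assignment):
    print(name, "->", assignment[name])
print(local_file_view(assignment, "hostA"))  # ['run.0']
```

In this sketch every process reads only data stored on its own host, which is the "owner computes" pattern in miniature. A real scheduler would also have to account for host load, availability, and access rights across administrative domains.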

Primary authors

N. Soda (SRA)
O. Tatebe (Grid Technology Research Center, AIST)
S. Matsuoka (Tokyo Institute of Technology/National Institute of Informatics)
S. Sekiguchi (Grid Technology Research Center, AIST)
Y. Morita (KEK)
