11–14 Feb 2008
Le Polydôme, Clermont-Ferrand, FRANCE
Europe/Zurich timezone

MPI Support on the Grid

13 Feb 2008, 16:30
30m
Champagne (Le Polydôme, Clermont-Ferrand, FRANCE)

Oral · Existing or Prospective Grid Services · Workflow and Parallelism

Speaker

Mr Kiril Dichev (High Performance Computing Center Stuttgart)

Description

MPI-Start was developed for the Interactive European Grid project in order to improve MPI support for its infrastructure. MPI-Start supports different MPI implementations (currently Open MPI, MPICH, MPICH2, LAM-MPI) as well as different batch systems (currently PBS, SGE, LSF). In addition, support for MPI tools such as Marmot is already integrated into MPI-Start. PACX-MPI supports any implementation of the MPI 1.2 standard and makes it possible to run one large MPI application seamlessly across heterogeneous clusters or supercomputers. Marmot performs runtime correctness checks on MPI applications, such as verifying the correct use of data types and detecting deadlocks.
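
To make the kind of error such a runtime check catches concrete, the following minimal C sketch (illustrative only, not taken from the contribution) deadlocks because both ranks post a blocking receive before the matching send; this is exactly the class of problem a checker like Marmot can report instead of letting the job hang silently.

    /* Illustrative sketch only: a classic MPI deadlock of the kind a runtime
     * correctness checker such as Marmot can report. Both ranks call a
     * blocking MPI_Recv before their MPI_Send, so neither receive can ever
     * be satisfied. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, peer, value, received;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        value = rank;
        peer = 1 - rank;   /* assumes the job runs with exactly two ranks */

        /* Deadlock: both processes block in MPI_Recv waiting for each other. */
        MPI_Recv(&received, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &status);
        MPI_Send(&value, 1, MPI_INT, peer, 0, MPI_COMM_WORLD);

        printf("rank %d received %d\n", rank, received);
        MPI_Finalize();
        return 0;
    }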

Provide a set of generic keywords that define your contribution (e.g. Data Management, Workflows, High Energy Physics)

MPI, inter-cluster, scheduler, workload management system, runtime checks

4. Conclusions / Future plans

MPI-Start will be used to integrate further MPI-oriented tools, such as tools for performance measurement or debugging, into the Grid. Open MPI is being actively developed. Work on Marmot currently focuses on better support for graphical viewers and on bug fixes.

1. Short overview

MPI-Start is a layer of scripts that supports the workload management system in running MPI applications on different clusters with different configurations. Open MPI is an open-source implementation of the MPI 2 standard. PACX-MPI is a library that supports inter-cluster MPI applications. Marmot is a correctness checker for MPI applications.
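
As a point of reference only (this sketch is not part of the contribution), the program below is the kind of application MPI-Start is designed to launch: it uses nothing beyond standard MPI calls, so the same source builds and runs unchanged under Open MPI, MPICH, MPICH2 or LAM-MPI, while MPI-Start hides the site-specific details of starting it under the local batch system.

    /* Minimal standard-MPI program, shown only as an illustration of an
     * application that any of the MPI implementations supported by MPI-Start
     * can build and run without source changes. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }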

URL for further information:

http://www.open-mpi.org/
http://www.hlrs.de/organization/amt/projects/

3. Impact

MPI-Start greatly improves MPI support on the Grid. Previous solutions required the workload management system to use a hard-coded approach, which was inflexible and required a complete test and validation of the middleware whenever the MPI or scheduler configuration of a site changed. Currently, MPI-Start is successfully integrated into the EGEE middleware.
Support for different MPI tools can also be integrated into MPI-Start, which spares the user from sending additional instructions along with every job.
Open MPI is a modern MPI 2 implementation with a component-based design and many features.
PACX-MPI is best used for running large-scale MPI applications that do not fit on a single cluster.
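
The following sketch (illustrative only, and assuming a standard PACX-MPI setup) shows why the cross-cluster case needs no source changes: the application sees a single MPI_COMM_WORLD, so the first and last rank can exchange a message with plain MPI 1.2 calls even when, under PACX-MPI, those two processes run on different clusters.

    /* Illustrative sketch: a plain MPI 1.2 exchange between the first and
     * last rank of MPI_COMM_WORLD. Under PACX-MPI these two processes may
     * live on different clusters, but the application code does not change. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, token = 0;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0 && size > 1) {
            token = 123;
            MPI_Send(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD);
        } else if (rank == size - 1 && size > 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("last rank %d received %d from rank 0\n", rank, token);
        }

        MPI_Finalize();
        return 0;
    }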

Author

Mr Kiril Dichev (High Performance Computing Center Stuttgart)

Co-author

Mr Rainer Keller (High Performance Computing Center Stuttgart)
