Speaker
Report on the experience (or the proposed activity). It would be very important to mention key services which are essential for the success of your activity on the EGEE infrastructure.
We have tried several ways of porting our applications to the Grid infrastructure, and have gained a good body of experience in the process.
- As a first approach, we tried to send independent jobs with our executables to the Grid to observe the results.
- Next, we tried partitioning the data sets, which provided the basis for subsequent work.
- Thanks to the ARDA group at CERN, we used a script integrated in the excellent DIANE framework, but in the end we ran into problems related to the operational structure of our programs, among other issues.
- Since we already have MPI-optimized versions of MLalign2D and MLrefine3D, we tried to run them on MPI clusters within EGEE. However, this approach encountered many practical problems related to MPI adoption in EGEE.
- Currently, we are developing our own job-management solution, with satisfactory results so far. We hope to improve its performance in the near future.
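The data-set partitioning step mentioned above can be illustrated with a minimal sketch. It assumes that each image can be assigned to exactly one independent job and that contiguous, near-equal chunks are acceptable; the function name `partition_images` and the image identifiers are hypothetical and are not part of Xmipp, DIANE, or any EGEE tool.

```python
from typing import List, Sequence


def partition_images(image_ids: Sequence[str], n_jobs: int) -> List[List[str]]:
    """Split image_ids into n_jobs contiguous chunks whose sizes differ by at most one.

    Hypothetical helper for illustration only; not part of any real package.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    # base images per job, with the first `extra` jobs taking one more each
    base, extra = divmod(len(image_ids), n_jobs)
    chunks: List[List[str]] = []
    start = 0
    for job in range(n_jobs):
        size = base + (1 if job < extra else 0)
        chunks.append(list(image_ids[start:start + size]))
        start += size
    return chunks


if __name__ == "__main__":
    # Made-up image identifiers standing in for experimental images
    images = [f"img{i:05d}" for i in range(10)]
    for job_index, chunk in enumerate(partition_images(images, 3)):
        print(job_index, len(chunk), chunk[0], chunk[-1])
```

Each chunk could then be shipped with the executable as one independent job. Whether equal-sized chunks are the right choice depends on how evenly the work is distributed across images, which this sketch does not address.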
Describe the added value of the Grid for the scientific/technical activity you (plan to) do on the Grid. This should include the scale of the activity and of the potential user community and the relevance for other scientific or business applications
There are various steps in the 3D-EM refinement process that may benefit from Grid computing. To start with, large numbers of experimental images need to be averaged. Nowadays, typically tens of thousands of images are used, while future studies may routinely employ millions of images. Our group has been developing Xmipp, a package for single-particle 3D-EM image processing.
Each EM image can be regarded as a projection of the specimen's 3D structure from an unknown projection direction. A key task is therefore to determine the projection direction a posteriori. Furthermore, in many cases the sample contains a mixture of different conformations of the same macromolecule, so a “structural class sorting” has to be carried out at the same time as the orientation search. Probably the most advanced method is the one described in REF, referred to as ML2d/ML3d, which is included in the Xmipp package. Typical runs take several single-CPU months or years, making it a good target for parallelization over t
Describe the scientific/technical community and the scientific/technical activity using (planning to use) the EGEE infrastructure. A high-level description is needed (neither a detailed specialist report nor a list of references).
Electron microscopy (EM) is a crucial technique that allows Structural Biology researchers to characterize macromolecular assemblies in distinct functional states. Image processing in three-dimensional EM (3D-EM) is used by a flourishing community (exemplified by the EU-funded 3D-EM NoE) and is characterized by voluminous data and large computing requirements, making this a problem well suited to Grid computing and the EGEE infrastructure.