Speaker
Description
Keywords
Computational Grid, EGEE, Bioinformatics, Genomics, Proteomics, Federated Database, Grid-DBMS
Detailed analysis
In the LIBI Grid Problem Solving Environment, a set of services for resource and data management has been developed. The system allows millions of jobs to be submitted to the Grid, supports their monitoring and the retrieval of their results, and guarantees interoperability among gLite, Unicore and Globus. Several biological applications, developed by the project partners or available from the research community, have been analyzed and re-engineered for execution in a distributed environment, using both general and ad hoc solutions. Several success stories will be presented as LIBI case studies. They address important issues in bioinformatics, among them large-scale proteome comparison, genome-wide identification of coding and non-coding conserved sequence tags in the human and mouse genomes (using an application developed by the project partners), identification of regulatory elements in mRNA untranslated regions, and tertiary structure prediction.
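As an illustration of how a large-scale comparison can be broken into many independent Grid jobs, the following minimal Python sketch partitions an all-against-all sequence comparison into block pairs. It is not the LIBI implementation: the names ComparisonJob, split_into_blocks and plan_all_vs_all, as well as the block size, are hypothetical and chosen only for this example. Each resulting job could then be described and submitted through whichever middleware is in use.

```python
from itertools import combinations_with_replacement
from typing import Iterator, List, NamedTuple


class ComparisonJob(NamedTuple):
    """One independent unit of work (hypothetical structure)."""
    job_id: int
    query_block: range   # indices of the query sequences
    target_block: range  # indices of the target sequences


def split_into_blocks(n_sequences: int, block_size: int) -> List[range]:
    """Cut the index range [0, n_sequences) into consecutive blocks."""
    return [range(start, min(start + block_size, n_sequences))
            for start in range(0, n_sequences, block_size)]


def plan_all_vs_all(n_sequences: int, block_size: int) -> Iterator[ComparisonJob]:
    """Yield one job per unordered pair of blocks, including a block with itself.

    Every unordered pair of sequences then falls into exactly one job,
    so the jobs can run independently and in any order.
    """
    blocks = split_into_blocks(n_sequences, block_size)
    pairs = combinations_with_replacement(blocks, 2)
    for job_id, (query, target) in enumerate(pairs):
        yield ComparisonJob(job_id, query, target)


# Toy example: 10 sequences in blocks of 4 give 3 blocks and 6 jobs.
jobs = list(plan_all_vs_all(n_sequences=10, block_size=4))
for job in jobs:
    print(job.job_id, list(job.query_block), list(job.target_block))
```

The block size trades the number of jobs against the run time of each job; a suitable value depends on the application and on the scheduling overhead of the target infrastructure.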
URL for further information
http://www.libi.it/biotools
Conclusions and Future Work
The EGEE Grid infrastructure and the federated database have made it possible to solve bioinformatics problems that would otherwise be intractable. Future work will pursue three distinct objectives:
1) enlarge the portfolio of bioinformatics applications ported to the Grid;
2) expand the federated database with the inclusion of new data sources;
3) make user interaction with the Grid portal even easier by hiding the complexity introduced by the use of the Grid.
Impact
A user, through the LIBI Grid portal, can access a variety of biological applications, which have enabled large-scale simulation of complex bioinformatics experiments. A resource management solution developed within the project supports the submission and monitoring of batch, MPI, parameter sweep and workflow jobs on gLite, Unicore and Globus. The EGEE infrastructure has been used successfully in many bioinformatics experiments, through both general and ad hoc solutions. For data management, a solution has been provided to federate and access several biological data banks on the Grid.