Speaker
Dr
Ignacio Blanquer
(UPV)
Description
Running BLAST on metagenomes against annotation databases, such as the
Non-Redundant (nr) database from the NCBI, yields many interesting results,
such as inconsistencies in the annotation of entries in the reference
databases [1] or the definition of more precise phylogenetic trees.
Within the framework of EGEE, the UPV, in collaboration with the Institute
Cavanilles for Biodiversity, has developed a framework for splitting the
input, distributing the data, and executing and monitoring BLAST runs on
metagenomes.
The framework supports replicating reference databases, compiling the BLAST
executable on the fly and automatically resubmitting jobs.
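The splitting step can be illustrated with a minimal sketch. It assumes the metagenome is stored as FASTA text and that each block holds a fixed number of sequences; the function name `split_fasta` and the parameter `seqs_per_chunk` are hypothetical and are not taken from the framework itself.

```python
from typing import Iterable, Iterator, List


def split_fasta(lines: Iterable[str], seqs_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of FASTA lines, each holding at most seqs_per_chunk records.

    Hypothetical helper: the real framework may split by size or other criteria.
    """
    if seqs_per_chunk < 1:
        raise ValueError("seqs_per_chunk must be at least 1")
    chunk: List[str] = []
    count = 0  # number of records in the current chunk
    for line in lines:
        if line.startswith(">"):
            # A new record starts; close the chunk first if it is already full.
            if count == seqs_per_chunk:
                yield chunk
                chunk, count = [], 0
            count += 1
        chunk.append(line)
    if chunk:
        yield chunk


if __name__ == "__main__":
    demo = [">read1", "ACGT", ">read2", "GGCC", "TTAA", ">read3", "A"]
    for i, block in enumerate(split_fasta(demo, 2)):
        print(i, block)
```

Each block could then be submitted as an independent BLAST job against a replica of the reference database.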
The large size of the data blocks to be retrieved and analyzed (on the order
of 2 GB for the reference databases) and the need to download, compile and
locally install software are a demanding test for many resources, which may
fail while executing parts of the experiment. Since every part must finish
for the experiment to be complete, the resubmission engine has been
intensively tested.
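The idea of resubmitting until every part has finished can be sketched as follows. This is an illustration under stated assumptions, not the framework's resubmission engine: `submit` is a hypothetical callable that runs one block and returns True on success, and `max_attempts` is an assumed limit that the abstract does not mention.

```python
from typing import Callable, Dict, Iterable, List


def run_with_resubmission(
    chunk_ids: Iterable[int],
    submit: Callable[[int], bool],
    max_attempts: int = 3,
) -> Dict[int, int]:
    """Run every chunk, resubmitting failures; return attempts used per chunk.

    Hypothetical sketch: `submit` and `max_attempts` are assumptions.
    """
    pending: List[int] = list(chunk_ids)
    attempts: Dict[int, int] = {cid: 0 for cid in pending}
    while pending:
        retry: List[int] = []
        for cid in pending:
            attempts[cid] += 1
            try:
                ok = submit(cid)
            except Exception:
                # Treat any error on the remote resource as a failed attempt.
                ok = False
            if ok:
                continue
            if attempts[cid] >= max_attempts:
                raise RuntimeError(f"chunk {cid} failed {max_attempts} times")
            retry.append(cid)
        pending = retry  # only the failed chunks are resubmitted
    return attempts


if __name__ == "__main__":
    failed_once = set()

    def flaky_submit(cid: int) -> bool:
        # Simulated resource: chunk 1 fails on its first attempt only.
        if cid == 1 and cid not in failed_once:
            failed_once.add(cid)
            return False
        return True

    print(run_with_resubmission([0, 1, 2], flaky_submit))
```

The loop only ends normally once every block has succeeded, which mirrors the requirement that all parts of the experiment complete.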
To date, the system has run experiments consuming more than 5 CPU years on
metagenomes from gut, virus, soil and oceanic samples.
[1] Miguel Pignatelli, Gabriel Aparicio, Ignacio Blanquer, Vicente
Hernández, Andrés Moya and Javier Tamames, "Metagenomics Reveals our
Incomplete Knowledge of Global Diversity", Bioinformatics, ISSN 1367-4803,
Oxford University Press 2008.
Authors
Dr
Ignacio Blanquer
(UPV)
J. Tamames
(Instituto Cavanilles de la Biodiversidad - Universidad de Valencia)
M. Pignatelli
(Instituto Cavanilles de la Biodiversidad - Universidad de Valencia)
Vicente Hernandez-Garcia
(Polytechnical University of Valencia)