12–16 Apr 2010
Uppsala University
Europe/Stockholm timezone

The Nordic BioGrid project – Bioinformatics for the grid

13 Apr 2010, 12:00
15m
Room X (Uppsala University)

Oral
Experiences from application porting and deployment
Bioinformatics

Speakers

Prof. Bengt Persson (IFM Bioinformatics, Linköping University, S-581 83 Linköping, Sweden; Dept of Cell and Molecular Biology, Karolinska Institutet, S-171 77 Stockholm, Sweden; National Supercomputer Centre (NSC), S-581 83 Linköping, Sweden)
Joel Hedlund (IFM Bioinformatics, Linköping University, S-581 83 Linköping, Sweden)

Description

The life sciences have undergone an immense transformation in recent years: advances in genomics, proteomics and other high-throughput techniques produce floods of raw data that need to be stored, analysed and interpreted in various ways. Bioinformatics is crucial because it provides the tools to exploit these gold mines of data efficiently, in order to better understand the roles of proteins and genes and to spark ideas for new experiments.

Impact

The frequently used databases UniProtKB and UniRef have been made available on the distributed and cached storage system within the Nordic grid. A system for database updating has been deployed in a virtual machine hosted by NDGF. Updates of the PairsDB database have been run on BioGrid and M-grid resources. Further applications, including molecular dynamics and phylogeny calculations, are in the pipeline to be gridified.

Detailed analysis

BioGrid is an effort to establish a Nordic grid infrastructure for bioinformatics, supported by NDGF (Nordic DataGrid Facility). BioGrid aims both to gridify computationally heavy tasks and to coordinate bioinformatics infrastructure efforts in order to use the Nordic resources more efficiently. So far, the widely used bioinformatics software packages BLAST and HMMer have been gridified, and the multiple sequence alignment programs ClustalW, MAFFT and MUSCLE have been made available on the grid.
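
As an illustration only, one common way to spread a sequence search such as BLAST over grid resources is to split the query set into chunks that can be submitted as independent jobs. The sketch below shows that splitting step for FASTA input. It is an assumption about the general approach, not a description of how BioGrid implements it; the function names `read_fasta_records` and `split_into_chunks` and the chunk size are hypothetical.

```python
from typing import Iterable, Iterator, List


def read_fasta_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield each FASTA record (header line plus sequence lines) as one string."""
    record: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">") and record:
            # A new header closes the record collected so far.
            yield "\n".join(record)
            record = []
        record.append(line)
    if record:
        yield "\n".join(record)


def split_into_chunks(records: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Group records into lists of at most chunk_size, one list per job."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[str] = []
    for record in records:
        chunk.append(record)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # last, possibly smaller, chunk


if __name__ == "__main__":
    # Hypothetical toy input; real query sets would be read from a file.
    example = ">seq1\nMKV\n>seq2\nGAT\nTTC\n>seq3\nPLL\n"
    records = read_fasta_records(example.splitlines())
    for job_id, chunk in enumerate(split_into_chunks(records, chunk_size=2)):
        print(f"job {job_id}: {len(chunk)} sequence(s)")
```

Each chunk could then be written to its own file and searched separately, with the per-chunk results merged afterwards. How jobs are submitted and results collected depends on the grid middleware and is not covered here.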

Conclusions and Future Work

BioGrid has already contributed computational power to the analysis of the medium-chain dehydrogenase/reductase (MDR) superfamily. The size and complexity of this superfamily have recently been shown to far surpass the means of subclassification traditionally employed for this task. Instead, more computationally demanding methods must be used, such as profile hidden Markov models, as implemented in the HMMer package.

URL for further information: http://wiki.ndgf.org/display/ndgfwiki/BioGrid
Keywords: bioinformatics, grid, hidden Markov models, large-scale analyses, distributed storage

Primary authors

Dr Ann-Charlotte Berglund Sonnhammer (Linnaeus Center for Bioinformatics (LCB), Uppsala University, S-751 05 Uppsala, Sweden)
Prof. Bengt Persson (IFM Bioinformatics, Linköping University, S-581 83 Linköping, Sweden; Dept of Cell and Molecular Biology, Karolinska Institutet, S-171 77 Stockholm, Sweden; National Supercomputer Centre (NSC), S-581 83 Linköping, Sweden)
Prof. Erik Sonnhammer (Stockholm Bioinformatics Center (SBC), Stockholm University, S-106 91 Stockholm, Sweden)
Prof. Inge Jonassen (CBU, Bergen Centre for Computational Science, N-5020 Bergen, Norway)
Joel Hedlund (IFM Bioinformatics, Linköping University, S-581 83 Linköping, Sweden)
Dr Josva Kleist (Nordic Data Grid Facility, Kastruplundgade 22, DK-2770 Kastrup, Denmark)
Dr Kimmo Mattila (CSC – IT Center for Science Ltd, P.O. Box 405, FI-02101 Espoo, Finland)
Dr Michael Grønager (Nordic Data Grid Facility, Kastruplundgade 22, DK-2770 Kastrup, Denmark)
Olli Tourunen (Nordic Data Grid Facility)
Dr Steffen Möller (Institut für Neuro- und Bioinformatik, University of Lübeck, D-23538 Lübeck, Germany)
