12-16 April 2010
Uppsala University
Europe/Stockholm timezone

Grid-based International Network for Flu Observation

Apr 13, 2010, 3:00 PM
20m
Room X (Uppsala University)

Oral

Scientific results obtained using distributed computing technologies

Bioinformatics

Speaker

Ms Ana Lucia DA COSTA (HealthGrid)

Description

Since the recent H1N1 outbreak, there has been a worldwide effort to isolate and sequence flu virus genomes. Specimens that test positive are sequenced and the sequences are deposited in influenza databases. The present EUAsiaGrid application, g-INFO (Grid-based International Network for Flu Observation), demonstrates the integration of existing data sources into a global surveillance network for molecular epidemiology, based on Service Oriented Architecture and Grid technologies. Its relevance is being tested on the current H1N1 outbreak.

Conclusions and Future Work

Future developments will bring additional influenza databases into the network. The data processing will be adapted continuously to the requirements of virologists and epidemiologists. The final goal is to have the grid-based surveillance network ready to make an impact in future pandemics.

Detailed analysis

The current prototype uses the NCBI (National Center for Biotechnology Information) database. Every day the NCBI FTP server is updated with new sequences of H1N1 segments in 7 files: nucleotide, protein and coding-region sequences and the corresponding metadata. A grid database (AMGA) is populated with these data through automatic synchronisation. The pipeline starts by preparing the sequences in the correct format, then performs a multiple alignment with MUSCLE, followed by curation with Gblocks to identify conserved blocks. From this step, a phylogenetic analysis is performed to obtain a branching diagram. Based on virologists' requirements, sequences selected from this diagram undergo further analysis to identify key features related to pathogenicity, such as the protease cleavage site, the glycosylation sites, the epitopes or the binding site.
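As an illustration of the curation step, the idea of keeping only conserved blocks of an alignment can be sketched as follows. This is a deliberately simplified stand-in, not the Gblocks algorithm and not the g-INFO implementation: the function name `conserved_blocks`, the thresholds `min_identity` and `min_length`, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter


def conserved_blocks(aligned, min_identity=1.0, min_length=2):
    """Return half-open (start, end) column ranges of conserved blocks.

    A column counts as conserved when it contains no gap ('-') and its
    most frequent residue covers at least `min_identity` of the sequences.
    Hypothetical simplification of what a curation tool such as Gblocks does.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    blocks = []
    start = None  # first column of the block currently being extended
    for col in range(length):
        column = [seq[col] for seq in aligned]
        _, count = Counter(column).most_common(1)[0]
        conserved = "-" not in column and count / len(column) >= min_identity
        if conserved:
            if start is None:
                start = col
        else:
            # A non-conserved column closes the current block, if any.
            if start is not None and col - start >= min_length:
                blocks.append((start, col))
            start = None
    # Close a block that runs to the end of the alignment.
    if start is not None and length - start >= min_length:
        blocks.append((start, length))
    return blocks


# Toy alignment (invented sequences, not real H1N1 data).
toy_alignment = ["MKAIL-VLL", "MKAILAVLL", "MKTIL-VLL"]
strict_blocks = conserved_blocks(toy_alignment)
relaxed_blocks = conserved_blocks(toy_alignment, min_identity=0.6)
```

With the strict default, the mismatching column and the gapped column each split the alignment; relaxing `min_identity` merges blocks across the mismatch but still breaks at the gap.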

Impact

Results are made available to the research community on the project website, http://g-info.healthgrid.org/, providing a real identity card of the virus strains concerned. Thanks to the molecular specificities highlighted (protease cleavage site, glycosylation sites, epitopes and binding site), experts promptly have the information they need to make the most appropriate decisions regarding the transmission and geographical expansion of the epidemic.
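To give a concrete sense of one such molecular specificity, potential N-linked glycosylation sites are commonly flagged by scanning a protein sequence for the consensus motif N-X-S/T, where X is any residue except proline. The sketch below shows that scan only; it is not the g-INFO code, the function name `find_n_glycosylation_sequons` is hypothetical, and the example peptide is invented. A motif match marks a candidate site, not a confirmed one.

```python
import re

# Consensus sequon N-X-[S/T] with X != P. The lookahead lets
# overlapping candidates (e.g. "NNSS") both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sequons(protein):
    """Return 1-based positions of asparagines that start an N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Invented peptide, not a real haemagglutinin fragment.
candidate_sites = find_n_glycosylation_sequons("MKNGTNPSANNSS")
```

In this toy peptide the asparagine followed by proline is skipped, while the two adjacent asparagines near the end are both reported because each one starts its own motif.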

Keywords flu, surveillance network, grid, Service Oriented Architecture, epidemiology
URL for further information http://g-info.healthgrid.org/

Primary authors

Ms Ana Lucia DA COSTA (HealthGrid) Mr Tung DOAN TRUNG (IFI, Vietnam)

Co-authors

Mr Aurélien BERNARD (CNRS-IN2P3, France) Dr Quang Nguyen Hong (IFI, Vietnam) Dr Thanh-hoa LE (Institute of Biotechnology, Vietnam) Mr Vincent BRETON (CNRS-IN2P3, France) Mr Yannick LEGRE (HealthGrid, France)
