25–29 Sept 2006
CICG
Europe/Zurich timezone

MediGRID - Medical Grid Computing

26 Sept 2006, 14:00
5h 30m
CICG

CICG

CICG, 17 rue de Varembé, CH - 1211 Geneva 20 Switzerland
Board: 9
Poster Users & Applications Poster session

Speaker

Mr Yassene Mohammed (MediGRID)

Description

MediGRID - Medical Grid Computing Sax U1, Viezens F1, Mohammed Y1, Lingner Th2, Morgenstern B2, Vossberg M3, Krefting, D3, Rienhoff O1 1Department of Medical Informatics, University of Göttingen, Germany 2Department of Bioinformatics, University of Göttingen, Germany 3Department of Medical Informatics, Charité, Humboldt University Berlin, Germany fred.viezens@med.uni-goettingen.de Introduction The project MediGRID [1] combines research institutes from various areas of Medicine, Biomedical Informatics, and other Life Sciences. Numerous associated partners from industry, healthcare and scientific institutions ensure a broad representation of this large community. The main goal of MediGRID is the development of a grid-middleware-platform with Globus Toolkit 4 as a basis for eScience Services for the community and to help researchers to use these services. Here we present the first results of the projects consisting of the infrastructure, first community applications and basic privacy rules. Materials and Methods Concerning the data flow in grids, most projects are similar in the lower layers (s. Fig. 1), but the biomedical community has to face particular challenges in the upper layers. The top layer represents the heterogeneous biomedical data sources. Beyond the problem to find the relevant data sets via metadata description, access control to the data is of paramount importance, as the owner of the data are foremost the patients. Due to the heterogeneity of the data we need an additional ontology layer to homogenize the data. Given semantic interoperability the researcher can correlate and analyze the data with biomedical informatics methods. Biomedical data in medical grids are heterogeneous, contain different kinds of information and have different levels of privacy. The data might include information about [2]: • Population: Epidemiology • Diseases: Clinical practice, clinical trials • Patient data: Health record, clinical history, physical exams, lab/imaging studies • Organ/tissue: pathology • Cellular: histology • Molecular: genetic test results and genomic data. Having these data online with the suitable tools to connect, combine and analyze creates new challenges for data protection and privacy. Results The current privacy concepts do not cover the aspects and abilities of grid computing. Especially the re-identification risk with the combination of different data types has to be assessed. There are severe privacy concerns related to genomic-wide association studies [3-5]. These are the main reasons for the development of an enhanced security concept for MediGRID. Four methodological modules are responsible to construct the suitable infrastructure: ontology, resource fusion, middleware and eScience. On the other hand, three research modules take the initiative to use this national grid infrastructure to assist their work: biomedical Informatics, image processing and clinical research. The MediGRID consortium developed a middleware component as a connector for the medical community to resources of the integration project in D-Grid [6]. The community modules use this middleware to “gridify” their applications in order to show the advantage of grid computing in medicine. Ontology Using OGSA-DAI as a standard of Data Access and Integration in grids, the ontology module has successfully developed an ontology tool and implemented it as a first step to be a Gridsphere-Portlet in the MediGRID portal being available for all project partners. Bioinformatics Dialign is a widely used software tool for multiple alignments of nucleic acid and protein sequences. Within the MediGRID portal a parallelized version of the software is used to speed up the computationally expensive procedure. In that way distributed computing allows the user to obtain high-quality alignments of bigger databases and longer sequences. AUGUSTUS is a program that predicts gene structures in eukaryotic genomic sequences with high accuracy. Since AUGUSTUS performs a successive analysis of overlapping sequence sections, it is easy to parallelize. Therefore users benefit from distributed computing with several instances of the program. Grid resources also allow frequent update and centralized storage of huge EST databases. Image Processing A 3D image reconstruction for prostate biopsy will register the different ultrasonic scans helping the physician to have a new viewing of the prostate that was not possible using the traditional 2D methods. The virtual vascular surgery helps to calculate and present the animated 3D blood flow field in the brain vessels, which could be used to anticipate the pressure on the walls of the vessels and for example to predict a bleeding risk. The Brain 4D MRI image processing application in MediGRID assists the identification of the brain areas research. All three applications are demanding the possibility of massive data volume storage and processing. Because of the constant increase of the data volume coherent with the development in the imaging techniques, a dynamic extendable computing and storage infrastructure is needed, which ideally will be a grid environment. Privacy In the first step we are working with non person-related data and set up the first generation of privacy rules. As a preparation for the next phase – dealing with person related data – we defined the additionally necessary advanced security methods like audit and tracking. Furthermore some legal issues concerning virtual organizations and the ownership of data and material have to be addressed. Further perspective As we gained experience with our first medical grid applications, we will be able to “gridify” other community applications more easily. The initial incarnation of the infrastructure is set up; some details are still to come. As genotyping constantly gets cheaper, many formerly phenome-related projects around complex diseases consider genotyping within the next couple of years. Beyond the indisputable opportunities of these studies there are quite some challenges to be faced. MediGRID addresses issues like the homogenization of heterogeneous data sources and how do we deal with the well-known privacy problems. Solving those issues will enhance the portability of life science grid services to other projects and other communities. References 1 www.medigrid.de 2 Martin-Sanchez, F., V. Maojo, and G. Lopez-Campos, Integrating genomics into health information systems. Methods Inf Med, 2002. 41(1): p. 25-30. 3 Butte, A.J. and I.S. Kohane, Creation and implications of a phenome-genome network. Nat Biotechnol, 2006. 24(1): p. 55-62. 4 Lin, Z., A.B. Owen, and R.B. Altman, Genetics. Genomic research and human subject privacy. Science, 2004. 305(5681): p. 183. 5 Kohane, I.S. and Altman R.B., Health-Information Altruists — A Potentially Critical Resource. NEJM, 2005. 353 (19): p. 2074-2077 6 www.d-grid.de This work was supported by the D-Grid Project MediGRID, funded by the Federal Ministry of Education and Research (BMBF), FKZ 01AK803A-H.

Author

Prof. Ulrich Sax (MediGRID)

Co-authors

Mr Fred Viezens (MediGRID) Mr Yassene Mohammed (MediGRID)

Presentation materials