Description
1. Short overview
GRid-aware Optimal data Warehouse design (GROW) uses a gridified genetic algorithm to solve the problem of optimal data warehouse design. The core problem is to select the optimal set of physical objects (views and indexes) to materialize (VIS) in a data warehouse for a specified database design, given a specified set of queries and additional parameters. A good selection can significantly improve the performance of a large database. The Grid is used to parallelize the genetic algorithm optimization.
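As an illustration of the idea, the selection of views and indexes can be encoded as a bit string, one bit per candidate object, and evolved with a genetic algorithm. The following is a minimal sketch, not the GROW implementation: the class `VisGaSketch`, the benefit and size figures, the storage limit and the GA parameters are all hypothetical, and a local parallel stream stands in for the Grid jobs that would evaluate fitness.

```java
import java.util.Arrays;
import java.util.Random;

public class VisGaSketch {
    // Hypothetical toy cost model: BENEFIT[i] is the query-time saving from
    // materializing candidate view/index i, SIZE[i] is its storage cost.
    static final double[] BENEFIT = {40, 25, 60, 10, 35, 15};
    static final double[] SIZE = {30, 10, 50, 5, 20, 25};
    static final double STORAGE_LIMIT = 70;

    // Fitness of a selection: total benefit, or 0 if it exceeds the storage limit.
    public static double fitness(boolean[] chromosome) {
        double benefit = 0, size = 0;
        for (int i = 0; i < chromosome.length; i++) {
            if (chromosome[i]) {
                benefit += BENEFIT[i];
                size += SIZE[i];
            }
        }
        return size <= STORAGE_LIMIT ? benefit : 0;
    }

    // Binary tournament: pick two random individuals, return the fitter one.
    static boolean[] tournament(boolean[][] pop, double[] fit, Random rnd) {
        int a = rnd.nextInt(pop.length), b = rnd.nextInt(pop.length);
        return fit[a] >= fit[b] ? pop[a] : pop[b];
    }

    public static boolean[] evolve(int populationSize, int generations, long seed) {
        Random rnd = new Random(seed);
        int n = BENEFIT.length;
        boolean[][] pop = new boolean[populationSize][n];
        for (boolean[] c : pop) {
            for (int i = 0; i < n; i++) c[i] = rnd.nextBoolean();
        }
        boolean[] best = pop[0].clone();
        double bestFit = fitness(best);
        for (int g = 0; g < generations; g++) {
            // Fitness evaluation is the expensive, independent step; here a
            // parallel stream stands in for one Grid job per individual.
            double[] fit = Arrays.stream(pop).parallel()
                    .mapToDouble(VisGaSketch::fitness).toArray();
            for (int i = 0; i < populationSize; i++) {
                if (fit[i] > bestFit) {
                    bestFit = fit[i];
                    best = pop[i].clone();
                }
            }
            boolean[][] next = new boolean[populationSize][];
            next[0] = best.clone(); // elitism: carry the best selection forward
            for (int k = 1; k < populationSize; k++) {
                boolean[] a = tournament(pop, fit, rnd);
                boolean[] b = tournament(pop, fit, rnd);
                int cut = 1 + rnd.nextInt(n - 1); // one-point crossover
                boolean[] child = new boolean[n];
                for (int i = 0; i < n; i++) child[i] = i < cut ? a[i] : b[i];
                int m = rnd.nextInt(n);
                if (rnd.nextDouble() < 0.1) child[m] = !child[m]; // mutation
                next[k] = child;
            }
            pop = next;
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] best = evolve(20, 30, 42L);
        System.out.println(Arrays.toString(best) + " fitness=" + fitness(best));
    }
}
```

Random numbers are drawn only in the sequential part of the loop, so a run with a fixed seed is reproducible even though fitness is evaluated in parallel.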
3. Impact
The framework uses the following Java Grid features: WMProxy job submission, VOMS proxy initialization, DAG (workflow) execution and LBProxy. Because the framework is implemented in Java, applications built on it are portable to any operating system that supports Java 1.5. Because the Grid job management functions are also implemented in Java, the developed applications do not need an installed UI machine. For the Java Grid tools to work, the application user needs a personal certificate in p12 format, the CA certificates, and the VOMS certificates and specification. For the implemented GROW application, the user puts these files in different folders and specifies their locations in the application properties file. Once the application has loaded, a user who wants to submit a job must first generate a VOMS proxy by providing the password for the p12 file, the VOMS name and the FQAN. After that, the other functions of the Grid tools become available.
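The configuration step described above could look roughly like the following sketch, which reads a properties file and reports which locations are still missing before a VOMS proxy is requested. The class `GridConfigSketch` and every property key (`grow.cert.p12`, `grow.ca.dir`, `grow.voms.dir`, `grow.voms.name`, `grow.voms.fqan`) are hypothetical names chosen for illustration; the actual keys used by GROW are not given here. The p12 password is deliberately not a property, since the user supplies it when generating the proxy.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class GridConfigSketch {
    // Hypothetical keys: certificate file, CA and VOMS folders, VOMS name, FQAN.
    static final String[] REQUIRED_KEYS = {
        "grow.cert.p12", "grow.ca.dir", "grow.voms.dir",
        "grow.voms.name", "grow.voms.fqan"
    };

    // Parse properties-file text (key=value lines) into a Properties object.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Return the required keys that are absent or blank, in declaration order.
    public static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = parse(
            "grow.cert.p12=/home/user/grid/usercert.p12\n"
            + "grow.voms.name=example.vo\n");
        // Lists the keys the user still has to fill in before proxy generation.
        System.out.println("Missing: " + missingKeys(props));
    }
}
```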
Keywords
Workflow, Java WMProxy, Java LBProxy, Genetic Algorithms
4. Conclusions / Future plans
The porting process was carried out in two phases. The first phase was the implementation of the genetic algorithm framework, mainly so that researchers can reuse the already implemented GA structures. The second phase consisted of implementing tools for automatic generation of JDL workflows, job submission, job status reporting and job output retrieval. Further development should enable automatic retrieval of CA certificates, VOMS configuration and infrastructure information (BDII).
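To give a feel for automatic JDL workflow generation, the sketch below builds the text of a DAG in which several evaluation jobs run independently and a final job merges their results. It is an assumption-laden illustration, not the GROW generator: the class `JdlDagSketch`, the node names and the executables `evaluate.sh` and `merge.sh` are hypothetical, and the attribute layout (`Type = "dag"`, `Nodes`, `Description`, `Dependencies`) follows the gLite JDL DAG conventions as best recalled and should be checked against the JDL specification for the middleware version in use.

```java
public class JdlDagSketch {
    // Render one DAG node with an inline job description.
    static String node(String name, String executable, String arguments) {
        return "    " + name + " = [\n"
             + "      Description = [\n"
             + "        JobType = \"Normal\";\n"
             + "        Executable = \"" + executable + "\";\n"
             + "        Arguments = \"" + arguments + "\";\n"
             + "        StdOutput = \"" + name + ".out\";\n"
             + "        StdError = \"" + name + ".err\";\n"
             + "        OutputSandbox = {\"" + name + ".out\", \"" + name + ".err\"};\n"
             + "      ];\n"
             + "    ];\n";
    }

    // Build a fan-in workflow: eval0..evalN-1 must all finish before merge runs.
    public static String buildDag(int evaluators) {
        StringBuilder jdl = new StringBuilder("[\n  Type = \"dag\";\n  Nodes = [\n");
        StringBuilder deps = new StringBuilder();
        for (int i = 0; i < evaluators; i++) {
            String name = "eval" + i;
            // Hypothetical convention: the argument is the sub-population index.
            jdl.append(node(name, "evaluate.sh", String.valueOf(i)));
            if (i > 0) deps.append(", ");
            deps.append("{ ").append(name).append(", merge }");
        }
        jdl.append(node("merge", "merge.sh", String.valueOf(evaluators)));
        jdl.append("  ];\n  Dependencies = { ").append(deps).append(" };\n]\n");
        return jdl.toString();
    }

    public static void main(String[] args) {
        System.out.print(buildDag(3));
    }
}
```

Generating the description from code rather than by hand keeps the number of evaluation nodes a single parameter, which matters when the population is split differently from run to run.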