The proposed solution will reduce job submission and scheduling latencies. Users will no longer have to break a workflow into jobs and then submit and schedule each one individually. Instead, they will write workflows in a workflow authoring environment, and these will be translated into the appropriate SAGA structures. Users will also be able to monitor and manage the hundreds of jobs in a workflow as a single entity, which will help them keep track of execution, failures and the outcome of dependencies. Scheduling will also become more effective, because a higher-level enactor, Digedag, will coordinate with a Grid-wide scheduler to plan and schedule the jobs.
As a consequence of the proposed project, user workflows can be developed programmatically and enacted on any available middleware, without tying the specification to a particular execution engine. This flexibility and extensibility, along with robustness, is critical for enabling applications to use infrastructure at scale. In effect, users will get a “write once, run anywhere” capability.
Conclusions and Future Work
Using neuroimaging as an exemplar, SAGA will provide support for workflow enactment. Users will translate workflow descriptions into SAGA API calls, which can then be enacted using the appropriate execution engine and scheduled, distributed and executed in multi-site environments.
Digedag will be extended to support the efficient execution of a broad range of applications, such as neuroimaging in neugrid, in an extensible, flexible and robust manner. Currently, Digedag cannot coordinate with an EGEE workload management system to schedule a series or sequence of jobs as a workflow requires. This limits the effective management, monitoring and execution of workflows. The proposed system will implement the following in the Digedag workflow engine:
Users may write workflows using the SAGA workflow package which can be enacted and scheduled using appropriate middleware adapters.
The submitted workflows can be managed and monitored as a single entity.
The SAGA workflow package will resolve all dependencies, and execution will follow both user preferences and the dependency order.
The sequence of jobs and order will remain intact even if a workflow is distributed across sites for execution.
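The points above can be illustrated with a minimal sketch of a workflow held as a dependency graph: jobs are ordered by their dependencies regardless of the site each one runs on, and the whole workflow reports a single status. This is only an illustration in plain Python, not the SAGA or Digedag API; the `Workflow` class, its methods, and the job and site names are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Workflow:
    """Hypothetical workflow handle; NOT the SAGA or Digedag API."""

    def __init__(self, name):
        self.name = name
        self.deps = {}    # job name -> set of jobs it depends on
        self.sites = {}   # job name -> site the job is assigned to
        self.state = {}   # job name -> "pending" | "done" | "failed"

    def add_job(self, job, site, after=()):
        # Require dependencies to be registered first; this also rules out cycles.
        missing = set(after) - self.deps.keys()
        if missing:
            raise ValueError(f"unknown dependencies: {sorted(missing)}")
        self.deps[job] = set(after)
        self.sites[job] = site
        self.state[job] = "pending"

    def execution_order(self):
        # Dependency (topological) order, independent of site assignment.
        return list(TopologicalSorter(self.deps).static_order())

    def run(self, execute):
        """Run jobs in dependency order; skip jobs whose prerequisites did not finish."""
        for job in self.execution_order():
            if all(self.state[d] == "done" for d in self.deps[job]):
                ok = execute(job, self.sites[job])
                self.state[job] = "done" if ok else "failed"
        return self.status()

    def status(self):
        # One status for the whole workflow, derived from the per-job states.
        states = set(self.state.values())
        if "failed" in states:
            return "failed"
        if states == {"done"}:
            return "done"
        return "pending"


# Hypothetical three-step pipeline spread over two sites.
wf = Workflow("example-pipeline")
wf.add_job("register", site="site-a")
wf.add_job("segment", site="site-b", after=["register"])
wf.add_job("measure", site="site-a", after=["segment"])

# Stand-in for real job submission: every job "succeeds".
print(wf.run(lambda job, site: True))
```

In this sketch a failed job leaves its downstream jobs pending and marks the whole workflow as failed, which is one possible policy; a real enactor might retry or reschedule instead.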
Although we will initially establish and demonstrate these benefits in the context of neugrid-based neuroimaging, the advantages will extend to a range of DAG-based and other workflow applications, thus impacting a number of communities.
|URL for further information||http://forge.gridforum.org/sf/projects/saga-rg|
|Keywords||SAGA, Workflow Description, Monitoring and Enactment|