25–29 Sept 2006
CICG
Europe/Zurich timezone

The Diligent prototype and the experiences gained joining the EGEE PPS infrastructure

26 Sept 2006, 15:20
10m
Conf. Room 3 (CICG)

CICG, 17 rue de Varembé, CH - 1211 Geneva 20 Switzerland
Oral Users & Applications Grid Applications (NA4)

Speakers

Dr Pasquale Pagano (CNR-ISTI) Dr Pedro Andrade (CERN)

Description

Diligent is an ongoing IST project that aims to combine Grid and Digital Library (DL) technologies in order to provide an advanced test-bed DL infrastructure that allows members of dynamic virtual e-Science organizations to access shared knowledge and to collaborate in a secure, coordinated, dynamic and cost-effective way. In particular, Diligent builds on the Enabling Grids for E-sciencE (EGEE) project, which provides one of the largest European Grid infrastructures and supports the gLite Grid middleware. From an abstract point of view, the Diligent infrastructure acts as a DL broker, where the clients of the broker are DL resource providers and consumers. The providers are the individuals and organisations that decide to make their resources available, under the supervision of the infrastructure, according to certain access and use policies. The consumers are the user communities that want to build their own DLs. The resources managed by this broker are of different types: collections (i.e., sets of information objects searchable and accessible through a single “access point”), services (i.e., software tools that implement a specific functionality and whose descriptions, interfaces and bindings are defined and publicly available), hosting nodes (i.e., networked entities that offer computing and storage capabilities and supply an environment for hosting collections and services), and EGEE resources (i.e., computing elements and storage elements). In order to support the controlled sharing of resources among providers and consumers, the Diligent infrastructure relies on the virtual organization (VO) mechanism introduced in the Grid research area. This mechanism models sets of users and resources aggregated by highly controlled sharing rules, usually based on an authentication framework.
By exploiting appropriate mechanisms provided by the infrastructure, providers register their Diligent resources by supplying a description of them. Depending on the type of resource provided, the infrastructure also automatically extracts other properties that are used to enrich the explicit description. The infrastructure manages the registered resources by supporting their discovery, monitoring and reservation, and by implementing the functionality needed to support the required controlled sharing and quality of service. A user community can create one or more DLs by specifying a set of requirements and by appropriately combining the available resources. These requirements specify conditions on the information space (e.g., the set of collections, the subject of the content, the document types), on the services supporting the work of the users (e.g., type of search), on the quality of service (e.g., availability, performance, security) and on many other aspects, such as the maximum cost and lifetime. The DL broker satisfies the given requirements by selecting, and in many cases also deploying, a number of resources among those accessible to the community, gluing them together appropriately and, finally, making the new DL application accessible through a portal. The composition of a DL is dynamic, since the infrastructure continuously monitors the status of the DL resources and, if necessary, changes them in order to offer the best quality of service. Therefore, DLs (possibly serving different communities) can be created and modified on the fly, without considerable investments or changes in the organisations that set them up. The potential of the Diligent infrastructure is being demonstrated and validated in two complementary real-life application scenarios: ImpECt, from the environmental e-Science domain, and ARTE, from the cultural heritage domain.
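The register-then-select behaviour of the DL broker described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names (`Resource`, `Broker`), the tag-based matching and the sample data are hypothetical and are not the Diligent API. Real selection would also weigh quality of service, cost and lifetime, which this sketch leaves out.

```python
from dataclasses import dataclass
from enum import Enum


class ResourceType(Enum):
    # The four resource types named in the description.
    COLLECTION = "collection"
    SERVICE = "service"
    HOSTING_NODE = "hosting node"
    EGEE_RESOURCE = "EGEE resource"


@dataclass(frozen=True)
class Resource:
    # Hypothetical resource description supplied at registration time.
    name: str
    type: ResourceType
    vo: str                               # virtual organization allowed to use it
    properties: frozenset = frozenset()   # descriptive tags (assumed model)
    available: bool = True


class Broker:
    """Toy broker: keeps a registry and picks resources that match requirements."""

    def __init__(self):
        self._registry = []

    def register(self, resource):
        self._registry.append(resource)

    def select(self, vo, needed):
        # needed maps a ResourceType to the set of tags the resource must carry.
        chosen = {}
        for rtype, tags in needed.items():
            matches = [
                r for r in self._registry
                if r.type is rtype and r.vo == vo and r.available
                and set(tags) <= r.properties
            ]
            if not matches:
                raise LookupError(f"no {rtype.value} satisfies {sorted(tags)}")
            chosen[rtype] = matches[0]  # first match; a real broker would rank
        return chosen


broker = Broker()
broker.register(Resource("paintings", ResourceType.COLLECTION, "arte", frozenset({"images"})))
broker.register(Resource("reports", ResourceType.COLLECTION, "impect", frozenset({"text"})))
broker.register(Resource("full-text-search", ResourceType.SERVICE, "arte", frozenset({"search"})))

# A community in the hypothetical "arte" VO asks for an image collection plus search.
dl = broker.select("arte", {
    ResourceType.COLLECTION: {"images"},
    ResourceType.SERVICE: {"search"},
})
print({t.value: r.name for t, r in dl.items()})
```

Filtering on the VO before matching tags mirrors the idea that a community only sees the resources shared with it. Re-running `select` when a resource becomes unavailable is one simple way to picture the dynamic recomposition mentioned above.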
ImpECt (Implementation of Environmental Conventions) includes leading actors in the environmental sector, and is represented by the European Space Agency (ESA). This community exploits DILIGENT to support conference organisation and the preparation of projects and periodical reports. International and regional conventions related to earth observation represent the framework for formulating international environmental agreements. These conventions are continuously evolving and their thematic areas are becoming more specialised. Yet information sources are dispersed among environmental agencies, and a DILIGENT-based DL could be the most appropriate tool to enable this community to coordinate its actions more effectively. ARTE is a community of scholars located all over the world, working together to establish a new discipline that merges experiences from research in humanities, social sciences and communication. In order to achieve their objectives, these researchers require a common background knowledge base. The DILIGENT platform provides them, within a short time frame, with a cost-effective instrument for setting up DLs, i.e. common multimedia knowledge repositories equipped with a number of services specifically tailored to the needs of this community. The ARTE community is represented in DILIGENT by the Scuola Normale Superiore (SNS), one of the partners contributing rich archives of texts and images. Audio-video content is being provided by the Italian National Broadcasting RAI Educational. The Diligent project is currently testing the first prototype, which will be delivered by the end of September ’06. According to the Diligent implementation plan, this version is not fully fledged. Rather, it supports the basic DL functionalities required to satisfy the main requirements of the user communities.
All the DL functionalities: (i) have been designed in accordance with the Service Oriented Architecture paradigm, (ii) have been implemented as WSRF-compliant service elements, and (iii) are powered by the EGEE gLite middleware components. In particular, this prototype provides:
• On-demand services deployment on nodes equipped with the Diligent VO box;
• Content security handling (access and watermarking policies);
• Semantic content management over a gLite superimposed storage management layer;
• Metadata management and indexing;
• Annotation management and visualisation;
• Complex process visual design, verification, optimisation and execution;
• Text, image, sound, video and multimedia content processing;
• Information visualisation;
• Information retrieval out of structured, semi-structured and unstructured data;
• Support for application-specific extensions.
In addition to the above functionalities, application-specific ones have been integrated to better support the two user communities. The aim of this talk is to present the Diligent prototype. In particular, the talk focuses on a specific functionality, relevant to the ARTE community, that deals with the management of copyrighted videos on the gLite-based Grid infrastructure. It has been implemented by exploiting the process visual design, verification, and optimisation capabilities to combine content and metadata management, content security handling, and indexing. The resulting workflow makes it possible to import, secure, and make available a generic set of videos originally stored on a storage device. This complex workflow has been executed on the Diligent sites belonging to the EGEE PPS infrastructure, and a set of statistics has been collected.
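The import, secure and make-available steps of the video workflow can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions: the function names, the `Video` record, the flag standing in for watermarking and the reader set standing in for the access policy are all hypothetical. It does not use gLite or any Diligent service.

```python
from dataclasses import dataclass, field


@dataclass
class Video:
    # Hypothetical record for one imported video and its metadata.
    title: str
    keywords: list
    watermarked: bool = False
    readers: set = field(default_factory=set)


def import_videos(raw_items):
    # Step 1: import (title, keywords) pairs from the original storage device.
    return [Video(title=t, keywords=list(k)) for t, k in raw_items]


def secure(videos, readers):
    # Step 2: apply the watermarking and access policies (modelled as flags).
    for v in videos:
        v.watermarked = True
        v.readers = set(readers)
    return videos


def build_index(videos):
    # Step 3: index the metadata so that the videos can be retrieved.
    index = {}
    for v in videos:
        for kw in v.keywords:
            index.setdefault(kw.lower(), []).append(v)
    return index


def search(index, keyword, user):
    # Step 4: make available, but only to users allowed by the access policy.
    return [v.title for v in index.get(keyword.lower(), []) if user in v.readers]


# Invented sample data standing in for the contents of the storage device.
storage = [
    ("Lecture on Giotto", ["fresco", "Giotto"]),
    ("Siena cathedral", ["architecture", "fresco"]),
]
videos = secure(import_videos(storage), readers={"alice"})
index = build_index(videos)
print(search(index, "fresco", "alice"))
print(search(index, "fresco", "bob"))
```

Securing the content before indexing it means nothing becomes searchable without a policy attached. That ordering is a design choice of this sketch rather than something the abstract states.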

Summary

The aim of this talk is to present the Diligent prototype. In particular, the talk
focuses on a specific functionality, relevant to the ARTE community, that deals with the
management of copyrighted videos on the gLite-based Grid infrastructure. It has been
implemented by exploiting the process visual design, verification, and optimisation
capabilities to combine content and metadata management, content security handling,
and indexing. The resulting workflow makes it possible to import, secure, and make
available a generic set of videos originally stored on a storage device. This complex
workflow has been executed on the Diligent sites belonging to the EGEE PPS infrastructure,
and a set of statistics has been collected.

Authors

Dr Pasquale Pagano (CNR-ISTI) Dr Pedro Andrade (CERN)

Co-author

Dr Leonardo Candela (CNR-ISTI)
