Speaker
Description
Provide a set of generic keywords that define your contribution (e.g. Data Management, Workflows, High Energy Physics)
Metadata Management, gCube, DILIGENT, D4Science, XML
URL for further information:
http://www.gcube-system.org/
4. Conclusions / Future plans
The design and implementation of the framework will evolve to make it easier to plug in different storage systems as backends. Eventually, it will become feasible to store metadata directly on the gLite Data Management System, bypassing the gCube Data Management layer.
Furthermore, new services dedicated to managing specialized object-to-object relationships will be analysed and integrated into the framework to serve application-specific needs.
1. Short overview
A metadata object is any kind of data about other data, and any system that manages content has to deal with such objects. Typically, systems target a limited set of metadata formats and build their own semantics for those formats.
The gCube Metadata Framework provides an efficient and generic API, exploitable by domain-specific services, that is agnostic to the format and semantics of the metadata. Instead, it focuses on managing metadata objects and on providing efficient storage and retrieval facilities.
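As an illustration of what a format-agnostic metadata API can look like, the following is a minimal sketch only, not the gCube API. The names `MetadataObject`, `MetadataStore`, `create`, `get` and `for_target` are hypothetical, and the in-memory dictionaries stand in for a real storage backend. The point it shows is that the store never parses the payload; it only manages the object and its link to the content it describes.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MetadataObject:
    """Hypothetical metadata record; the payload is opaque to the store."""
    metadata_id: str
    target_id: str  # ID of the content object this metadata describes
    payload: str    # any format (XML, JSON, plain text); never parsed here


class MetadataStore:
    """Hypothetical in-memory store illustrating format-agnostic management."""

    def __init__(self) -> None:
        self._by_id: Dict[str, MetadataObject] = {}
        self._by_target: Dict[str, List[str]] = {}
        self._counter = itertools.count(1)

    def create(self, target_id: str, payload: str) -> MetadataObject:
        # Assign an ID and index the object by the content it describes.
        metadata_id = f"md-{next(self._counter)}"
        obj = MetadataObject(metadata_id, target_id, payload)
        self._by_id[metadata_id] = obj
        self._by_target.setdefault(target_id, []).append(metadata_id)
        return obj

    def get(self, metadata_id: str) -> MetadataObject:
        return self._by_id[metadata_id]

    def for_target(self, target_id: str) -> List[MetadataObject]:
        # All metadata objects attached to one content object, in insertion order.
        return [self._by_id[i] for i in self._by_target.get(target_id, [])]


store = MetadataStore()
xml_md = store.create("doc-1", "<dc:title>Report</dc:title>")
json_md = store.create("doc-1", '{"title": "Report"}')
attached = store.for_target("doc-1")
```

Under these assumptions, a domain-specific service would interpret the payload itself, while the store stays unchanged when a new metadata format is introduced.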
3. Impact
The gCube Metadata Framework has been successfully adopted within the DILIGENT infrastructure, where hundreds of thousands of metadata objects, potentially outsourced onto the gLite Data Management System through the gCube Data Management API, have been stored and manipulated throughout the project’s lifetime. It proved to be scalable and efficient, and capable of serving the needs of heterogeneous communities.
gCube’s native Search Engine has used the XML Indexer in its query workflows, answering complex queries within acceptable response times thanks to the Indexer’s transparent partitioning mechanism. In addition, the Metadata Broker has employed several transformation programs to generate new metadata collections in different formats, addressing interoperability and presentation challenges. On top of these, semi-structured annotations over diverse types of content add new potential to the exploitation of annotations in Information Retrieval.