Vicomtech is visiting CERN to present its activities and explore possible lines of collaboration. As part of the programme, it will give a presentation staged in three parts:
The full programme of the visit is here
Mining semi-structured data is fundamental for archive monitoring, understanding and exploitation.
Typical analysis systems are based on a three-tiered architecture, in which efficient databases feed highly parallelised application servers that in turn feed client user interfaces. Yet sharing the analysis, content identification and semantic-level summarisation tasks between the two bottom layers of the architecture, the highly parallel application server and a shared database, is key to allowing the user interface to deal interactively with large data volumes.
In this framework, we present approaches based on NoSQL/MapReduce, experiences with column-based scientific DBMSs and considerations on graph databases for the efficient processing of large repositories. Specific attention is devoted to Visual Analytics methodologies, resulting in multi-user interfaces centred on multiple linked representations whose interaction techniques are based on concurrent brushing in multiple dimensions.
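As a rough illustration of concurrent brushing across linked representations, the sketch below treats each brush as a range selection on one dimension and takes the shared selection to be the set of records that satisfy every active brush; each linked view would then highlight that same set. This is only a minimal sketch under stated assumptions, not the systems described in the talk: the `Brush` class, the `brushed_ids` function and the record fields (`year`, `size_mb`, `cloud_cover`) are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brush:
    """A range selection on one dimension (bounds are inclusive)."""
    dimension: str
    low: float
    high: float

    def matches(self, record: dict) -> bool:
        return self.low <= record[self.dimension] <= self.high


def brushed_ids(records: list[dict], brushes: list[Brush]) -> set[int]:
    """Return the ids of records that satisfy every active brush.

    With no active brush, every record counts as selected.
    """
    return {r["id"] for r in records if all(b.matches(r) for b in brushes)}


# Hypothetical archive records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "year": 2009, "size_mb": 120.0, "cloud_cover": 0.10},
    {"id": 2, "year": 2011, "size_mb": 300.0, "cloud_cover": 0.55},
    {"id": 3, "year": 2012, "size_mb": 80.0, "cloud_cover": 0.20},
    {"id": 4, "year": 2012, "size_mb": 450.0, "cloud_cover": 0.05},
]

# Two brushes drawn in two different views, applied at the same time.
selection = brushed_ids(
    RECORDS,
    [Brush("year", 2011, 2012), Brush("cloud_cover", 0.0, 0.25)],
)
print(sorted(selection))
```

In a tiered setting, the filtering step would more plausibly run in the database or application server, so that only the selected ids or their summaries travel to the user interface.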
These considerations and experiences provide the core for operational and pre-operational systems for mining multimedia archives, for analysing social streams for cyber-security applications and for search engines dedicated to Earth observation products.
Extensions to content-based global and local image retrieval, to mining compressed streams, to multi-modal search and to mining activity data collected by distributed mobile networks are introduced.