Description
The Copernicus Programme of the European Union, with its fleet of Sentinel satellites, will generate up to 10 terabytes of Earth Observation (EO) data per day once at full operational capacity. These data, combined with other geo-spatial data sources, form the basis of many JRC knowledge production activities. To handle this large volume of data and its processing, the JRC Earth Observation Data and Processing Platform (JEODPP) was implemented. The platform is built on commodity hardware. Its processing servers currently total 500 cores and 8 TB of RAM and use 10 Gb/s Ethernet connectivity to access the EOS storage back-end. The EOS instance currently runs on 10 storage servers with a total gross capacity of 1.4 petabytes, with a scale-up foreseen in 2017. EOS was deployed on the JEODPP thanks to the CERN-JRC collaboration. In conjunction with HTCondor as workload manager, EOS allows for optimal load distribution during massive processing. The processing jobs are containerised with Docker to support different requirements in terms of software libraries and processing tools.
This presentation details the JEODPP platform with emphasis on its EOS instance, which is accessed through the FUSE client on the processing servers for all data access tasks. Low-level I/O benchmarking and real-world EO data processing applications show good scalability of the storage system in a cluster environment. Issues encountered during data processing and service set-up are also described, together with their current solutions.
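As an illustration of what a low-level I/O measurement against a FUSE-mounted file system might look like, the following is a minimal sketch of a sequential read throughput test. It is not the benchmark used for the JEODPP; the function names, the block size and the use of a temporary directory in place of an EOS mount point are all assumptions made for this example.

```python
import os
import tempfile
import time


def make_test_file(directory, size_bytes, chunk=1024 * 1024):
    """Create a file of `size_bytes` random bytes in `directory` (hypothetical helper)."""
    path = os.path.join(directory, "io_benchmark.bin")
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n
    return path


def sequential_read_throughput(path, block_size=4 * 1024 * 1024):
    """Read `path` front to back; return (bytes_read, seconds, MB_per_second)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python-level buffering; the OS page cache and any
    # client-side cache may still serve part of the data.
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += len(block)
    elapsed = time.perf_counter() - start
    rate = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate


if __name__ == "__main__":
    # Assumption: a temporary local directory stands in for the mount point.
    with tempfile.TemporaryDirectory() as workdir:
        test_path = make_test_file(workdir, 64 * 1024 * 1024)
        nbytes, seconds, mb_s = sequential_read_throughput(test_path)
        print(f"read {nbytes} bytes in {seconds:.3f} s ({mb_s:.1f} MB/s)")
```

To probe scalability in a cluster, one could run the same read concurrently from several processing nodes on files located under the FUSE mount and compare the aggregate rate with the single-client rate. Files larger than the node's RAM reduce the influence of caching on the result.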