Data Mining and Integration of Environmental Applications

Ladislav Hluchy (Institute of Informatics, Slovakia)


In this paper we present the data mining and integration of environmental applications in the EU IST project ADMIRE. We briefly introduce the ADMIRE project and the data mining of spatio-temporal data in general. The application, originally targeting flood simulation and prediction, is now being extended into the broader context of environmental studies. We describe several scenarios in which data mining and integration of distributed environmental data can improve our knowledge of the relations between various hydro-meteorological variables.


The complete data integration and mining process for each scenario is described in DISPEL, the language proposed in ADMIRE for data integration and mining. Data are stored in different forms (files in various formats, databases) and accessed via OGSA-DAI. A gateway will be responsible for processing the DISPEL file, sending requests to the corresponding OGSA-DAI servers, and collecting the results. An easy-to-use graphical interface is being developed for the ADMIRE platform. With its visual editor, users can select existing components (activities) and connect them into workflows for data integration and data mining. The environmental applications in the ADMIRE project are developed in close collaboration with the VEGA project 2009-2011 and the APVV project DMM.

Conclusions and Future Work

In this paper we have demonstrated the data integration and mining platform developed in the ADMIRE project for environmental risk management. We have also shown complex meteorological and hydrological scenarios that draw on different data sources in different formats, and a simple way to build DISPEL workflows for data integration and mining. The graphical workflow editor is in progress and should allow experts in environmental applications to perform data integration easily. This work is also partially supported by the VEGA project 2009-2011 and the APVV project DMM.

Detailed analysis

The ADMIRE project aims to deliver a consistent and easy-to-use technology for extracting information and knowledge. Its main target is to provide advanced data mining and integration techniques for a distributed environment. In this paper, we focus on one of its pilot applications, whose target domain is environmental risk management. Several scenarios have been proposed, including short-term weather forecasting using radar images and complex hydrological scenarios involving waterworks, measured data from water stations, and meteorological data from models. Historical data for mining are supplied mainly by the Slovak Hydrometeorological Institute and the Slovak Water Enterprise.
The main characteristics of data sets describing phenomena in environmental applications are their spatial and temporal dimensions. Integrating spatio-temporal data from different sources is challenging because of those dimensions: different data sets contain data at different resolutions and frequencies. This heterogeneity is the principal challenge in integrating geo-spatial and temporal data sets, since the integrated data set should hold homogeneous data of the same resolution and frequency.
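The temporal half of this homogenization can be illustrated with a minimal Python sketch. It is not taken from ADMIRE or DISPEL: the function names, the series names (water_level_cm, rainfall_mm), the sampling intervals, and the sample values are all hypothetical, and spatial regridding is left out entirely. Two series sampled at different frequencies are aggregated into bins of a common width and then joined on the bins that both sources cover.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def floor_to_step(ts, step):
    """Round a timestamp down to the start of its time bin."""
    epoch = datetime(1970, 1, 1)
    return epoch + ((ts - epoch) // step) * step


def resample(series, step, aggregate):
    """Aggregate (timestamp, value) pairs into bins of width `step`."""
    bins = defaultdict(list)
    for ts, value in series:
        bins[floor_to_step(ts, step)].append(value)
    return {start: aggregate(values) for start, values in bins.items()}


def integrate(sources, step):
    """Resample every source to `step` and join on the bins all sources share.

    `sources` maps a variable name to a (series, aggregate function) pair.
    """
    resampled = {
        name: resample(series, step, aggregate)
        for name, (series, aggregate) in sources.items()
    }
    common = set.intersection(*(set(r) for r in resampled.values()))
    return [
        {"time": t, **{name: r[t] for name, r in resampled.items()}}
        for t in sorted(common)
    ]


# Hypothetical sample data: a water level read every 15 minutes and
# rainfall totals reported every 30 minutes, both starting at midnight.
t0 = datetime(2010, 6, 1)
water_level = [(t0 + timedelta(minutes=15 * i), 100 + i) for i in range(8)]
rainfall = [(t0 + timedelta(minutes=30 * i), 0.5 * i) for i in range(4)]

rows = integrate(
    {
        "water_level_cm": (water_level, mean),  # levels are averaged per bin
        "rainfall_mm": (rainfall, sum),         # rainfall totals are summed
    },
    step=timedelta(hours=1),
)

for row in rows:
    print(row)
```

The aggregate function is chosen per variable because the appropriate reduction depends on what is measured: an instantaneous quantity such as a water level is averaged, while an accumulated quantity such as rainfall is summed.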

Martin Seleng (Institute of Informatics, Slovakia) Ondrej Habala (Institute of Informatics, Slovakia) Viet Tran (Institute of Informatics, Slovakia)

