Keywords:
WSx-architecture, QoS, Finance, general data-intensive analysis
3. Impact
The application is organised in two tiers: the first handles the grid infrastructure, while the second is concerned exclusively with the analysis of the data. The analysis runs on a Worker Node (WN); it expects a set of data files to be available locally for processing, and it produces a predefined set of local output files. The grid infrastructure code in turn consists of two parts: one that launches and monitors the analysis, and one that prepares the local environment on the WNs for the analysis. The launching and monitoring part is installed on a UI host; it accepts a file listing the data to process, the analysis code to execute, and the grid output directory on a predefined secure Storage Element (SE). The code that prepares the WN local environment fetches the data files from the secure SE, pre-processes them, launches the analysis, saves the output files back to the SE, and clears any local temporary files.
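A minimal sketch of the second-tier wrapper that prepares the WN environment is given below, under stated assumptions: the helper commands fetch_from_se and store_to_se are hypothetical stand-ins for whatever data-management client the middleware provides on the WN, and the output-file naming convention is purely illustrative, not the project's actual implementation.

#!/usr/bin/env python
"""Illustrative sketch of the WN-side wrapper: fetch inputs from the
secure SE, pre-process (unzip) them, run the analysis, save the outputs
back to the SE, and clean up.  The SE access commands and the output
naming convention are placeholders, not the project's actual tools."""
import os
import shutil
import subprocess
import sys
import tempfile
import zipfile

def stage_in(se_url, workdir):
    # Placeholder: copy one input file from the secure SE to the WN.
    subprocess.check_call(["fetch_from_se", se_url, workdir])

def stage_out(local_path, se_output_dir):
    # Placeholder: copy a local output file back to the secure SE.
    subprocess.check_call(["store_to_se", local_path, se_output_dir])

def main(file_list, analysis_exe, se_output_dir):
    workdir = tempfile.mkdtemp(prefix="analysis_")
    try:
        # 1. Fetch the input data files listed for this job.
        for se_url in open(file_list):
            stage_in(se_url.strip(), workdir)
        # 2. Pre-process: unzip any compressed inputs in place.
        for name in os.listdir(workdir):
            if name.endswith(".zip"):
                with zipfile.ZipFile(os.path.join(workdir, name)) as zf:
                    zf.extractall(workdir)
        # 3. Launch the analysis; it reads the local files and writes a
        #    predefined set of output files into the work directory.
        subprocess.check_call([analysis_exe], cwd=workdir)
        # 4. Save the output files back to the predefined secure SE.
        for name in os.listdir(workdir):
            if name.endswith(".out"):  # illustrative naming convention
                stage_out(os.path.join(workdir, name), se_output_dir)
    finally:
        # 5. Clear any local temporary files.
        shutil.rmtree(workdir, ignore_errors=True)

if __name__ == "__main__":
    main(*sys.argv[1:4])

Since the analysis itself only reads and writes local files, the same wrapper structure applies regardless of the concrete analysis executable.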
4. Conclusions / Future plans
Currently the application facilitates processes that could also be achieved by grid scripting. This is only a starting point towards a fully fledged, WSx-compliant distributed grid-application architecture, integrated into the Information System and ready to offer QoS as an application-level grid service for financial research. The “second tier” of the application described in section 3 (Impact) can be viewed as a general-purpose tool, useful to any researcher wishing to perform similarly intensive analysis.
URL for further information:
https://euindia.ictp.it/stock-analysis-application
1. Short overview
The primary objective is to analyse a massive financial database on an instrument-by-instrument basis (one instrument’s data analysed at each node), but the application may serve many other domains. It may be a valuable tool for the grid community at large: it transfers and unzips large quantities of data from secure storage to each node, performs an identical, computationally intensive statistical analysis of the data at each node, and then zips and securely stores the voluminous results of this analysis.
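For illustration only, the following sketch shows how the instrument-by-instrument partitioning could be driven from the UI host. The submit_grid_job command is a hypothetical stand-in for the middleware's actual submission tool, and the file layout is assumed, not taken from the project.

#!/usr/bin/env python
"""Illustrative UI-side launcher: one grid job per instrument, each job
running the identical analysis on that instrument's data.  The job
submission command and file layout are placeholders."""
import subprocess
import sys

def submit_instrument_job(instrument_files, analysis_exe, se_output_dir):
    # Placeholder submission: in practice this would prepare a job
    # description and invoke the middleware's submit command.
    subprocess.check_call(
        ["submit_grid_job", analysis_exe, instrument_files, se_output_dir])

def main(instrument_list, analysis_exe, se_output_dir):
    # instrument_list: one line per instrument, each pointing to the
    # file that lists that instrument's data files on the secure SE.
    for line in open(instrument_list):
        files = line.strip()
        if files:
            submit_instrument_job(files, analysis_exe, se_output_dir)

if __name__ == "__main__":
    main(*sys.argv[1:4])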