Description
The LHAASO (Large High Altitude Air Shower Observatory) experiment of IHEP is located in Daocheng, Sichuan province, at an altitude of 4410 m. The main scientific goals of LHAASO are to search for the origins of galactic cosmic rays through extensive spectroscopic investigations of gamma-ray sources above 30 TeV. To accomplish these goals, LHAASO comprises four detector arrays, which generate huge amounts of data and require mass storage and a high-performance computing system. The dedicated computing resources of LHAASO are located in Beijing, Daocheng and Chengdu, and are supplemented by resources from collaborating organizations. Establishing a distributed computing system that makes these distributed resources work together and provides a good computing service for LHAASO is both important and urgent. However, such a system faces high operation and maintenance costs, instability and other issues, especially at remote sites.
In this paper we describe the evolution of the LHAASO distributed computing system, which is based on virtualization and cloud computing technologies. In particular, we discuss the key points of integrating distributed resources. We propose a solution for integrating cross-domain resources that adopts OpenStack+HTCondor to make the distributed resources work as a single resource pool. A flexible resource scheduling strategy and a job scheduling policy are presented; together they enable on-demand resource expansion and efficient, transparent scheduling of jobs to remote sites, thereby improving overall resource utilization. We also introduce the deployment of the computing system at Daocheng, the LHAASO observation base, which uses a cloud-based architecture (OpenStack+Kubernetes). This architecture greatly reduces operation and maintenance costs and helps ensure system availability and stability. Finally, we illustrate how the distributed computing system is monitored.
Consider for promotion: No