It is now well known in the HEPiX community that the Elastic Stack (formerly known as ELK) is
an extremely useful tool for digging into huge volumes of log data. It has also been presented multiple times
as lacking the security features so often needed in multi-user environments. Although it now provides
a plugin addressing some of those concerns, it requires the acquisition of a commercial...
In this presentation, I will go over CERN's efforts in improving the security and usability of the management interfaces for various server manufacturers.
We present Riemann: a low-latency, transient shared-state stream processor.
This open-source monitoring tool was written by Kyle Kingsbury and is
maintained by the community. Its unique design makes it as flexible as
it gets by melting the walls between configuration and code. Whenever its rich API
doesn't fit the use case, it's as simple as using any library in the Clojure or Java
ecosystem...
Various cluster monitoring tools have been adapted or developed at IHEP, each showing the health status of one device or aspect of the IHEP computing platform separately. For example, Ganglia shows the machine load, Nagios monitors the service status, and the Job-monitor tool developed by IHEP counts the job success rate. But the monitoring data from these different tools are independent and not easy...
Our cloud deployment at Wigner Datacenter (WDC) is undergoing significant changes. We are adopting a new infrastructure: an automated OpenStack deployment using TripleO and configuration management tools like Puppet and Ansible. Over the past few months, our team at WDC has been testing TripleO as the base of our OpenStack deployment. We are also planning a centralized monitoring and logging...
China Spallation Neutron Source (CSNS) is a neutron source facility for studying neutron characteristics and exploring the microstructure of matter. It will also serve as a high-level scientific research platform oriented to dimensional academic subjects. Scientific research on CSNS requires the support of a high-performance computing environment. So from the research and practice...
CERN has a great number of applications that rely on a database for their daily operations. The IT Database Services group is responsible for current and future databases and their platform for accelerators, experiments and administrative services, as well as for scale-out analytics services including Hadoop, Spark and Kafka. This presentation aims to give a summary of the current state of...
Following various A/C incidents in an Oxford computer room, we developed a solution to automatically shut down servers.
The solution has two parts: the service, which monitors the temperatures and publishes them on a web page, and the client, which runs on the servers and queries the result to determine whether a shutdown is required. DigiTemp software and 1-Wire temperature sensors are used.
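
The client side of such a scheme might look roughly like the Python sketch below. It is only an illustration under stated assumptions, not the Oxford implementation: the URL, the threshold, the rule that two sensors must agree, and the page format (one `<sensor-name> <temperature-in-C>` pair per line) are all hypothetical.

```python
"""Client-side sketch: poll a published temperature page and decide on shutdown."""
import subprocess
import urllib.request

# All values below are hypothetical placeholders, not taken from the Oxford setup.
STATUS_URL = "http://tempmon.example.org/temperatures.txt"
THRESHOLD_C = 35.0
MIN_HOT_SENSORS = 2  # require agreement so one faulty sensor cannot trigger a shutdown


def parse_readings(text):
    """Parse lines of the assumed form '<sensor-name> <temperature-in-C>'."""
    readings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore lines that do not match the assumed format
        name, value = parts
        try:
            readings[name] = float(value)
        except ValueError:
            continue  # skip temperatures that cannot be parsed
    return readings


def should_shutdown(readings, threshold_c=THRESHOLD_C, min_hot=MIN_HOT_SENSORS):
    """Return True when at least `min_hot` sensors are at or above the threshold."""
    hot = [name for name, temp in readings.items() if temp >= threshold_c]
    return len(hot) >= min_hot


def fetch_status(url=STATUS_URL, timeout=10):
    """Download the published page as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    current = parse_readings(fetch_status())
    if should_shutdown(current):
        # Needs root privileges; 'shutdown -h now' halts a typical Linux host.
        subprocess.run(["shutdown", "-h", "now"], check=True)
```

In this sketch a failed fetch raises an exception and the host stays up. Whether an unreachable monitoring service should instead count as a reason to shut down is a policy decision the sketch leaves open. The script would typically be run periodically, for example from cron.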