Service Availability Monitoring (SAM) is a distributed monitoring framework used to monitor the availability and reliability of EGI resources within the production infrastructure. SAM consists of the following components: a Nagios system for executing probes, databases containing the grid topology and monitoring results, a message broker network for communication, and the MyEGI web interface.
SAM does not maintain the actual probes used for checking the status of grid services. The set of probes currently integrated into SAM consists of:
- probes developed during the EGEE project for the previous instance of the SAM system (e.g. CE, SRM, WN tests)
- probes contributed by sites (e.g. SRCE, CERN)
- probes maintained by EMI product teams (BDII, ARC)
- native Nagios probes (e.g. TCP checks, FTP checks).
The session will consist of a brief update on the status of the SAM framework, the status of the currently integrated probes, and general guidelines for Nagios probe development. Furthermore, the existing probes will be described in detail in order to ease their handover to the EMI product teams.
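For readers unfamiliar with what a Nagios probe looks like, the following is a minimal sketch of a TCP check in Python. It is not one of the probes integrated into SAM. It only illustrates the general Nagios plugin convention of printing a single status line and returning exit code 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function names `check_tcp` and `run`, the script name in the usage text, and the default thresholds are assumptions made for this example.

```python
import socket
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, warn=1.0, crit=5.0):
    """Try to open a TCP connection and return (exit_code, status_line).

    warn and crit are hypothetical thresholds in seconds; crit is also
    used as the connection timeout.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=crit):
            elapsed = time.monotonic() - start
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution errors.
        return CRITICAL, f"TCP CRITICAL - {host}:{port} unreachable ({exc})"

    # Performance data after the pipe: label=value;warn;crit
    perfdata = f"time={elapsed:.4f}s;{warn};{crit}"
    if elapsed >= warn:
        return WARNING, f"TCP WARNING - slow connect to {host}:{port} | {perfdata}"
    return OK, f"TCP OK - connected to {host}:{port} | {perfdata}"


def run(argv):
    """Parse HOST and PORT, print the status line, return the exit code."""
    if len(argv) != 2 or not argv[1].isdigit():
        print("TCP UNKNOWN - usage: check_tcp_sketch HOST PORT")
        return UNKNOWN
    code, message = check_tcp(argv[0], int(argv[1]))
    print(message)
    return code
```

A script wrapping this sketch would end with `sys.exit(run(sys.argv[1:]))`, so that Nagios receives the exit code and displays the printed status line.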