14–18 Oct 2013
Amsterdam, Beurs van Berlage
Europe/Amsterdam timezone

SynapSense Wireless Environmental Monitoring System of the RHIC & ATLAS Computing Facility at BNL

14 Oct 2013, 14:14
22m
Veilingzaal (Amsterdam, Beurs van Berlage)

Oral presentation to parallel session
Facilities, Production Infrastructures, Networking and Collaborative Tools
Facilities, Infrastructures, Networking and Collaborative Tools

Speakers

Mr Alexandr Zaytsev (Brookhaven National Laboratory (US)); Mr Kevin CASELLA (Brookhaven National Laboratory (US))

Description

The RHIC & ATLAS Computing Facility (RACF) at BNL is a 15000 sq. ft. facility hosting the IT equipment of the BNL ATLAS WLCG Tier-1 site, the offline farms for the STAR and PHENIX experiments operating at the Relativistic Heavy Ion Collider (RHIC), the BNL Cloud installation, various Open Science Grid (OSG) resources, and many other small, physics-research-oriented IT installations. The facility originated in 1990 and grew steadily to its present configuration of 4 physically isolated IT areas with a combined capacity of about 2000 racks and a total peak power consumption of 1.5 MW. Because the infrastructural components of the RACF were deployed over such a long period (the oldest parts of the physical infrastructure were built in the late 1960s, while the newest were added in 2010), the different areas ended up inheriting a multitude of separate environmental monitoring systems. These groups of equipment required costly maintenance and support and lacked both a high-level integration mechanism and a centralized web interface. In June 2012 a project was initiated with the primary goal of replacing all these environmental monitoring systems with a single commercial hardware and software solution by SynapSense Corp., based on wireless sensor groups and the proprietary SynapSense MapSense (TM) software. It offers a unified solution for monitoring the temperature and humidity within the racks and CRAC units, as well as the pressure distribution underneath the raised floor, across the entire facility. The new system also supports a set of additional features that are not currently implemented within the RACF but may be deployed in the future: capacity planning based on measurements of total heat load, power consumption monitoring and control, CRAC unit power consumption optimization based on feedback from the temperature measurements, and overall power usage efficiency estimation.
This contribution gives a detailed review of all stages of the system's deployment and of its integration with the existing personnel notification mechanisms and emergency management/disaster recovery protocols of the RACF. The experience gathered while operating the system is summarized, and a comparative review of the functionality and maintenance costs of the RACF environmental monitoring system before and after the transition is given.

Primary authors

Mr Alexandr Zaytsev (Brookhaven National Laboratory (US)); Mr Kevin CASELLA (Brookhaven National Laboratory (US))

Co-authors

Mr Antonio WONG (Brookhaven National Laboratory (US)); Christopher Hollowell (Brookhaven National Laboratory (US)); Mr Enrique GARCIA (Brookhaven National Laboratory (US)); Mr Richard HOGUE (Brookhaven National Laboratory (US)); Mr William Strecker-Kellogg (Brookhaven National Laboratory (US))