12–16 Oct 2015
Brookhaven National Laboratory
America/New_York timezone

Contribution List

  1. Michael Ernst (Unknown)
    12/10/2015, 09:00
    Miscellaneous
  2. Arne Wiebalck (CERN)
    12/10/2015, 09:15
    Site Reports
    News from CERN since the Oxford workshop.
  3. Shawn Mc Kee (University of Michigan (US))
    12/10/2015, 09:35
    Site Reports
    We will present an update on our site since the Fall 2014 report and cover our work with various storage technologies, ATLAS Muon Calibration and our use of the ELK stack for central syslogging. We will also report on our recent hardware purchases for 2015 as well as the status of our new networking configuration and 100G connection to the WAN. We conclude with a summary of what has worked...
  4. Andrea Chierici (INFN-CNAF)
    12/10/2015, 09:55
    Site Reports
    Update on Italian Tier1 center status.
  5. Rennie Scott (Fermilab)
    12/10/2015, 10:15
    Site Reports
    News and updates from Fermilab since the Spring HEPiX Workshop.
  6. Peter Gronbech (University of Oxford (GB))
    12/10/2015, 11:00
    Site Reports
    Update on Oxford University Particle Physics group computing setup, including short updates from the other member sites of SouthGrid.
  7. Jose Flix Molina (Centro de Investigaciones Energ. Medioambientales y Tecn. - (ES))
    12/10/2015, 11:20
    Site Reports
    We will review the status of PIC as of Fall 2015. News since the Oxford meeting will be reported.
  8. Tony Quan (LBL)
    12/10/2015, 11:40
    Site Reports
    PDSF, the Parallel Distributed Systems Facility, has been in continuous operation since 1996 serving high energy physics research. It is currently a Tier-1 site for STAR, a Tier-2 site for ALICE and a Tier-3 site for ATLAS. The PDSF cluster will move early next year from its current site in Oakland to a new building on the LBNL campus. Several racks have already been installed at the new...
  9. Martin Bly (STFC-RAL)
    12/10/2015, 12:00
    Site Reports
    Update from RAL Tier1.
  10. Dr Qiulan Huang (Institute of High Energy Physics, Chinese Academy of Sciences)
    12/10/2015, 12:20
    Site Reports
    News and updates from IHEP since the Spring HEPiX Workshop. In this talk we will present a brief status of the IHEP site, including the computing farm, Grid, data storage, network and so on.
  11. Martin Lothar Purschke (Brookhaven National Laboratory (US))
    12/10/2015, 14:00
    Grid, Cloud & Virtualisation
    We report on a simulation effort using the Open Science Grid which utilized a large fraction of the available OSG resources for about 13 weeks in the first half of 2015. sPHENIX is a proposed upgrade of the PHENIX experiment at the Relativistic Heavy Ion Collider. We have collected large data sets of proton-proton collision data in 2012 and 2013, and plan to carry out a similar study...
  12. Bonnie King (Fermilab)
    12/10/2015, 14:20
    End-User IT Services & Operating Systems
    Scientific Linux status and news.
  13. Arne Wiebalck (CERN)
    12/10/2015, 14:40
    End-User IT Services & Operating Systems
    In this talk we will present a brief status update on CERN's work on CentOS 7, the uptake by the various IT services, and the interaction with the upstream CentOS community. We will talk about the SIGs status, new projects and work done over the last months, presenting a list of our contributions and feedback about the experience.
  14. Borja Aparicio Cotarelo (CERN)
    12/10/2015, 15:00
    End-User IT Services & Operating Systems
    The Version Control and Issue Tracking services team at CERN is facing a transition and consolidation of the services provided, in order to fulfill the needs of the CERN community while maintaining the current deployments in use. Software and services development is a key activity at CERN that is currently carried out largely using Agile methodologies. Code hosting and review,...
  15. Chris Brew (STFC - Rutherford Appleton Lab. (GB))
    12/10/2015, 15:50
    End-User IT Services & Operating Systems
    We present our work deploying an ownCloud gateway to our existing home file storage to provide access to mobile clients.
  16. Tim Bell (CERN)
    12/10/2015, 16:10
    End-User IT Services & Operating Systems
    CERN has recently deployed a Mac self service portal to allow users to easily select software and perform standard configuration steps. This talk will review the requirements, product selection and potential evolution for mobile device management.
  17. Elizabeth Bautista (Lawrence Berkeley National Lab)
    12/10/2015, 16:30
    IT Facilities & Business Continuity
    The NERSC facility is transitioning to a new building on the LBNL main campus in the 2015 timeframe. This state-of-the-art facility is energy efficient, providing year-round free air and water cooling; it is initially provisioned for 12.5 MW of power and capable of up to 42 MW, and has two office floors, a 20K square foot HPC floor with seismic isolation, and a mechanical floor. Substantial...
  18. Ian Peter Collier (STFC - Rutherford Appleton Lab. (GB))
    12/10/2015, 16:50
    Miscellaneous
    Introducing a non-graduate recruitment path in scientific computing at the Rutherford Appleton Laboratory. Recruiting and retaining high-quality staff is an increasing challenge at STFC. We traditionally recruit people with relevant degrees and/or industry experience, but this is becoming increasingly difficult, as is recruiting to our graduate recruitment program. At the same time steep...
  19. Michel Jouvin (Laboratoire de l'Accelerateur Lineaire (FR))
    12/10/2015, 17:10
    Site Reports
    Changes at LAL and GRIF in the last 18 months.
  20. Sandy Philpott
    13/10/2015, 09:00
    Site Reports
    Current high performance and experimental physics computing environment updates: core exchanges between USQCD and Experimental Physics clusters for load balancing, job efficiency, and 12GeV data challenges; Nvidia K80 GPU experiences and updated Intel MIC environment; update on locally developed workflow tools and write-through to tape cache filesystem; status of LTO6 integration into our MSS;...
  21. Ajit Kumar Mohapatra (University of Wisconsin (US))
    13/10/2015, 09:20
    Site Reports
    The University of Wisconsin Madison CMS T2 is a major WLCG/OSG T2 site. It has consistently delivered highly reliable and productive services for CMS MC production/processing, and large scale CMS physics analysis using high throughput computing (HTCondor), highly available storage system (Hadoop), efficient data access using xrootd/AAA, and scalable distributed software systems (CVMFS). The...
  22. Lucien Philip Boland (University of Melbourne (AU))
    13/10/2015, 09:40
    Site Reports
    Update on activities at Australia's HEP Tier 2 grid facility.
  23. Erik Mattias Wadenstein (University of Umeå (SE))
    13/10/2015, 10:00
    Site Reports
    Update on recent events in the Nordic countries.
  24. Szabolcs Hernath (Hungarian Academy of Sciences (HU))
    13/10/2015, 10:50
    Site Reports
    We give an update on the infrastructure, Tier-0 hosting services, Wigner Cloud and other recent developments at the Wigner Datacenter. We will also include a short summary on the Budapest WLCG Tier-2 site status as well.
  25. Tomoaki Nakamura (High Energy Accelerator Research Organization (JP))
    13/10/2015, 11:10
    Site Reports
    The KEK computing research center supports various projects of accelerator-based science in Japan. The Hadron and Neutrino (T2K) experiments at J-PARC have started at a good rate after the recovery from the earthquake damage at Fukushima. The Belle II experiment is going to collect 100 PB of raw data within the next several years. In this talk, we would like to report the current status of our computing facility and...
  26. Andreas Petzold (KIT - Karlsruhe Institute of Technology (DE))
    13/10/2015, 11:30
    Site Reports
    News about GridKa Tier-1 and other KIT IT projects and infrastructure.
  27. Johan Henrik Guldmyr (Helsinki Institute of Physics (FI))
    13/10/2015, 11:50
    Site Reports
    - CSC
    - Ansible and HPC
    - Slurm as an interactive shell load balancer
  28. Yemi Adesanya
    13/10/2015, 12:10
    Site Reports
    An update on SLAC's central Unix services in support of Scientific Computing and Core Infrastructure. New funding model for FY15 identifies indirect vs direct-funded effort. Socializing the concept of service and service lifecycle. Sustainable business models to address hardware lifecycle replacement: Storage-as-a-Service with GPFS, OpenStack for dev/test environments and cluster provisioning.
  29. Wolfgang Friebel (Deutsches Elektronen-Synchrotron Hamburg and Zeuthen (DE))
    13/10/2015, 14:00
    Site Reports
    Updates from DESY Zeuthen.
  30. Dr Shigeki Misawa (Brookhaven National Laboratory)
    13/10/2015, 14:20
    Site Reports
    This site report will discuss the latest developments at the RHIC-ATLAS Computing Facility (RACF).
  31. Dave Kelsey (STFC - Rutherford Appleton Lab. (GB))
    13/10/2015, 14:40
    Security & Networking
    This talk will present a status update from the IPv6 working group, including recent testing and the deployment of (some) dual-stack services and monitoring in WLCG.
  32. Edgar Fajardo Hernandez (Univ. of California San Diego (US))
    13/10/2015, 15:00
    Security & Networking
    This talk will present the latest results from the IPv6 compatibility tests performed on the OSG Software stack.
  33. Eric Sallaz (CERN)
    13/10/2015, 15:50
    Security & Networking
    With the evolution of transmission technologies, going above 10Gb Ethernet requires a complete renewal of the fibre infrastructure. In the last year the CERN Data Centre has evolved to deal with the expansion of the physical infrastructure inside and outside the local site, to support higher speeds such as 40GbE and 100GbE, and to be ready for any other future requirement. We will explain the choice...
  34. Shawn Mc Kee (University of Michigan (US))
    13/10/2015, 16:10
    Security & Networking
    It has been approximately one year since the WLCG Network and Transfer Metrics working group was initiated, and we would like to provide a summary of what has been achieved during this first year and discuss future activities planned for the group. The working group as chartered had a number of objectives: - Identify and make continuously available relevant network and transfer...
  35. Shawn Mc Kee (University of Michigan (US))
    13/10/2015, 16:30
    Security & Networking
    WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing. The WLCG Network and Transfer Metrics working group was established to ensure sites and experiments can better understand and fix networking...
  36. Carles Kishimoto Bisbe (CERN)
    13/10/2015, 16:50
    Security & Networking
    With the virtualization of the data centre, there is a need to move virtual machines transparently across racks when the physical servers are being decommissioned. We will present the solution being tested at CERN using VPLS in an MPLS network.
  37. Liviu Valsan (CERN)
    13/10/2015, 17:10
    Security & Networking
    This presentation provides an update on the global security landscape since the last HEPiX meeting. It describes the main vectors of compromises in the academic community including lessons learnt, presents interesting recent attacks while providing recommendations on how to best protect ourselves. It also covers security risks management in general, as well as the security aspects of the...
  38. Liviu Valsan (CERN)
    13/10/2015, 17:30
    Security & Networking
    The HEP community is facing an ever increasing wave of computer security threats, with more and more recent attacks showing a very high level of complexity. Having a centralised Security Operations Centre (SOC) in place is paramount for the early detection and remediation of such threats. Key components and recommendations to build an appropriate monitoring and detection Security Operation...
  39. Michel Jouvin (Laboratoire de l'Accelerateur Lineaire (FR))
    14/10/2015, 08:50
    Grid, Cloud & Virtualisation
    This presentation is a general introduction; it will also describe the opportunities for cooperation with HEPiX and how sites can participate in GDB.
  40. Arne Wiebalck (CERN)
    14/10/2015, 09:00
    Grid, Cloud & Virtualisation
    This presentation will provide an update of the activities in the CERN Cloud service since the Oxford workshop.
  41. Arne Wiebalck (CERN)
    14/10/2015, 09:20
    Grid, Cloud & Virtualisation
    This talk will summarise our activities related to the optimisation of the virtualised compute resources in our OpenStack-based infrastructure. In particular, we will discuss some of the issues we've encountered and the various optimisations we've applied to bring the virtual resources as close as possible to bare-metal performance.
  42. Sean Crosby (University of Melbourne (AU))
    14/10/2015, 09:40
    Grid, Cloud & Virtualisation
    In the CERN Cloud Computing project, there is a need to ensure that the overall performance of hypervisors and virtual machines does not decrease due to configuration changes, or just because of the passage of time. This talk will outline an automated performance framework currently being developed, which will allow performance of virtual machines and hypervisors to be graphed and linked.
  43. Christopher Hollowell (Brookhaven National Laboratory)
    14/10/2015, 10:00
    Grid, Cloud & Virtualisation
    Application containers have become a competitive alternative to virtualized servers. Containers allow applications to be written once, distributed across a heterogeneous environment (i.e., cloud, remote data centers) and executed transparently on multiple platforms without the performance overhead commonly found on virtual systems. We present an initial evaluation of Docker, along with a...
  44. Domenico Giordano (CERN)
    14/10/2015, 10:50
    Grid, Cloud & Virtualisation
    Performance measurements and monitoring are essential for the efficient use of computing resources as they allow selecting and validating the most effective resources for a given processing workflow. In a commercial cloud environment an exhaustive resource profiling has additional benefits due to the intrinsic variability of a virtualised environment. In this context resource profiling via...
  45. Wataru Takase (KEK)
    14/10/2015, 11:10
    Grid, Cloud & Virtualisation
    Cloud computing enables a flexible provisioning of computing resources by utilizing virtual machines on demand, and can provide an elastic data analysis environment. At KEK we plan to integrate cloud-computing technology into our batch cluster system in order to provide heterogeneous clusters dynamically, to support various kinds of data analyses, and to enable elastic resource allocation...
  46. Chander Sehgal (Fermilab)
    14/10/2015, 11:30
    Grid, Cloud & Virtualisation
    The Open Science Grid (OSG) ties together computing resources from a broad set of research communities, connecting their resources to create a large, robust computing grid. This computing infrastructure started with large HEP experiments such as ATLAS, CDF, CMS, and DZero, and it has evolved so that it now also enables the scientific computation of many non-physics researchers. OSG has been...
  47. Andrea Chierici (INFN-CNAF)
    14/10/2015, 11:50
    Grid, Cloud & Virtualisation
    INDIGO-DataCloud aims at developing a data and computing platform targeted at scientific communities, integrating existing standards and open software solutions. INDIGO proposes: - to build up a PaaS solution leveraging existing resources and e-Infrastructures, since mere access to IaaS resources has been demonstrated not to be a realistic option for most Research Communities -...
  48. Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))
    14/10/2015, 12:10
    Grid, Cloud & Virtualisation
    Running services in containers managed by a scheduler offers a number of potential benefits compared to traditional infrastructure, such as increased resource utilisation through multi-tenancy, the ability to have elastic services, and improved site availability due to self-healing. At the RAL Tier-1 we have been investigating migration of services to an Apache Mesos cluster running on bare...
  49. Brian Lin (University of Wisconsin - Madison)
    14/10/2015, 14:00
    Grid, Cloud & Virtualisation
    HTCondor-CE is a special configuration of HTCondor designed to connect compute resources to the wider grid. Leveraging the power of HTCondor, HTCondor-CE is able to provide built-in security measures, end-to-end job tracking, and better integration with overlay job systems. This talk will present an overview of the HTCondor-CE software, its deployment in the Open Science Grid (OSG), and...
  50. James Botts (LBNL)
    14/10/2015, 14:20
    Grid, Cloud & Virtualisation
    This presentation will describe work that has been done to make NERSC Cray systems friendlier to data-intensive workflows in anticipation of the availability of the NERSC-8 system this autumn. Using Shifter, a Docker-like container technology developed at NERSC by Doug Jacobsen and Shane Canon, the process of delivering software stacks to Cray compute nodes has been greatly...
  51. Dr Qiulan Huang (Institute of High Energy Physics, Chinese Academy of Sciences)
    14/10/2015, 14:40
    Grid, Cloud & Virtualisation
    This report will introduce our plan for computing resource virtualization. We will discuss progress on this project, such as the OpenStack-based infrastructure and virtual machine scheduling and measurement, and we will share experience from our test-bed deployed with OpenStack Icehouse.
  52. Alberto Rodriguez Peon (Universidad de Oviedo (ES))
    14/10/2015, 15:00
    Storage & Filesystems
    CernVM-FS (CvmFS) is a network file system based on HTTP and optimised to deliver experiment software in a fast, scalable, and reliable way. This presentation will review the status of the Stratum 0 deployment at CERN, mentioning some of the challenges faced during its migration to ZFS as the underlying file system.
  53. Alberto Pace (CERN)
    14/10/2015, 15:50
    Storage & Filesystems
    Several discussions are ongoing at CERN concerning the future of AFS and a possible replacement using the existing CERNBox service, based on ownCloud and the CERN disk storage solution developed for LHC computing (EOS). The talk will mainly review the future plans for the CERNBox service and the various use cases that have been identified.
  54. Alexandr Zaytsev (Brookhaven National Laboratory (US))
    14/10/2015, 16:10
    Storage & Filesystems
    We give a report on the status of the Ceph-based storage systems deployed in RACF, which currently provide 1 PB of data storage capacity for the object store (with an Amazon S3 compliant RADOS Gateway front end), block storage (RBD), and shared file system (CephFS with a dCache front end) layers of the Ceph storage system. The hardware deployment and software upgrade procedures performed in the year of...
  55. George Vasilakakos (STFC)
    14/10/2015, 16:30
    Storage & Filesystems
    RAL is currently developing storage services powered by a Ceph object storage system. We review the test results and experiences of the newly-installed 5 PB cluster at RAL, as well as our plans for it. Since the aim is to provide large scale storage for experimental data with minimal space overhead, we focus on testing a variety of erasure coding techniques and schemes. We look at...
  56. Gerard Bernabeu Altayo (Fermilab)
    14/10/2015, 16:50
    Storage & Filesystems
    Fermilab stores more than 110 PB of data employing different technologies (dCache, EOS, BlueArc) to address a wide variety of use cases and application domains. This presentation captures the present state of data storage at Fermilab and maps out future directions in storage technology choices at the lab.
  57. Mr David Fellinger (DataDirect Networks, Inc)
    14/10/2015, 17:10
    Storage & Filesystems
    The acceleration of high performance computing applications in large clusters has primarily been achieved with a focus on the cluster itself. Lower latency interconnects, more efficient message passing structures, higher performance processors, and general purpose graphics processing units have been incorporated in recent cluster designs. There has also been a great deal of study regarding...
  58. Mr Amit Chattopadhyay (WD)
    15/10/2015, 09:00
    Storage & Filesystems
    Given the amount of inline data generated, large volume hard-drive manufacturing is an appropriate environment to employ contemporary 'Big Data' techniques. It can be used to generate a feedback loop, as well as a feed-forward path. The feedback loop helps fix what is broken. The feed-forward path can be used to predict drive health. In this presentation, I will focus on some work we have...
  59. Patrick Fuhrmann (Deutsches Elektronen-Synchrotron Hamburg and Zeuthen (DE))
    15/10/2015, 09:20
    Storage & Filesystems
    The pressure to provide cheap, reliable and unlimited cloud storage space in the commercial area has provided science with affordable storage hardware and open source storage solutions with low maintenance costs and tuneable performance and durability properties, resulting in different cost models per storage unit. Those models, already introduced by WLCG a decade ago (disk vs tape), are now...
  60. Zhenping Liu (Brookhaven National Laboratory (US))
    15/10/2015, 09:40
    Storage & Filesystems
    As the US ATLAS Tier-1 computing facility, the RHIC and ATLAS Computing Facility (RACF) at Brookhaven National Lab has been operating a very large scale dCache disk storage system with a tape back-end to serve a geographically diverse, worldwide ATLAS scientific community. This talk will present the current state of the US ATLAS dCache storage system at BNL. It will describe its structure,...
  61. Natalia Ratnikova (Fermilab)
    15/10/2015, 10:00
    Storage & Filesystems
    Prior to LHC Run 2, CMS collected over 100 PB of physics data on the distributed storage facilities outside CERN, and the storage capacity will increase considerably in the next years. During Run 2 the amount of storage allocated to individual users' analysis data will reach up to 40% of the total space pledged by the CMS sites. The CMS Space Monitoring system is developed to give a...
  62. Dave Rotheroe (HP)
    15/10/2015, 10:20
    IT Facilities & Business Continuity
    HP IT consolidated from 85 ancient data centers running old IT technology to six new mega data centers with modern IT running transformed applications in the late 2000s. That achievement resulted in (literally) billions saved through a meshing of significantly more efficient data centers, utilization of then-current IT technology, and application transformation. Since achieving that, HP IT...
  63. Mr Imran Latif (Brookhaven National Laboratory)
    15/10/2015, 11:10
    IT Facilities & Business Continuity
    The methods and techniques of scientific research at Brookhaven National Laboratory are increasingly dependent upon the ability to acquire, analyze, and store vast quantities of data. The needs for data processing equipment and supporting infrastructure are anticipated to grow significantly in the next few years, soon exceeding the capacity of existing data center resources, currently located...
  64. Eric Bonfillou (CERN)
    15/10/2015, 11:30
    IT Facilities & Business Continuity
    This presentation will give an overview of the recent efforts to establish an accurate and consistent inventory and stock management for CERN data centre assets. The underlying tool, Infor EAM (http://www.infor.com/solutions/eam/), was selected because of its wider usage over many years in other areas at CERN. The presentation will focus on the structuring of the IT assets data and how it is...
  65. Jose Flix Molina (Centro de Investigaciones Energ. Medioambientales y Tecn. - (ES))
    15/10/2015, 11:50
    IT Facilities & Business Continuity
    Energy consumption is an increasing concern for data centers. This contribution summarizes recent energy efficiency upgrades at the Port d’Informació Científica (PIC) in Barcelona, Spain which have considerably lowered energy consumption. The upgrades were particularly challenging, as they involved modifying the already existing machine room, which is shared by PIC with the general IT services...
  66. Mr Michael Ross (HP)
    15/10/2015, 12:10
    IT Facilities & Business Continuity
    In today’s Federal Information Technology world you are faced with many challenges, to name a few: cyber-attacks, unfunded mandates to consolidate data centers, new IT acquisition laws requiring de-duplication with agency-level CIO oversight implemented by the Federal Information Technology Acquisition Reform Act (FITARA), and aging equipment and data center infrastructure. Each of these...
  67. Carlos Fernando Gamboa (Brookhaven National Laboratory (US))
    15/10/2015, 14:00
    Basic IT Services
    Within the High Energy Physics (HEP) software infrastructure, a diverse set of data storage and distribution software technologies is used. Despite their heterogeneity, they all provide the capability to trace application event data in order to troubleshoot problems related to the software stack or its usage. The subsequent data is written once and stored in the majority of cases in a file,...
  68. Cary Whitney (LBNL)
    15/10/2015, 14:20
    Basic IT Services
    The ELK (Elasticsearch, Logstash, Kibana) stack has been chosen as one of the key components of our new center-wide monitoring project. I'll discuss our overall philosophy on monitoring, how ELK fits in, the current structure, and how it is performing. To define center-wide: everything in the data centers, including all hosts, filesystems, most applications, power, cooling, water flow, temperature,...
  69. Miguel Coelho dos Santos (CERN)
    15/10/2015, 14:40
    Basic IT Services
    This presentation will provide an update of the activities concerning IT Monitoring. This includes the monitoring of data centre infrastructure, hardware monitoring, host monitoring and application monitoring; as well as the tools being used or tested.
  70. Alberto Rodriguez Peon (Universidad de Oviedo (ES))
    15/10/2015, 15:00
    Basic IT Services
    An update on CERN’s Configuration Service will be presented. This presentation will review the current status of the infrastructure and describe some of the ongoing work and future plans, with a particular focus on automation and continuous integration. Recent effort to scale and accommodate a higher number of puppet clients will also be mentioned.
  71. Bonnie King (Fermilab)
    15/10/2015, 15:50
    Basic IT Services
    Fermilab has moved from the era of two large multi-decade experiments to hosting several smaller experiments with a shorter lifecycle. Improvements in microcontroller performance have moved computers closer to the experiment Data Acquisition systems, where custom electronics have previously been used. There are also efforts to standardize DAQ software into reusable products in alignment with...
  72. James Adams (STFC RAL)
    15/10/2015, 16:10
    Basic IT Services
    An update of developments and activities in the Quattor community over the last six months.
  73. Christopher Hollowell (Brookhaven National Laboratory)
    15/10/2015, 16:30
    Computing & Batch Services
    NVMe (Non-Volatile Memory express) is a leading-edge SSD technology where drives are directly attached to the PCI-e bus. Typical SAS/SATA controllers are optimized for use with traditional rotating hard drives, and as such can increase latency, and reduce the bandwidth available to attached SSDs. Since NVMe drives bypass a SAS/SATA controller, they can help minimize/eliminate...
  74. Thomas Finnern (DESY)
    15/10/2015, 16:50
    Computing & Batch Services
    This presentation will provide information on the status of the batch systems at DESY Hamburg. This includes the clusters for GRID, HPC and local batch purposes showing the current state and the activities for upcoming enhancements.
  75. Helge Meinhard (CERN), Michele Michelotto (Universita e INFN, Padova (IT))
    15/10/2015, 17:10
    Computing & Batch Services
    Status of the benchmarking working group and work going on in WLCG around benchmarking.
  76. William Edward Strecker-Kellogg (Brookhaven National Laboratory (US)), William Strecker-Kellogg (Brookhaven National Lab)
    16/10/2015, 09:00
    Computing & Batch Services
    Scheduling jobs with heterogeneous requirements to a heterogeneous pool of computers is a challenging task. HTCondor does a great job supporting such a general-purpose setup with features like Hierarchical Group Quotas and Partitionable Slots. At BNL we have a model, configuration, and software to handle the administration of such a pool, and in this talk we will share our experience...
  77. Vanessa HAMAR (CC-IN2P3)
    16/10/2015, 09:20
    Computing & Batch Services
    We have been using Univa Grid Engine as our batch scheduling system, to our satisfaction, for four years. We focus on the latest major version, 8.2.1, which was deployed at CC-IN2P3 four months ago and provides further scalability improvements. We support about 200 groups and experiments running up to 17,000 jobs simultaneously. The requirements, in terms of computing resources, storage or...
  78. Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)
    16/10/2015, 09:40
    Computing & Batch Services
    The goal of the HTCondor team is to develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing (HTC) on large collections of distributively owned computing resources. Increasingly, the work performed by the HTCondor developers is driven by its partnership with the High Energy Physics (HEP) community. This talk will present recent changes...
  79. William Edward Strecker-Kellogg (Brookhaven National Laboratory (US)), William Strecker-Kellogg (Brookhaven National Lab)
    16/10/2015, 10:00
    Computing & Batch Services
    The RACF is a key component in BNL's new Computational Science Initiative (CSI). One of CSI's goals is to leverage the RACF's expertise to shorten the time and effort needed to archive, process and analyze data from non-traditional fields at BNL. This presentation describes a concrete example of how the RACF has helped non-traditional workloads run in the RACF computing environment, and...
  80. Jerome Belleman (CERN)
    16/10/2015, 10:50
    Computing & Batch Services
    So as to have our Batch Service at CERN answer increasingly challenging scalability and flexibility needs, we have chosen to set up a new batch system based on HTCondor. We have set up a Grid-only pilot service and major LHC experiments have started trying it out. While the pilot is slowly becoming production-ready, we're laying out a plan for our next major milestone: to run local jobs too,...
  81. Mizuki Karasawa (BNL)
    16/10/2015, 11:10
    Basic IT Services
    In a rapidly growing facility such as NSLS-II, we use Foreman as an automation tool that integrates with DNS, DHCP, TFTP and Puppet, which makes installation and provisioning much easier and helps bring service/server components online in a short, timely manner. For those who use Puppet Enterprise as a paid ENC, Foreman can also substitute for that. This talk will present the detail...
  82. Mizuki Karasawa (BNL)
    16/10/2015, 11:30
    Basic IT Services
    GitLab, an MIT-licensed open source tool, incorporates a rich set of features for managing Git repositories: code review, issue tracking, activity feeds and wikis. Its most powerful feature, CI for continuous integration, makes code development much more efficient and cost-saving; it is also a great tool for enhancing communication and collaboration. At NSLS-II, we have a great number of...
  83. Dmitry Nilsen
    16/10/2015, 11:50
    Basic IT Services
    Host deployment and configuration technologies at SCC. The Steinbuch Centre for Computing (SCC) at Karlsruhe Institute of Technology (KIT) serves a number of projects, including the WLCG Tier-1 GridKa, the Large Scale Data Facility (LSDF), and the Smart Data Innovation Lab (SDIL), using bare metal and virtual compute resources, and provides a variety of storage and computing services to the...
  84. Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))
    16/10/2015, 12:10
    Basic IT Services
    At RAL we have been considering InfluxDB and Grafana as a possible replacement for Ganglia, in particular for application-specific metrics. Here we present our experiences with setting up monitoring for services such as Ceph, FTS3 and HTCondor, and discuss the advantages and disadvantages of InfluxDB and Grafana over Ganglia.
  85. Helge Meinhard (CERN), Tony Wong (Brookhaven National Laboratory)
    16/10/2015, 12:30