Enterprise tape drives are widely used at major laboratories around the world, such as CERN, US DoE labs and KEK, as well as at commercial data centers. Demands on capacity and I/O speed in the tape market keep growing without limit. Not only drive technology but also media technology is key to meeting such future requirements. Fujifilm is a world-leading company in the market...
We will briefly introduce the history and report the current status of the Computing Research Center at KEK. Many activities and near-future R&D plans, for example on networking, computer security, and private cloud deployment, which have been submitted to this HEPiX workshop, will be summarized.
The Tokyo Tier-2 site, located in the International Center for Elementary Particle Physics (ICEPP) at the University of Tokyo, provides computing resources for the ATLAS experiment in the WLCG.
Updates on the site since the Spring 2017 meeting and a migration plan for the next system upgrade will be presented.
2017 has been a year of change for the Australian HEP site. The loss of a staff member, migration of batch system, and increased use of cloud are just some of the changes happening in Australia. We will provide an update on the happenings in Australia.
ASGC site report on facility deployment, recent activities, collaborations and plans.
This report will talk about the current status and recent updates at IHEP Site since the Spring 2017 report, covering computing, network, storage and other related work.
We will present the latest status of the GSDC, together with the migration plan for its administrative system.
LHCONE is a worldwide network dedicated to the data transfers of HEP experiments. The presentation will explain the origin and architecture of the network, the services and advantages it provides, and the benefits achieved so far. It will also include an update on the latest achievements.
WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing. The OSG Networking Area is a partner of the WLCG effort and is focused on being the primary source of networking information for its partners and...
The TransPAC project has a long history of supporting R&E networking, connecting the Asia Pacific region to the United States to facilitate research. This talk will give an overview of the project for those who may not be familiar with it or its activities and a brief sketch of future plans. Then the talk will cover LHCONE connectivity from our perspective and lay out options for how TransPAC...
The Automated GOLE (AutoGOLE) fabric enables research and education networks worldwide to automate their inter-domain service provisioning. By using the AutoGOLE control plane infrastructure, services to other countries can be setup in minutes. Besides automated provisioning we experiment with connecting high-speed Data Transfer Nodes (DTNs) to the AutoGOLE environment. This talk will discuss...
The Global Research Platform is a worldwide software-defined distributed environment designed specifically for data-intensive science. The talk will show how this environment could be used for experiments like the LHC.
Modern science is increasingly data-driven and collaborative in nature, producing petabytes of data that can be shared by tens to thousands of scientists all over the world. NetSage is a project to develop a unified open, privacy-aware network measurement, and visualization service to better understand network usage in support of these large scale applications. New capabilities to measure and...
As the WLCG data sets grow ever bigger, so will network usage. For those of us with limited budgets, it would be nice if network costs did not grow ever bigger too.
As NDGF is one of the few tier-1 sites in WLCG required to pay full networking costs, including transit, we'll look at the cost breakdown of networking for a tier-1 site and talk about where optimizations might be found.
High Energy Physics (HEP) experiments have greatly benefited from a strong relationship with Research and Education Network (REN) providers and, thanks to projects such as LHCOPN/LHCONE and REN contributions, have enjoyed significant capacities and high-performance networks for some time. RENs have been able to continually expand their capacities to over-provision the networks relative to...
A short introduction and status report.
News from CERN since the workshop at the Hungarian Academy of Sciences.
This is the PIC report to HEPiX Fall 2017.
News about GridKa Tier-1 and other KIT IT projects and infrastructure. We'll focus on our experiences with our new 20+PB online storage installation.
News and updates from the NDGF Tier-1 site.
The focus of this report will be the new disk and tape resources, with some performance numbers from both.
Also some site news from our distributed sites.
PDSF, the Parallel Distributed Systems Facility, was moved to Lawrence Berkeley National Lab from Oakland, CA in 2016. The cluster has been in continuous operation since 1996, serving high energy physics research. It is a Tier-1 site for STAR, a Tier-2 site for ALICE and a Tier-3 site for ATLAS.
The PDSF cluster is in transition this year, moving the batch system from UGE to SLURM...
Site report, news and ongoing activities at the Swiss National Supercomputing Centre (CSCS-LCG2), running ATLAS, CMS and LHCb.
We will present an update on the ATLAS Great Lakes Tier-2 (AGLT2) site since the Spring 2017 report including changes to our networking, storage and deployed middleware. This will include the status of our transition to CentOS/SL7 for both our servers and worker nodes, our upgrade of VMware from 5.5 to 6.5 and our upgrade of Lustre to 2.10.1 + ZFS 0.7.1 as well as our work to install Open...
We will present an update on our sites and cover our work with various efforts
like xrootd storage elements, opportunistic usage of general HPC resources,
and containerization.
We will also report on our latest hardware purchases, as well as
the status of network updates.
We conclude with a summary of successes and problems we encountered
and indicate directions for future work.
As a major WLCG/OSG T2 site, the University of Wisconsin-Madison CMS T2 has consistently been delivering highly reliable and productive services towards large-scale CMS MC production/processing, data storage, and physics analysis for the last 11 years. The site utilises high-throughput computing (HTCondor), a highly available storage system (Hadoop), scalable distributed software systems (CVMFS),...
Last year, KEK upgraded its upstream link to 100 Gbps in April and officially started peering with LHCONE in September. KEK can now distribute huge volumes of data to WLCG sites with adequate throughput, although this upgrade did not have a large impact on the firewalls handling ordinary internet usage from the campus network.
We will report on the changes brought by the LHCONE peering and
how we connect our campus network and...
We presented the design and plan of the network architecture updates at IHEP at HEPiX Spring 2017, and the work was finished in August 2017. This report covers the network architecture upgrades, dual-stack IPv6 tests, network measurement and monitoring at IHEP, and network security upgrades.
Network performance is key to the correct operation of any modern datacentre or campus infrastructure. Hence, it is crucial to ensure the devices employed in the network are carefully selected to meet the required needs.
The established benchmarking methodology [1,2] consists of various tests that create perfectly reproducible traffic patterns. This has the advantage of being able to...
This update from the HEPiX IPv6 Working Group will present the activities of the last 6-12 months. In September 2016, the WLCG Management Board approved the group’s plan for the support of IPv6-only CPU, together with the linked requirement for the deployment of production Tier 1 dual-stack storage and other services. A reminder of the requirements for support of IPv6 and the deployment...
Configuration Release Management (CRM) is rapidly gaining popularity among service managers, as it brings version control, automation and lifecycle management to system administrators. At CERN, most of the virtual and physical machines are managed through the Puppet framework, and the networking team is now starting to use it for some of its services.
This presentation will focus on the...
As presented during HEPiX Fall 2016, a full renewal of the CERN Wi-Fi network was launched in 2016 in order to provide a state-of-the-art Campus-wide Wi-Fi Infrastructure. This year, the presentation will give a status and feedback about this overall deployment. It will provide information about the technical choices made, the methodology used for such a deployment, the issues we faced and how...
As presented at HEPiX Fall 2016, CERN is currently in the process of renewing its standalone Wi-Fi Access Points with a new state-of-the-art, controller-based infrastructure. With more than 4000 new Access Points to be installed, it is desirable to keep the existing deployment procedures and tools to avoid repetitive and error-prone actions during configuration and maintenance steps.
This...
The CERN network infrastructure has several links to the outside world. Some are well identified and dedicated to experiments and research traffic (LHCOPN/LHCONE); some are more generic (general internet). For the latter, specific firewall inspection is required for obvious security reasons, but with tens of gigabits per second of traffic, the firewalls' capacity is highly challenged....
News about what happened at DESY during the last months.
News and updates from GSI IT, e.g.:
- status GreenITCube
- new asset management system
This presentation discusses the new responsibilities of the Scientific Data & Computing Center (SDCC) in high-performance computing (HPC) and how we are leveraging effort and resources to improve BNL community's access to local and leadership-class facilities (LCF's).
Techlab is a CERN IT activity aimed at providing facilities for studies that improve the efficiency of the computing architecture and make better use of the processors available today.
It enables HEP experiments, communities and projects to gain access to machines with modern architectures, for example POWER8, GPU and ARM64 systems.
The hardware is periodically updated based on community...
The HEPiX Benchmarking Working Group has worked on a fast benchmark to estimate the compute power provided by a job slot or an IaaS VM. The Dirac Benchmark 2012 (DB12) scales well with the performance of at least ALICE and LHCb workloads when running within a batch job. The group has now started the development of a next-generation long-running benchmark as a successor to the current HS06 metric.
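The idea of such a fast benchmark can be sketched as a short CPU-bound loop whose runtime is converted into a score. The snippet below is an illustration of the approach only, not the actual DB12 code; the function name and the score normalisation are made up for this example.

```python
import random
import time

def fast_cpu_benchmark(iterations=1_000_000):
    """Illustrative fast benchmark: time a fixed CPU-bound workload
    and report work done per second (higher = faster job slot / VM)."""
    start = time.process_time()
    total = 0.0
    for _ in range(iterations):
        total += random.random() * random.random()
    elapsed = time.process_time() - start
    # Normalise to "millions of loop iterations per CPU second".
    return iterations / elapsed / 1e6
```

Running this at job start-up gives a per-slot score in seconds rather than the hours a full HS06 run takes, which is the trade-off such fast benchmarks make.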
Batch services at CERN have diversified such that computing jobs can
be run everywhere, from traditional batch farms, to disk servers, to
people's laptops, to commercial clouds. This talk offers an overview
of the technologies and tools involved.
The migration of the local batch system BIRD required the
adaptation of different properties like the Kerberos / AFS support, the
automation of various operational tasks and the user and project access. The
latter includes, inter alia, fairshare, accounting and resource access. For
this, some newer features of HTCondor had to be used. We are close to the
user release. Building common...
Founded in 1991, CSCS, the Swiss National Supercomputing Centre, develops and provides the key supercomputing capabilities required to solve important problems for science and society. The centre enables world-class research and provides resources to academia, industry and the business sector. Through an agreement with CHIPP, the Swiss Institute of Particle Physics, CSCS hosts a WLCG tier-2...
The increase of the scale of LHC computing expected for Run 3 and even more so for Run 4 (HL-LHC) over the course of the next 10 years will most certainly require radical changes to the computing models and the data processing of the LHC experiments. Translating the requirements of the physics programme into resource needs is an extremely complicated process and subject to significant...
I'll talk about how the data we collect helped the center get through a heat wave in the Berkeley area. This is significant since the Berkeley computing center does not have any mechanical cooling and relies on the external air temperature and water supply. I will discuss what data we thought we needed, what data we actually needed, and how the idea of saving all the data and collecting as much as we can...
In this presentation, we'll give an overview of the Singularity
container system, and our experience with it at the RACF/SDCC at
Brookhaven National Laboratory. We'll also discuss Singularity's
advantages over virtualization and other Linux namespace-based
container solutions in the context of HTC and HPC applications.
Finally, we'll detail our future plans for this software at our
facility.
The University of Victoria HEP group has been successfully running on distributed clouds for several years using the CloudScheduler/HTCondor framework. The system uses clouds in North America and Europe, including commercial clouds. Over the last years, the operation has been very reliable; we regularly run several thousand jobs concurrently for the ATLAS and Belle II experiments....
Docker container virtualization provides an efficient way to create isolated scientific environments, adjusted and optimized for a specific problem or a specific group of users. It allows responsibilities to be separated efficiently, with IT focusing on infrastructure for image repositories, preparation of basic images, container deployment and scaling, and physicists focusing on application...
The interest in the Internet of Things (IoT) is growing exponentially, so multiple technologies and solutions have emerged to connect almost everything. A ‘thing’ can be a car, a thermometer or a robot that, when equipped with a transceiver, will exchange information over the internet with a defined service. IoT therefore comprises a wide variety of use cases with very different...
We've redesigned our HPC/Grid network to be capable of full network function virtualisation, to be prepared for large numbers of 100 Gbps connections, and to be 400G ready. In this talk we want to take you through the design considerations for a fully non-blocking 6 Tbps virtual network, and the type of features we have built in for the cloudification of our clusters using OpenContrail....
CERN networks are dealing with an ever-increasing volume of network traffic. The traffic leaving and entering CERN must be precisely monitored and analysed to properly protect the networks from potential security breaches. To provide the required monitoring capabilities, the Computer Security team and the Networking team at CERN have joined efforts in designing and deploying a scalable...
In March 2017 Echo went into production at the RAL Tier 1, providing over 7 PB of usable storage to WLCG VOs. This talk will present details of the setup and the operational experience gained from running the cluster in production.
Brief introduction, and call for contributions, to a working group on archival storage at WLCG sites
The EGI CSIRT's main goal is, in collaboration with all resource providers, to keep the EGI e-Infrastructure running and secure. During the past years, under the EGI-Engage project, the EGI CSIRT has been driving the infrastructure in terms of incident prevention and response, but also security training. This presentation provides an overview of these activities, focusing on the impact for the...
This presentation gives an overview of the current computer security landscape. It describes the main vectors of compromise in the academic community, including lessons learnt, and reveals the inner mechanisms of the underground economy, exposing how our resources are exploited by organised crime groups, as well as giving recommendations to protect ourselves. By showing how these attacks are both...
Recently, Japanese universities and academic organizations have experienced severe cyber attacks. To mitigate computer security incidents, we are forced to rethink our strategies in terms of security management and network design.
In this talk, we report the current status and present future directions of KEK computer security.
This is a TLP:RED presentation of a case study. Slides and details will not be made publicly available, and attendees have to agree to treat all information presented as confidential and refrain from sharing details on social media or blog. The presentation focuses on an insider attack and concentrates on the technical aspects of the investigation, in particular the network and file system...
In this contribution the vision for the CERN storage services and their applications will be presented.
Traditionally, the CERN IT Storage group has been focusing on storage for Physics data. A status update will be given about CASTOR and EOS, with the recent addition of the Ceph-based storage for High-Performance Computing.
More recently, the evolution has focused on providing higher-level...
NDGF-T1 is transitioning its dCache storage to a model where dCache is no longer run by the sysadmin but as a normal user. This enables centralized management of the software versions and their configs.
This automation is done with three Ansible roles and a playbook to tie them together.
The end result is software running in an environment much like the cloud.
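As a sketch of how such a setup can be structured (host-group and role names here are illustrative, not the actual NDGF playbook), a top-level playbook tying three roles together might look like:

```yaml
# site.yml -- illustrative only; role names are hypothetical
- hosts: dcache_pools
  become: true          # privileges only to set up the account, not to run dCache
  roles:
    - dcache_user       # create the unprivileged account and directories
    - dcache_software   # fetch the centrally managed dCache release and configs
    - dcache_service    # render layout files and start dCache as that user
```

The key point is that `become` is used only for provisioning; the service itself ends up owned and run by the unprivileged account, much like a cloud-style deployment.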
We describe our use of the Dynafed data federator with cloud computing resources. Dynafed (developed by CERN IT) allows a dynamic data federation, based on the webdav protocol, with the possibility to have a single name space for data distributed over all available sites. It also allows a failover to another copy of a file in case the connection to the closest file location gets interrupted...
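The failover behaviour described above can be illustrated with a small client-side sketch: try a list of replica URLs for the same logical file and fall back to the next copy when one fails. This is not Dynafed code; the URLs and function name are hypothetical, and a real client would speak WebDAV and follow the federator's redirects.

```python
import urllib.request
import urllib.error

# Hypothetical replica endpoints behind a single federated namespace.
REPLICAS = [
    "https://site-a.example.org/webdav/data/file.root",
    "https://site-b.example.org/webdav/data/file.root",
]

def fetch_with_failover(urls, timeout=10):
    """Try each replica in turn; return the body of the first that works."""
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, OSError) as err:
            last_error = err  # this copy is unreachable; try the next one
    raise RuntimeError(f"all replicas failed: {last_error}")
```

In the real system the federator presents the single namespace and picks the copy, so clients do not need to know the replica list themselves.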
The CERN Physics Archive is projected to reach 1 Exabyte during LHC Run 3. As the custodial copy of the data archive is stored on magnetic tape, it is very important to CERN to predict the future of tape as a storage medium.
This talk will give an overview of recent developments in tape storage, and a look forward to how the archival storage market may develop over the next decade. The...
It is now a well-known fact in the HEPiX community that the Elastic stack (formerly known as ELK) is
an extremely useful tool for diving into huge log data sets. It has also been presented multiple times
as lacking the security features so often needed in multi-user environments. Although it now provides
a plugin addressing some of those concerns, it requires the acquisition of a commercial...
In this presentation, I will go over CERN's efforts in improving the security and usability of the management interfaces for various server manufacturers.
We present Riemann: a low-latency, transient shared-state stream processor.
This open-source monitoring tool was written by Kyle Kingsbury and is
maintained by the community. Its unique design makes it as flexible as
it gets by melting the walls between configuration and code. Whenever its rich API
doesn't fit the use case, it's as simple as using any library in the Clojure or Java
ecosystem...
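The stream-processing idea behind Riemann can be illustrated in a few lines: a stream is just a function that receives events and forwards the interesting ones to a handler. Riemann's actual streams are written in Clojure; the Python below is only a toy analogue of the concept.

```python
def make_stream(threshold, alert):
    """Toy event stream: forward events whose metric exceeds
    `threshold` to the `alert` handler (a Riemann-like pattern)."""
    def stream(event):
        if event.get("metric", 0) > threshold:
            alert(event)
    return stream

# Usage: collect alerts for CPU events above 90% utilisation.
alerts = []
cpu_stream = make_stream(0.9, alerts.append)
cpu_stream({"service": "cpu", "metric": 0.95})  # triggers the handler
cpu_stream({"service": "cpu", "metric": 0.50})  # filtered out
```

Because the "configuration" is ordinary code, arbitrary filtering, aggregation and routing logic composes the same way; that is the flexibility the abstract refers to.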
Various cluster monitoring tools have been adapted or developed at IHEP, each showing the health status of one device or aspect of the IHEP computing platform separately. For example, Ganglia shows the machine load, Nagios monitors the service status, the job-monitoring tool developed by IHEP counts the job success rate, and so on. However, the monitoring data from these different tools are independent and not easy...
Our cloud deployment at the Wigner Datacenter (WDC) is undergoing significant changes. We are adopting a new infrastructure: an automated OpenStack deployment using TripleO and configuration management tools like Puppet and Ansible. Over the past few months, our team at WDC has been testing TripleO as the base of our OpenStack deployment. We are also planning a centralized monitoring and logging...
The China Spallation Neutron Source (CSNS) is a neutron source facility for studying neutron characteristics and exploring the microstructure of matter; it will also serve as a high-level scientific research platform oriented to multiple academic disciplines. Scientific research on CSNS requires the support of a high-performance computing environment. So, from the research and practice...
CERN has a great number of applications that rely on a database for their daily operations and the IT Database Services group is responsible for current and future databases and their platform for accelerators, experiments and administrative services as well as for scale-out analytics services including Hadoop, Spark and Kafka. This presentation aims to give a summary of the current state of...
Following various A/C incidents in an Oxford computer room, we developed a solution to automatically shut down servers.
The solution has two parts: a service which monitors the temperatures and publishes them on a web page, and a client which runs on the servers and queries the result to determine whether a shutdown is required. Digitemp software and one-wire temperature sensors are used.
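The client side of such a scheme can be sketched as follows. This is an illustration of the pattern only: the URL, threshold and function names are hypothetical, and the published page is assumed to contain a plain-text temperature reading.

```python
import subprocess
import urllib.request

# Assumed: the monitoring service publishes the current room temperature
# as a plain-text number at this (hypothetical) URL.
STATUS_URL = "http://tempmon.example/status.txt"
SHUTDOWN_AT_C = 35.0

def read_temperature(url=STATUS_URL):
    """Fetch the published temperature from the monitoring service."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return float(resp.read().strip())

def check_and_shutdown(temperature, limit=SHUTDOWN_AT_C, dry_run=True):
    """Return True if the limit is exceeded; trigger a shutdown unless dry_run."""
    if temperature <= limit:
        return False
    if not dry_run:
        # Requires appropriate privileges on the server.
        subprocess.run(["shutdown", "-h", "now"], check=False)
    return True
```

Run periodically (e.g. from cron), this keeps the shutdown decision on each server while the sensors and web page stay with the monitoring service.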
The document converter service provides conversion of most office and some engineering applications to PDF, PDF/A or PostScript. The service has been completely rewritten as an OSS [1] and is based on modern IT technology fostered by the CERN IT department. It is implemented as a RESTful API with a containerised approach using the Openshift technology, EOS storage to store documents and jobs,...
CCIN2P3 is one of the largest academic data centres in France. Its main mission is to provide the particle, astroparticle and nuclear physics community with IT services, including large-scale compute and storage capacities. We are a partner for dozens of scientific experiments and hundreds of researchers that make a daily use of these resources.
It is essential for users to have at their...
The CERN Linux Support team is in charge of providing system images for all Scientific Linux and CentOS CERN users. Currently we mostly test new images manually. To streamline their path to production, we are designing a continuous integration and testing framework which will automate image production and allow for more tests, running them more thoroughly and with more flexibility.
Some remarks on the current design with printer subnets and on managing the CUPS configuration via Chef data bags.
... at LAL
Private cloud deployment is ongoing at KEK. Our cloud will support self-service provisioning and will also be integrated with our batch system in order to provide heterogeneous clusters dynamically. This enables us to support various kinds of data analyses and allows elastic resource allocation among the various projects supported at KEK.
In this talk, we will introduce our OpenStack based cloud...
I will report on material describing how HEPiX was born at KEK in 1991.