Choose timezone

Your profile timezone:

Use timezone based on:

Event/category Custom

Select a custom timezone

Login

HEPiX Fall 2014 Workshop

13–17 Oct 2014

University of Nebraska - Lincoln

America/Chicago timezone

Support

bbockelm@cse.unl.edu

Contribution List

60. Local Organizer Info

Brian Paul Bockelman (University of Nebraska (US))

13/10/2014, 09:00

HEPiX Business

10. Australia Site Report

Sean Crosby (University of Melbourne (AU))

13/10/2014, 09:15

Site reports

Site Reports

An update on the ATLAS Tier 2 and distributed Tier 3 of HEP groups in Australia. Will talk about our integration of Cloud resources, Ceph filesystems and integration of 3rd party storage into our setup

20. Oxford Particle Physics Computing update

Peter Gronbech (University of Oxford (GB))

13/10/2014, 09:30

Site reports

Site Reports

Site report from the University of Oxford Physics department.

19. Site report: NDGF-T1

Erik Mattias Wadenstein (University of Umeå (SE)), Ulf Tigerstedt (CSC Oy)

13/10/2014, 09:45

Site reports

Site Reports

Site report for NDGF-T1, mainly focusing on dCache.

36. Updates from Jefferson Lab HPC and Scientific Computing

Sandy Philpott (JLAB)

13/10/2014, 10:00

Site reports

Site Reports

An overview since our spring meeting on JLab's latest developments for 12 GeV physics computing and storage, Lustre update, openZFS plan, load balancing between HPC and data analysis, Facilities changes in the Data Center, ...

35. KIT Site Report

Andreas Petzold (KIT - Karlsruhe Institute of Technology (DE))

13/10/2014, 10:15

Site reports

Site Reports

News about GridKa Tier-1 and other KIT IT projects and infrastructure.

4. Fermilab Site Report - Fall 2014 HEPiX

Dr Keith Chadwick (Fermilab)

13/10/2014, 11:00

Site reports

Site Reports

Fermilab Site Report - Fall 2014 HEPiX

41. University of Wisconsin Madison CMS T2 site report

Ajit mohapatra (University of Wisconsin (US)), Tapas Sarangi (University of Wisconsin (US))

13/10/2014, 11:15

Site reports

Site Reports

As a major WLCG/OSG T2 site, the University of Wisconsin Madison CMS T2 has provided very productive and reliable services for CMS MonteCarlo production/processing, and large scale global CMS physics analysis using high throughput computing (HT-Condor), highly available storage system (Hadoop), efficient data access using xrootd/AAA, and scalable distributed software systems (CVMFS). An update...

1. INFN-T1 site report

Andrea Chierici (INFN-CNAF)

13/10/2014, 11:30

Site reports

Site Reports

Updates on INFN Tier1 site

40. Scientific Linux current status update

Pat Riehecky (Fermilab)

13/10/2014, 13:30

End-User IT Services & Operating Systems

IT End User and Operating Systems

This presentation will provide an update on the current status of Scientific Linux, descriptions for some possible future goals, and allow a chance for users to provide feedback on its direction.

11. Next Linux Version at CERN.

Thomas Oulevey (CERN)

13/10/2014, 14:00

End-User IT Services & Operating Systems

IT End User and Operating Systems

CERN is maintaining and deploying Scientific Linux CERN since 2004. In January 2014 CentOS and Red Hat announced joining forces in order to provide common platform for open source community project needs. CERN decided to see how CentOS 7 fits his needs and evaluate CentOS release 7 as their next version. An updated report will be provided, as agreed at HEPiX Spring 2014.

34. Experiences with EL 7 at T2_US_Nebraska

Garhan Attebury (University of Nebraska (US))

13/10/2014, 14:30

End-User IT Services & Operating Systems

IT End User and Operating Systems

Seven years have passed since the initial EL 5 release and yet it's still found in active use at many sites. The successor EL 6 is also showing age with its 4th birthday just around the corner. While both are still under support from RedHat for many years to come, it never hurts to prepare for the future. This talk will detail the experiences at T2_US_Nebraska in transitioning towards EL 7...

14. FTS3, large scale file transfer service with simplicity and performance at its best

Michail Salichos (CERN)

13/10/2014, 15:00

End-User IT Services & Operating Systems

IT End User and Operating Systems

FTS3 is the service responsible for globally distributing the majority of the LHC data across the WLCG infrastructure. It is a file transfer scheduler which scales horizontally and it's easy to install and configure. In this talk we would like to bring the attention to the FTS3 features that could attract wider communities and administrators with several new friendly features. We...

28. Issue Tracking and Version Control Services status update

Borja Aparicio Cotarelo (CERN)

13/10/2014, 16:00

End-User IT Services & Operating Systems

IT End User and Operating Systems

The current efforts around the Issue Tracking and Version Control services at CERN will be presented. Their main design and structure will be shown giving special attention to the new requirements from the community of users in terms of collaboration and integration tools and how we address this challenge in the definition of new services based on GitLab for collaboration and Code Review and...

29. Monviso: a portal for metering and reporting CNAF resources usage

Andrea Chierici (INFN-CNAF)

14/10/2014, 09:00

End-User IT Services & Operating Systems

IT End User and Operating Systems

CNAF T1 monitoring and alarming systems produce tons of data describing state, performance and usage of our resources. Collecting this kind of information centrally would benefit both resource administrators and our user community in processing information and generating reporting graphs. We built the “Monviso reporting portal” that consumes a set of key metrics, graphing them based on two...

5. CERN Site Report

Dr Arne Wiebalck (CERN)

14/10/2014, 09:30

Site reports

Site Reports

News from CERN since the Annecy meeting.

33. IRFU site report

Mr Frederic Schaer (CEA)

14/10/2014, 09:45

Site reports

Site Reports

In this site report, we will speak about what changed at CEA/IRFU and what has been interesting since Hepix@Annecy, 6 months ago.

49. DESY Site Report

Andreas Haupt (Deutsches Elektronen-Synchrotron (DE))

14/10/2014, 10:00

Site reports

Site Reports

News from DESY since the Annecy meeting.

44. BNL RACF Site Report

James Pryor (B)

14/10/2014, 10:15

Site reports

Site Reports

A summary of developments at BNL's RHIC/ATLAS Computing Facility since the last HEPiX meeting.

53. AGLT2 Site Report Fall 2014

Shawn Mc Kee (University of Michigan (US))

14/10/2014, 11:00

Site reports

Site Reports

I will present an update on our site since the last report and cover our work with dCache, perfSONAR-PS and VMWare. I will also report on our recent hardware purchases for 2014 as well as the status of our new networking configuration and 100G connection to the WAN. I conclude with a summary of what has worked and what problems we encountered and indicate directions for future work.

55. RAL Site Report

Martin Bly (STFC-RAL)

14/10/2014, 11:15

Site reports

Site Reports

Latest from RAL Tier1

38. LHC@home status - Outlook for wider use of volunteer computing at CERN

Dr Helge Meinhard (CERN)

14/10/2014, 11:30

Grid, Cloud & Virtualisation

Grids, Clouds, Virtualisation

[LHC@home][1] was brought back to CERN-IT in 2011, with 2 projects; Sixtrack and Test4Theory, the latter using virtualization with CernVM. Thanks to this development, there is increased interest in volunteer computing at CERN, notably since native virtualization support has been added to the BOINC middleware. Pilot projects with applications from the LHC experiment collaborations running on...

30. Benchmarking on System on Chip Archtecture and fast benchmarking

Michele Michelotto (Universita e INFN (IT))

14/10/2014, 13:30

Computing & Batch Services

Computing and Batch Systems

The traditional architecture for High Energy Physics is x86-64 but in the community there is interest in processor more efficient in term of computing power per Watt. I'll show my measurement on ARM and Avoton processor. I'll conclude with some measurements on candidate for fast benchmark that are requested by the physics community, mostyl to measure the performance of machine in cloud.

22. Future of Batch Processing at CERN: a Condor Pilot Service

Jerome Belleman (CERN)

14/10/2014, 14:00

Computing & Batch Services

Computing and Batch Systems

The CERN Batch System comprises 4000 worker nodes, 60 queues and offers a service for various types of large user communities. In light of the developments driven by the Agile Infrastructure and the more demanding processing requirements, it will be faced with increasingly challenging scalability and flexibility needs. Last HEPiX, we presented the results of our evaluation of SLURM,...

16. Compute node benchmarks for Compact Muon Solenoid workflows

Samir Cury Siqueira (California Institute of Technology (US))

14/10/2014, 14:30

Computing & Batch Services

Computing and Batch Systems

Hardware benchmarks are often relative to the target application. In CMS sites, new technologies, mostly processors, need to be evaluated on an yearly basis. A framework was developed at the Caltech CMS Tier-2 to benchmark compute nodes with one of the most CPU-intensive CMS workflows - The Tier-0 Reconstruction. The benchmark is a CMS job that reports the results to a central database...

56. HTCondor and HEP Partnership and Activities

Todd Tannenbaum (Univ of Wisconsin-Madison, Wisconsin, USA)

14/10/2014, 15:00

Computing & Batch Services

Computing and Batch Systems

The goal of the HTCondor team is to to develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing (HTC) on large collections of distributively owned computing resources. Increasingly, the work performed by the HTCondor developers is being driven by its partnership with the High Energy Physics (HEP) community. This presentation will provide an...

57. Releasing the HTCondor-CE into the Wild

Brian Paul Bockelman (University of Nebraska (US))

14/10/2014, 15:20

Computing & Batch Services

Computing and Batch Systems

One of the most critical components delivered by the Open Science Grid (OSG) software team is the compute element, or the OSG-CE. At the core of the CE itself is the gatekeeper software for translating grid pilot jobs into local batch system jobs. OSG is in the process of migrating from the Globus gatekeeper to the HTCondor-CE, supported by the HTCondor team. The HTCondor-CE provides an...

58. HTCondor on the Grid and in the Cloud

James Frey

14/10/2014, 16:10

Computing & Batch Services

Computing and Batch Systems

An important use of HTCondor is as a scalable, reliable interface for jobs destined for other scheduling systems. These include Grid intefaces to batch systems (Globus, CREAM, ARC) and Cloud services (EC2, OpenStack, GCE). The High Energy Physics community has been a major user of this functionality and has driven its development. This talk will provide an overview of HTCondor's Grid...

46. Evaluating Infiniband Based Networking Solutions for HEP/NP Data Processing Applications

Mr Alexandr Zaytsev (Brookhaven National Laboratory (US))

14/10/2014, 16:30

Security & Networking

Networking and Security

The Infiniband networking technology is a long established and rapidly developing technology which is currently dominating the field of low-latency, high-throughput interconnects for HPC systems in general and those included in the TOP-500 list in particular. Over the last 4 years a successful use of Infiniband networking technology combined with additional IP-over-IB protocol and Infiniband...

27. Plans for Dual Stack IPv4/IPv6 services on WLCG - an update from the HEPiX IPv6 Working Group

Dave Kelsey (STFC - Rutherford Appleton Lab. (GB))

15/10/2014, 09:00

Security & Networking

Networking and Security

This talk will present an update on the recent activities of the HEPiX IPv6 Working Group including our plans for moving to dual-stack services on WLCG.

18. EEX: ESnet Extension to Europe and ESnet support for the LHC Community

Joe Metzger (LBL)

15/10/2014, 09:30

Security & Networking

Networking and Security

The ESnet Extension to Europe (EEX) project is building out the ESnet backbone in to Europe. The goal of the project is to provide dedicated transatlantic network services that support U.S. DOE funded science. The EEX physical infrastructure build will be substantially completed before the end of December. Initial services will be provided to BNL, FERMI and CERN while the infrastructure is...

26. Do You Need to Know Your Users?

Bob Cowles (BrightLite Information Security)

15/10/2014, 10:00

Security & Networking

Networking and Security

After several years investigation of trends in Identity Management (IdM), the eXtreme Scale Identity Management (XSIM) project has concluded there is little reason for resource providers to provide IdM functions for research collaborations or even for many groups within the institution. An improved user experience and decreased cost can be achieved with "a small amount of programming."

54. OSG IPv6 Software and Operations Preparations

Robert Quick (Indiana University)

15/10/2014, 11:00

Security & Networking

Networking and Security

OSG Operations and Software will soon be configuring our operational infrastructure and middleware components with an IPv6 network stack capabilities in addition to its existing IPv4 stack. For OSG services this means network interfaces will thus have at least one IPv6 address on which it listens, in addition to whatever IPv4 addresses it is already listening on. For middleware components we...

17. Situational Awareness: Computer Security

Dr Stefan Lueders (CERN)

15/10/2014, 11:30

Security & Networking

Networking and Security

Computer security is important as ever outside the HEP community, but also within. This presentation will give the usual overview on recent issues being reported or made public since the last HEPix workshop (like the ripples of "Heartbleed"). It will discuss trends (identity federation and virtualisation) and potential mitigations to new security threats.

24. UPS Monitoring with Sensaphone-A cost-effective solution

Dr Tony Wong (Brookhaven National Laboratory)

15/10/2014, 13:30

IT Facilities & Business Continuity

IT Facilities and Business Continuity

We describe a cost-effective indirect UPS monitoring system that was implemented recently in parts of its RACF complex. This solution was needed to address a lack of centralized monitoring solution, and it is integrated with an event notification mechanism and overall facility management.

8. First Experience with the Wigner Data Centre

Wayne Salter (CERN)

15/10/2014, 14:00

IT Facilities & Business Continuity

IT Facilities and Business Continuity

After a tender for a CERN remote Tier0 centre issued at the end of 2011, and awarded to the Wigner Data Centre in May 2012, operations commenced at the beginning of 2013. This talk will give a brief introduction to the history of this project and it scope. It will then summarise the initial experience that has been gained to-date and highlight a number of issues that have been encountered;...

48. The Lustre Filesystem for Petabyte Storage at the Florida HPC Center

Dr Dimitri Bourilkov (University of Florida (US))

15/10/2014, 14:30

Storage & Filesystems

Storage and Filesystems

Design, performance, scalability, operational experience, monitoring, different modes of access and expansion plans for the Lustre filesystems, deployed for high performance computing at the University of Florida, are described. Currently we are running storage systems of 1.7 petabytes for the CMS Tier2 center and 2.0 petabytes for the university-wide HPC center.

47. EOS across 1000 km

Luca Mascetti (CERN)

15/10/2014, 15:00

Storage & Filesystems

Storage and Filesystems

In this contribution we report our experience in operating EOS, the CERN-IT high-performance disk-only solution, in multiple Computer Centres. EOS is one of the first production services exploiting the CERN's new facility located in Budapest, using his stochastic geo-location of data replicas. Currently EOS holds more than 100PB of raw disk space for the four big experiments (ALICE, ATLAS,...

50. Using XRootD to Minimize Hadoop Replication

Jeffrey Dost (UCSD)

15/10/2014, 16:00

Storage & Filesystems

Storage and Filesystems

We have developed an XRootD extension to Hadoop at UCSD that allows a site to significantly free local storage space by taking advantage of the file redundancy already provided by the XRootD Federation. Rather than failing when a corrupt portion of a file is accessed, the hdfs-xrootd-fallback system retrieves the segment from another site using XRootD, thus serving the original file to the end...

52. Cernbox + EOS: Cloud Storage for Science

Luca Mascetti (CERN)

15/10/2014, 16:30

Storage & Filesystems

Storage and Filesystems

Cernbox is a cloud synchronization service for end-users: it allows to sync and share files on all major platforms (Linux, Windows, MacOSX, Android, iOS). The very successful beta phase of the service demonstrated high demand in the community for such easily accessible cloud storage solution. Integration of Cernbox service with the EOS storage backend is the next step towards providing sync...

6. Addressing the VM IO bottleneck

Dr Arne Wiebalck (CERN)

15/10/2014, 17:00

Grid, Cloud & Virtualisation

Grids, Clouds, Virtualisation

This is summary of our efforts to address the issue of providing sufficient IO capacity to VMs running in our OpenStack cloud.

42. OpenZFS on Linux

Brian Behlendorf (LLNL)

16/10/2014, 09:00

Storage & Filesystems

Storage and Filesystems

OpenZFS is a storage platform that encompasses the functionality of a traditional filesystem and volume manager. It's highly scalable, provides robust data protection, supports advanced features like snapshots and clones, and is easy to administer. These features make it an appealing choice for HPC sites like LLNL which uses it for all production Lustre filesystems. This contribution...

43. SSD benchmarking at CERN

Liviu Valsan (CERN)

16/10/2014, 09:30

Storage & Filesystems

Storage and Filesystems

Flash storage is slowly becoming more and more prevalent in the High Energy Physics community. When deploying Solid State Drives (SSDs) it's important to understand their capabilities and limitations, allowing to choose the best adapted product for the use case at hand. Benchmarking results from synthetic and real-world workloads on a wide array of Solid State Drives will be presented. The new...

39. Ceph Based Storage Systems for RACF

Mr Alexandr Zaytsev (Brookhaven National Laboratory (US))

16/10/2014, 10:00

Storage & Filesystems

Storage and Filesystems

Ceph based storage solutions are becoming increasingly popular within the HEP/NP community over the last few years. With the current status of Ceph project, both object storage and block storage layers are production ready on a large scale, and the Ceph file system storage layer (CephFS) is rapidly getting to that state as well. This contribution contains a thorough review of various...

12. DPM performance tuning hints for HTTP/WebDAV and Xrootd

Andrea Manzi (CERN)

16/10/2014, 11:00

Storage & Filesystems

Storage and Filesystems

In this contribution we give a set of hints for the performance tuning of the upcoming DPM releases, and we show what one can achieve by looking at different graphs taken from the DPM nightly performance tests. Our focus is on the HTTP/WebDAV and Xrootd protocols and the newer "dmlite" software framework, and some of these hints may give some benefit also to older, legacy protocol...

2. Experience in running relational databases on clustered storage

Ruben Domingo Gaspar Aparicio (CERN)

16/10/2014, 11:20

Storage & Filesystems

Storage and Filesystems

CERN IT-DB group is migrating its storage platform, mainly NetApp NAS’s running on 7-mode but also SAN arrays, to a set of NetApp C-mode clusters. The largest one is made of 14 controllers and it will hold a range of critical databases from administration to accelerators control or experiment control databases. This talk shows our setup: network, monitoring, use of features like transparent...

31. New High Availability Storage at PDSF

Tony Quan (LBL)

16/10/2014, 11:40

Storage & Filesystems

Storage and Filesystems

The PDSF Cluster at NERSC has been providing a data-intensive computing resource for experimental high energy particle and nuclear physics experiments (currently Alice, ATLAS, STAR, ICECUBE, MAJORANA) since 1996. Storage is implemented as a GPFS cluster built out of a variety of commodity hardware (Dell, Raidinc, Supermicro storage and servers). Recently we increased its capacity by 500TB by...

32. Developing Nagios code to suspend checks during planned outages

Ray Spence (u)

16/10/2014, 13:30

Basic IT Services

Basic IT Services

Lawrence Berkeley National Laboratory/NERSC Division Developing Nagios code to suspend checks during planned outages. Raymond E. Spence NERSC currently supports more than 13,000 computation nodes spread over six supercomputing or clustered systems. These systems access cumulatively more than 13.5PB of disk space via thousands of network interfaces. This environment enables scientists...

25. Configuration Services at CERN: update

Ben Jones (CERN)

16/10/2014, 13:50

Basic IT Services

Basic IT Services

A status of the Puppet-based Configuration Service at CERN will presented giving a general update and discussing our current plans for the next 6 months. The presentation will also highlight the work being done to secure the Puppet infrastructure making it appropriate for use by a large number of administratively distinct user-groups.

23. CFEngine Application at AGLT2

Mr Ben Meekhof (University of Michigan)

16/10/2014, 14:15

Basic IT Services

Basic IT Services

CFEngine is a highly flexible configuration management framework. It also has a very high learning curve which can sometimes make decisions about how to deploy and use it difficult. At AGLT2 we manage a variety of different systems with CFEngine. We also have an effective version-controlled workflow for developing, testing, and deploying changes to our configuration. The talk will...

21. Puppet at USCMS-T1 and FermiLab - Year 2

Timothy Michael Skirvin (Fermi National Accelerator Lab. (US))

16/10/2014, 14:40

Basic IT Services

Basic IT Services

USCMS-T1's work to globally deploy Puppet as our configuration management tool is well into the "long tail" phase, and has changed in fairly significant ways since its inception. This talk will discuss what has worked, how the Puppet tool itself has changed over the project, and our first thoughts as to what we expect to be doing in the next year (hint: starting again is rather likely!).

3. DataBase on Demand : insight how to build your on DBaaS

Ruben Domingo Gaspar Aparicio (CERN)

16/10/2014, 15:05

Basic IT Services

Basic IT Services

Inspired on different database as a service, DBaas, providers, the database group at CERN has developed a platform to allow CERN user community to run a database instance with database administrator privileges providing a full toolkit that allows the instance owner to perform backup/ point in time recoveries, monitoring specific database metrics, start/stop of the instance and...

59. Joint procurement of IT equipment and services

Wayne Salter (CERN)

16/10/2014, 16:00

IT Facilities & Business Continuity

IT Facilities and Business Continuity

The presentation describes options for joint activities around procurement of equipment and services by public labs, possibly with funding by the European Commission. The presentation is intended to inform the community and check whether there is interest.

51. Configuration Management, Change Management, and Culture Management

James Pryor (B)

16/10/2014, 16:20

Basic IT Services

Basic IT Services

In 2010, the RACF at BNL began investigating Agile/DevOps practices and methodologies to be able to do more in less time or effort. We choose Puppet in 2010 and by Spring of 2011 we had converted about half our of configuration shell scripts into Puppet code on a handful of machines. Today we have scaled Puppet 3.x to support our entire facility and and host a common Puppet code base that is...

9. Ermis service for DNS Load Balancer configuration

Aris Angelogiannopoulos (Ministere des affaires etrangeres et europeennes (FR))

16/10/2014, 16:45

Basic IT Services

Basic IT Services

This presentation describes the implementation and use cases of the Ermis Service. Ermis is a RESTful service to manage the configuration of DNS load balancers. It enables direct creation and deletion of DNS delegated zones using a SOAP interface provided by the Network group thus simplifying the procedure needed for supporting new services. It is written in Python as a Django Application....

37. RAL Tier 1 Cloud Computing Developments

Ian Peter Collier (STFC - Rutherford Appleton Lab. (GB))

17/10/2014, 09:00

Grid, Cloud & Virtualisation

Grids, Clouds, Virtualisation

Update on the RAL Tier 1 cloud deployment and cloud computing activities.

15. The Adoption of Cloud Technologies within the LHC Experiments

Laurence Field (CERN)

17/10/2014, 09:30

Grid, Cloud & Virtualisation

Grids, Clouds, Virtualisation

The adoption of cloud technologies by the LHC experiments is currently focused on IaaS, more specifically the ability to dynamically create virtual machines on demand. This talk provides an overview of how this alternative approach for resource provision fits into the existing workflows used by the experiments. It shows that in order to fully exploit this approach, solutions are required in...

13. Evolution of WLCG monitoring

Dr Edward Karavakis (CERN)

17/10/2014, 10:00

Grid, Cloud & Virtualisation

Grids, Clouds, Virtualisation

The WLCG monitoring system provides a solid and reliable solution that has supported LHC computing activities and WLCG operations during the first years of LHC data-taking. The current challenge consists of ensuring that the WLCG monitoring infrastructure copes with the constant increase of monitoring data volume and complexity (new data-transfer protocols, new dynamic types of resource...

7. CERN Cloud Report

Dr Arne Wiebalck (CERN)

17/10/2014, 10:30

Grid, Cloud & Virtualisation

Grids, Clouds, Virtualisation

This is a report on the current status of CERN's OpenStack-based Cloud Infrastructure.

45. IHEP Site Report

Jingyan Shi (IHEP)

Site reports

It's the site report including what we have done with the storage, computing. Besides, it will discuss the serious error happened with our central switch and how we deal with. The progress of cloud computing based on openstack will be also discussed.