HEPiX Spring 2015 Workshop
23–27 Mar 2015
Physics Department, Oxford University
Europe/London timezone

Contribution List

86 contributions
  1. Prof. John Wheater (Oxford University)
    23/03/2015, 09:00
    Miscellaneous
  2. Peter Gronbech (University of Oxford (GB))
    23/03/2015, 09:10
    Miscellaneous
    Workshop Logistics
  3. Johan Henrik Guldmyr (Helsinki Institute of Physics (FI))
    23/03/2015, 09:15
    Site reports
    - CSC general HPC updates
    - New Haswell hardware for supercluster and supercomputer
    - Slurm/Lustre
    - taito-shell.csc.fi: a Slurm/sshd/iptables-based interactive shell load balancer, which replaced the large-memory (1 TB) interactive nodes
    - DDN SFA12K
    - Using the ELK stack (Elasticsearch, Logstash, Kibana) to sort through dCache logs
    - Search through auditd logs
    - Has anybody used shred? root...
  4. Paul Kuipers (Nikhef)
    23/03/2015, 09:30
    Site reports
    Spring 2015 site report
  5. Jingyan Shi (IHEP)
    23/03/2015, 09:45
    Site reports
The status of the IHEP site, the improvements we have made, and our plans for this year.
  6. Martin Adam (UJF Rez), Václav Říkal
    23/03/2015, 10:00
    Site reports
We will give an overview of the site and share experience with these topics: migration of virtualized servers to a new infrastructure, migration from CFEngine to Puppet and Spacewalk as the new systems management solution, and procurement of new hardware (worker nodes and storage servers).
  7. Giuseppe Misurelli
    23/03/2015, 10:15
    Site reports
    Update on INFN-T1
  8. Walter Schon
    23/03/2015, 10:55
    Site reports
  9. William Strecker-Kellogg (Brookhaven National Lab)
    23/03/2015, 11:10
    Site reports
    Brookhaven National Lab (BNL) will present the site report for the RHIC-ATLAS Computing Facility (RACF), covering developments over the past 6 months.
  10. James Botts (LBNL)
    23/03/2015, 11:25
    Site reports
PDSF, the Parallel Distributed Systems Facility, has served high energy physics in continuous operation at NERSC since 1996. It is currently a Tier-1 site for STAR, Tier-2 for ALICE and Tier-3 for ATLAS. This site report will describe recent updates to the system and upcoming modifications. PDSF will move this year from its current site to a new building on the LBNL campus and...
  11. Garhan Attebury (University of Nebraska (US))
    23/03/2015, 11:40
    Site reports
    Site report covering the status of T2_US_Nebraska and changes / updates since the Fall 2014 meeting.
  12. Sandy Philpott (JLAB)
    23/03/2015, 11:55
    Site reports
    Current high performance and experimental physics computing environment updates: core exchanges between USQCD and Experimental Physics clusters for load balancing, job efficiency, and 12GeV data challenges; Nvidia K80 GPU experiences and updated Intel MIC environment; update on locally developed workflow tools and write-through to tape cache filesystem; status of LTO6 integration into our MSS;...
  13. Mr Peter van der Reest (DESY)
    23/03/2015, 12:10
    Site reports
    DESY site report
  14. Mr Julien Carpentier (CCIN2P3)
    23/03/2015, 12:25
    Site reports
We will present the latest status of the IN2P3 Computer Center. Emphasis will be placed on the infrastructure and system areas.
  15. Lisa Gerhardt (LBNL), Mr Yushu Yao (LBNL)
    23/03/2015, 14:00
    End-User IT Services & Operating Systems
    SciDB is an open-source analytical database for scalable complex analytics on very large array or multi-structured data from a variety of sources, programmable from Python and R. It runs on HPC, commodity hardware grids, or in a cloud and can manage and analyze terabytes of array-structured data and do complex analytics in-database. We present an overall description of the SciDB framework and...
  16. Mr Michel Jouvin (Laboratoire de l'Accelerateur Lineaire (FR))
    23/03/2015, 14:25
    End-User IT Services & Operating Systems
The HEP Software Foundation (HSF) is a one-year-old initiative to foster collaboration in software development in the HEP community and related scientific communities. Launched by a kick-off meeting at CERN in April 2014, the first year has been spent better defining what HSF should be. An HSF workshop was held in January at SLAC and HSF is now entering its "implementation phase". This talk...
  17. Mr Andreas Wagner (CERN)
    23/03/2015, 14:50
    End-User IT Services & Operating Systems
    • Status of CERN Web Services
      • Overview
      • Web Site Life Cycle Management
      • Web Analytics
    • CERN’s Enterprise Social Networking System
      • Motivation & purpose
      • Feature overview: microblogging, profiles, social networking, suggestion systems and discussion forums
    • CERN Search...
  18. Thomas Baron (CERN)
    23/03/2015, 15:15
    End-User IT Services & Operating Systems
A lot of visible and behind-the-scenes actions have been taken in recent months to prepare CERN conferencing services (Indico, Vidyo, the webcast and conference room services) for challenges to come. These services will be described in terms of features and usage statistics. We will present their integration with the CERN layered cloud infrastructure and with other IT base services. We will...
  19. Connie Sieh (FNAL)
    23/03/2015, 16:05
    End-User IT Services & Operating Systems
    Current Status of Scientific Linux
  20. Dr Arne Wiebalck (CERN)
    23/03/2015, 16:30
    End-User IT Services & Operating Systems
    In this talk we will present a brief status update on CERN's work on CentOS 7, the uptake by the various IT services, and the interaction with the upstream CentOS community.
  21. Mr Emyr James (Wellcome Trust, Sanger Institute)
    23/03/2015, 16:55
    End-User IT Services & Operating Systems
The Wellcome Trust Sanger Institute is a charitably funded genomic research centre. A leader in the Human Genome Project, it is now focused on understanding the role of genetics in health and disease. Large amounts of data are produced at the institute by next-generation sequencing machines. The data are then stored, processed and analysed on the institute's computing cluster. The main compute...
  22. Wayne Salter (CERN)
    23/03/2015, 17:20
    IT Facilities & Business Continuity
    Many of you are aware of the power incident we had on the 16th October during the last HEPiX workshop. I will give a detailed explanation of what happened, the impact on IT services as well as the actions taken to recover from the incident. I will also note some improvements that will be implemented as a result of this incident. I will then go on to discuss other operations incidents that we...
  23. Jose Flix Molina (Centro de Investigaciones Energ. Medioambientales y Tecn. (ES))
    24/03/2015, 09:00
    Site reports
We will review the status of the PIC Tier-1 as of Spring 2015: the typical site report given at HEPiX.
  24. Dr Sean Brisbane (University of Oxford)
    24/03/2015, 09:15
    Site reports
    A site report from the University of Oxford focusing on the integration challenges between the various systems.
  25. Dr Arne Wiebalck (CERN)
    24/03/2015, 09:30
    Site reports
    News from CERN since the Lincoln meeting.
  26. Tina Friedrich (Diamond Light Source Ltd)
    24/03/2015, 09:45
    Site reports
    Diamond Light Source site report
  27. Sang Un Ahn (KiSTi Korea Institute of Science & Technology Information (KR))
    24/03/2015, 10:00
    Site reports
The status of the KISTI-GSDC Tier-1 site will be presented, including a brief history of the site, a system summary (configuration management), PBS batch issues, Tier-1 operations and future plans.
  28. Martin Bly (STFC-RAL)
    24/03/2015, 10:15
    Site reports
    Latest updates for the RAL Tier-1.
  29. Nils Hoimyr (CERN)
    24/03/2015, 10:55
    End-User IT Services & Operating Systems
An update will be given on the status of collaborative tools for software developers: version control services (Git and SVN), issue tracking (JIRA), integration (Jenkins) and documentation (TWiki). The presentation will focus on collaborative aspects for software developers and report on progress since the fall meeting.
  30. Mr Dirk Jahnke-Zumbusch (DESY)
    24/03/2015, 11:15
    End-User IT Services & Operating Systems
After more than ten years of operation, the game is over for Exchange 2003 at DESY. Zimbra has now been put into production, and data from both Exchange 2003 and the UNIX mail service are being migrated and consolidated gradually. The architecture of the Zimbra mail service, the migration procedures and some experiences will be presented. Finally we will look at some integration aspects of...
  31. Nils Hoimyr (CERN)
    24/03/2015, 11:40
    End-User IT Services & Operating Systems
Status of LHC@home, volunteer computing at CERN and for the LHC experiments. The presenter will give an update on the volunteer computing strategy for HEP and different scenarios for the use of volunteer cloud computing or other lightweight cloud infrastructures to run experiment code under CernVM on available computing resources. Furthermore, the current status of the CERN BOINC server...
  32. Rennie S. Scott (FNAL), Connie Sieh (Fermilab)
    24/03/2015, 12:05
    Site reports
    Site report from Fermilab
  33. Adam Lukasz Krajewski (Warsaw University of Technology (PL))
    24/03/2015, 12:20
    Security & Networking
    Following an incident with a slow database replication between CERN's data centers, we discovered that even a very low rate packet loss in the network (order of 0.001%) can induce significant penalties to long distance single stream TCP transfers. We explore the behaviour of multiple TCP congestion control algorithms in a controlled loss and delay environment in order to understand...
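The throughput penalty described in this abstract is commonly estimated with the Mathis et al. model for a single TCP stream under random loss; the sketch below is illustrative (function name and numbers are ours, not taken from the talk):

```python
import math

def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    """Upper bound on single-stream TCP throughput (bytes/s) from the
    Mathis et al. model: rate <= (MSS / RTT) * (C / sqrt(p)), C ~ sqrt(3/2)."""
    C = math.sqrt(3.0 / 2.0)
    return (mss_bytes / rtt_s) * C / math.sqrt(loss_rate)

# A long-distance link: 1460-byte MSS, 150 ms RTT, 0.001% random loss.
rate = mathis_throughput(1460, 0.150, 1e-5)
print(f"{rate * 8 / 1e6:.1f} Mbit/s")  # only ~30 Mbit/s despite the tiny loss rate
```

Because the bound scales as 1/sqrt(p), even a loss rate of 0.001% caps a 150 ms single stream at tens of Mbit/s, which is consistent with the slow replication the abstract describes.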
  34. Mr Romain Wartel (CERN)
    24/03/2015, 14:00
    Security & Networking
This presentation gives an overview of the current computer security landscape. It describes the main vectors of compromise in the academic community, including lessons learnt, reveals the inner mechanisms of the underground economy to expose how our resources are exploited by organised crime groups, and gives recommendations to protect ourselves. By showing how these attacks are both...
  35. Ian Peter Collier (STFC - Rutherford Appleton Lab. (GB))
    24/03/2015, 14:25
    Security & Networking
    Report on the initial activities of the WLCG Cloud Traceability Working Group
  36. Linda Ann Cornwall (STFC - Rutherford Appleton Lab. (GB))
    24/03/2015, 14:50
    Security & Networking
The European Grid Infrastructure (EGI) and the Worldwide LHC Computing Grid (WLCG) infrastructure largely overlap and share the majority of security activities. A lot of security-related activity goes on behind the scenes in such a large-scale distributed computing infrastructure. Security incident prevention takes up the larger share of the effort, and is carried out via...
  37. David Crooks (University of Glasgow (GB))
    24/03/2015, 15:15
    Security & Networking
    OSSEC, the popular HIDS (Host Intrusion Detection System), has been widely used for a number of years. More recently, tools like Elasticsearch, Logstash and Kibana (ELK) have become popular in visualising and working with data such as that aggregated by OSSEC. We report on a recent implementation of OSSEC, coupled to an ELK instance, at the Glasgow site of the UKI-SCOTGRID distributed Tier-2....
  38. Marian Babik (CERN)
    24/03/2015, 15:50
    Security & Networking
    WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing. The WLCG Network and Transfer Metrics working group was established to ensure sites and experiments can better understand and fix networking issues....
  39. Dave Kelsey (STFC - Rutherford Appleton Lab. (GB))
    24/03/2015, 16:15
    Security & Networking
    This talk will present an update from the HEPiX IPv6 Working Group. This will include details of recent testing activities and plans for the deployment of dual-stack data services and monitoring on (at least some of) the WLCG infrastructure.
  40. Ulf Bobson Severin Tigerstedt (Helsinki Institute of Physics (FI))
    24/03/2015, 16:40
    Security & Networking
A look back on testing IPv6 and different versions of dCache as it has evolved from 2.6 to 2.12, and from barely working to working well.
  41. Francesco Prelz (Università degli Studi e INFN Milano (IT))
    24/03/2015, 16:55
    Security & Networking
    Probably the most prominent change that IPv6 introduces in the semantics of internet protocol applications is the need to *always* deal with multiple addresses (possibly both IPv4 and IPv6) associated to each network endpoint. A quick overview of how and where addresses are categorised, ordered and preferred is presented, both from the system administrator and the developer viewpoint. A few...
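The multiple-address semantics described above can be observed directly from the resolver; a minimal sketch (the helper name is ours) listing every address a hostname maps to, in the order the system prefers them:

```python
import socket

def endpoint_addresses(host, port):
    """Resolve all addresses (IPv4 and IPv6) for an endpoint, in the
    order the resolver prefers them (RFC 6724 rules on most systems)."""
    seen = []
    for family, _, _, _, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        seen.append((label, sockaddr[0]))
    return seen

# A robust client must try each returned address in turn instead of
# assuming a single address per hostname.
for label, addr in endpoint_addresses("localhost", 80):
    print(label, addr)
```

On a dual-stack host this typically prints both an IPv6 and an IPv4 entry for the same name, which is exactly the "always deal with multiple addresses" point the abstract makes.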
  42. Kacper Surdy (CERN)
    24/03/2015, 17:20
    Storage & Filesystems
    There are terabytes of data stored in a relational database (Oracle) at CERN which in fact does not need a relational model. Moreover, using a relational database management system very often brings a significant overhead in terms of resource utilization. The problem is notably observable for warehouse-type data sets. At the same time running analytical workloads on such data sets requires...
  43. Dave Kelsey (STFC - Rutherford Appleton Lab. (GB)), Dr Shawn McKee (University of Michigan ATLAS Group)
    24/03/2015, 18:00
  44. Julien Leduc (CERN)
    25/03/2015, 09:00
    IT Facilities & Business Continuity
CERN Computer Center (CC) is a large building that integrates several kilometers of fibers, copper cables and pipes, and several complex installations (UPSes, water cooling, heat exchangers...). This evolving building is a large theater with numerous actors:
    - contractors, performing construction work, building maintenance or hardware replacement
    - engineers and technicians, debugging...
  45. Yves Kemp (Deutsches Elektronen-Synchrotron (DE))
    25/03/2015, 09:25
    Storage & Filesystems
Recent advances in both hard disks and system-on-a-chip (SoC) designs have enabled the development of a novel form of hard disk: a disk that includes a network interface and an additional ARM processor not involved in low-level disk operations. This setup allows those disks to run an operating system and to communicate with other nodes autonomously over wired Ethernet. No additional hardware or...
  46. Stefan Dietrich (DESY)
    25/03/2015, 09:50
    Storage & Filesystems
    PETRA III is DESY's largest ring accelerator and the most brilliant storage-ring-based X-ray radiation source in the world. With its recent extension, new and faster detectors are used for the data acquisition. They exceed previous detectors in terms of data rate and volume; this is highly demanding for the underlying storage system. This talk will present the challenges we faced, the new...
  47. Yves Kemp (Deutsches Elektronen-Synchrotron (DE))
    25/03/2015, 10:15
    Storage & Filesystems
The presentation will cover:
    - History and current status of the BeeGFS project (formerly known as FhGFS, originating from Fraunhofer)
    - Design and technology decisions made by the BeeGFS developers
    - BeeGFS setup and operational experience as an InfiniBand-based high-performance cluster file system serving as scratch space for the DESY HPC system
    - Discussion of future usage scenarios and...
  48. Alastair Dewhurst (STFC - Rutherford Appleton Lab. (GB))
    25/03/2015, 11:05
    Storage & Filesystems
    RAL is currently exploring the possibilities offered by Ceph. This talk will describe two of these projects. The first project aims to provide large scale, high throughput storage for experimental data. This will initially be used by the WLCG VOs. A prototype cluster built from old hardware has been in testing since October 2014. The WLCG VOs will continue to need to access their data via...
  49. Herve Rousseau (CERN)
    25/03/2015, 11:30
    Storage & Filesystems
    Ceph has become over time a key component of CERN’s Agile Infrastructure by providing storage for the Openstack service. In this talk, we will briefly introduce Ceph’s concepts, our current cluster and the services we provide such as NFS filers, Object Store for the Atlas experiment and Xroot-to-Ceph gateways. We will then talk about our experience running Ceph with some real-world...
  50. Dr Ofer Rind (BROOKHAVEN NATIONAL LABORATORY)
    25/03/2015, 11:55
    Storage & Filesystems
    We review various functionality, performance, and stability tests performed at the RHIC and ATLAS Computing Facility (RACF) at Brookhaven National Laboratory (BNL) in 2014-2015. Tests were run on all three (object storage, block storage and file system) levels of Ceph, using a range of hardware platforms and networking solutions, including 10/40 Gbps Ethernet and IPoIB/4X FDR Infiniband. We...
  51. Mr John Spray (Red Hat, Inc.)
    25/03/2015, 12:20
    Storage & Filesystems
    The Ceph storage system is an open source, highly scalable, resilient data storage service providing object, block and file interfaces. This presentation will introduce what is new in the latest Ceph release, codenamed *Hammer*, and describe the ongoing development activities around CephFS, the Ceph filesystem. An intermediate level of familiarity with large scale storage systems will be assumed.
  52. Dr Arne Wiebalck (CERN)
    25/03/2015, 13:00
  53. Peter Love (Lancaster University (GB))
    25/03/2015, 14:00
    Computing & Batch Services
This contribution describes the usage and benchmarking of a commercial data centre running OpenStack. Different cloud provisioning tools are described, highlighting the pros and cons of each system. A comparison is made between this facility and a standard grid T2 site in terms of job throughput and availability. Usage of the centre's local object store is also described.
  54. Dr Tony Wong (Brookhaven National Laboratory)
    25/03/2015, 14:25
    Computing & Batch Services
The RHIC-ATLAS Computing Facility (RACF) at BNL has traditionally evaluated hardware on-site, with physical access to the systems. The effort to request evaluation hardware, shipping, set-up and testing has consumed an increasing amount of time, and the process has become less productive over the years. To regain past productivity and shorten the evaluation process, BNL has started a pilot...
  55. Gang Qin (University of Glasgow (GB))
    25/03/2015, 14:50
    Computing & Batch Services
    Modern Linux Kernels include a feature set that enables the control and monitoring of system resources, called Cgroups. Cgroups have been enabled on a production HTCondor pool sited at the Glasgow site of the UKI-SCOTGRID distributed Tier-2. A system has been put in place to collect and aggregate metrics extracted from Cgroups on all worker nodes within the Condor pool. From this...
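A per-job metric collector of the kind described above can be sketched by walking the cgroup filesystem; the paths below assume a cgroup-v1 memory controller with an `htcondor` hierarchy and are illustrative, not the actual Glasgow setup:

```python
from pathlib import Path

# Hypothetical layout: cgroup v1 memory controller; a batch system such
# as HTCondor places each job in its own child cgroup under this root.
CGROUP_ROOT = Path("/sys/fs/cgroup/memory/htcondor")

def job_memory_metrics(root=CGROUP_ROOT):
    """Collect per-job memory usage (bytes) by walking the job cgroups.

    Returns an empty dict when the hierarchy is absent, so the collector
    degrades gracefully on nodes without cgroups enabled."""
    metrics = {}
    for job_dir in (root.iterdir() if root.is_dir() else []):
        usage_file = job_dir / "memory.usage_in_bytes"
        if usage_file.is_file():
            metrics[job_dir.name] = int(usage_file.read_text())
    return metrics
```

A cron job or daemon feeding these per-node dicts into a central store is one plausible shape for the "collect and aggregate" system the abstract mentions.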
  56. Manfred Alef (Karlsruhe Institute of Technology (KIT))
    25/03/2015, 15:15
    Computing & Batch Services
    In this talk we will provide information about the current status of the preliminary work to relaunch the HEPiX Benchmarking Working Group which will develop the next release of the HEP CPU benchmark.
  57. Dr Michele Michelotto (INFN Padua & CMS)
    25/03/2015, 16:05
    Computing & Batch Services
The WLCG community has requested a fast benchmark to quickly assess the performance of a worker node. A good candidate is a Python script used in LHCb.
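The LHCb script itself is not reproduced here, but a fast benchmark of this style is essentially a fixed CPU-bound loop whose rate is reported; a toy sketch (names and workload are ours):

```python
import random
import time

def fast_cpu_score(iterations=1_000_000):
    """Toy benchmark in the spirit of a fast Python CPU probe: time a
    fixed CPU-bound loop and report iterations per second, so a worker
    node can be assessed in seconds rather than hours."""
    random.seed(42)                # fixed workload for reproducibility
    start = time.process_time()
    total = 0.0
    for _ in range(iterations):
        total += random.random() ** 0.5
    elapsed = time.process_time() - start
    return iterations / elapsed    # higher means a faster node

print(f"score: {fast_cpu_score():.0f} iterations/s")
```

Such a probe trades accuracy for speed: it correlates with, but does not replace, a full HEP-SPEC-style benchmark run.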
  58. Dr Lucia Morganti (INFN)
    25/03/2015, 16:30
    Computing & Batch Services
    Systems on Chip (SoCs), originally targeted for mobile and embedded technology, are becoming attractive for HEP and HPC scientific communities, given their low cost, huge worldwide shipments, low power consumption and increasing processing power - mostly associated with their GPUs. A variety of development boards are currently available, making it foreseeable to use these power-efficient...
  59. Liviu Valsan (CERN)
    25/03/2015, 16:55
    Computing & Batch Services
    x86 is the uncontested leader for server platforms in terms of market share and is currently the architecture of choice for High Energy Physics applications. But as more and more importance is given to power efficiency, physical density and total cost of ownership we are seeing new processor architectures emerging and some existing ones becoming more open. With the introduction of AArch64,...
  60. Mr David Power (Boston Ltd.)
    25/03/2015, 17:20
    Computing & Batch Services
The talk will cover Xeon Haswell, ARM and Open Compute platforms.
  61. William Strecker-Kellogg (Brookhaven National Lab)
    26/03/2015, 09:00
    Basic IT Services
    It's simple enough to instantiate a new process in an existing environment; it can be much more challenging to foster acceptance of such a process in IT environments and cultures that are traditionally stagnant and resistant to change, and to maintain and optimize that process to ensure it continues to realize optimal benefit. To enhance our computing facility, we've already taken...
  62. Alberto Rodriguez Peon (Universidad de Oviedo (ES))
    26/03/2015, 09:25
    Basic IT Services
    CERN’s experience of migrating a large site to a Puppet-based and more dynamic Configuration Service will be presented. The presentation will review some of the challenges encountered along the way and describe future plans for how to scale the service and improve the overall automation of operations on the site.
  63. Stefan Dietrich (DESY)
    26/03/2015, 09:50
    Basic IT Services
Marionette Collective, also known as MCollective, is a framework for server orchestration, monitoring and parallel job execution. MCollective uses modern publish-subscribe middleware for a scalable and fast execution environment. It is a powerful tool in combination with Puppet, thanks to their good integration. However, it can be a challenging task to configure and deploy...
  64. James Adams (STFC RAL)
    26/03/2015, 10:15
    Basic IT Services
The Quattor community has been maintaining Quattor for over ten years and, having recently held our 19th community workshop, the pace of development continues to increase. This talk will demonstrate why Quattor is more than just a configuration management system, report on recent developments and provide some notable updates and experiences from sites.
  65. Peter Love (Lancaster University (GB))
    26/03/2015, 11:05
    Basic IT Services
    The dominant monitoring system used in distributed computing consists of visually rich time-series graphs and notification systems for alerting operators when metrics fall outside of accepted values. For large systems this can quickly become overwhelming. In this contribution a different approach is described using the sonification of monitoring messages with an architecture which fits easily...
  66. Francisco Valentin Vinagrero (CERN)
    26/03/2015, 11:30
    Basic IT Services
IP-based voice telephony (VoIP) and the SIP protocol are clear examples of disruptive technologies that have revolutionised a previously settled market. In particular, open-source solutions now have the ascendancy in the traditional Private Branch eXchange (PBX) market. We present a possible architecture for the modernisation of CERN's fixed telephony network, highlighting the technical...
  67. Andrei Dumitru (CERN)
    26/03/2015, 11:55
    Basic IT Services
CERN has a great number of applications that rely on a database for their daily operations. From physics-related databases to the administrative sector, there is a high demand for a database system appropriate to the users' needs and requirements. This presentation gives a summary of the current state of the database services at CERN, the work done during LS1 and some insights into the...
  68. Daniel Gruber (U)
    26/03/2015, 12:20
    Computing & Batch Services
    - Introduction
    - DRMAA2 in a nutshell
    - The C interface: data types, monitoring sessions, job sessions, working with jobs, job templates, error handling and dealing with enhancements
    - Getting started with DRMAA2
    - Example applications: job monitoring applications and simple multi-clustering
  69. George Ryall (STFC - Rutherford Appleton Lab.)
    26/03/2015, 14:00
    Grid, Cloud & Virtualisation
The STFC Scientific Computing Department has been developing an OpenNebula-based cloud underpinned by Ceph block storage. I will describe some of our use cases and our set-up, and give a demonstration of our development VM-on-demand service. I will go on to explore some of the problems we have overcome to reach this point. Finally, I will present the work we are doing to use spare capacity on...
  70. Bruno Bompastor (CERN)
    26/03/2015, 14:25
    Grid, Cloud & Virtualisation
    This is a report on the current status and future plans of CERN’s OpenStack-based Cloud Infrastructure.
  71. Alexander Dibbo
    26/03/2015, 14:50
    Grid, Cloud & Virtualisation
    The Scientific Computing Department at the STFC has been developing a Ceph block storage backed OpenNebula cloud. We have carried out a quantitative evaluation of the performance characteristics of virtual machines which have been instantiated with a variety of different storage configurations (using both Ceph and local disks). I will describe our motivations for this testing, our methodology...
  72. Andrew McNab (University of Manchester (GB))
    26/03/2015, 15:15
    Grid, Cloud & Virtualisation
    The Vacuum model provides a method for managing the lifecycle of virtual machines based on their observed success or failure in finding work to do for their experiment. In contrast to centrally managed grid job submission and cloud VM instantiation systems, the Vacuum model gives resource providers direct control over which experiments' VMs or jobs are created and in what proportion. This...
  73. John Hover (Brookhaven National Laboratory (BNL))
    26/03/2015, 16:05
    Grid, Cloud & Virtualisation
    Beginning in September 2014, the RACF at Brookhaven National Lab has been collaborating with Amazon's scientific computing group in a pilot project. The goal of this project is to demonstrate the usage of Amazon AWS (EC2, S3, etc.) for real-world ATLAS production. This will prove the practical and economic feasibility of ATLAS beginning to leverage commercial cloud computing to optimize...
  74. Mr Dario Rivera (Amazon Web Services)
    26/03/2015, 16:30
    Grid, Cloud & Virtualisation
On the heels of the BNL RACF group's proof of concept on AWS, this session will share best practices for some of the most common AWS services used by big science, such as EC2, VPC, S3, and complex hybrid networking and routing. We will also provide an overview of the AWS Scientific Computing Group, which was created to help global scientific collaborations develop an ecosystem...
  75. Bruno Bompastor (CERN)
    26/03/2015, 16:55
    Grid, Cloud & Virtualisation
Heat, the OpenStack orchestration service, is being deployed at CERN. We will present the overall architecture and features included in the project, our deployment challenges and future plans.
  76. Mr Levente Hajdu (Brookhaven National Laboratory)
    26/03/2015, 17:20
    Grid, Cloud & Virtualisation
In statistically hungry science domains, data-taking deluges can be both a blessing and a curse. They allow the winnowing out of statistical errors from known measurements and open the door to new scientific opportunities as the physics program matures, but they are also a testament to the efficiency of the experiment and accelerator and to the skill of its operators. However, the data samples need to be...
  77. Jerome Belleman (CERN)
    27/03/2015, 09:00
    Computing & Batch Services
    The CERN Batch System comprises 4000 worker nodes, 60 queues and offers a service for various types of large user communities. In light of the developments driven by the Agile Infrastructure and the more demanding processing requirements, it is faced with increasingly challenging scalability and flexibility needs. This production cluster currently runs IBM/Platform LSF. Over the last...
  78. Manfred Alef (Karlsruhe Institute of Technology (KIT))
    27/03/2015, 09:25
    Computing & Batch Services
The Grid Computing Centre Karlsruhe (GridKa) has been using the Grid Engine batch system since 2011. In this presentation I will talk about our experiences with this batch system, including multi-core job support, and first experiences with cgroups.
  79. Erik Mattias Wadenstein (University of Umeå (SE))
    27/03/2015, 09:50
    Computing & Batch Services
An update on the current status of SLURM usage in the Nordics, as well as recent developments in improving support for LHC-type jobs, including tuning for efficient scheduling of multicore grid jobs. An overview of some remaining challenges will also be given, together with a discussion of how to address them.
  80. Mr Michel Jouvin (Laboratoire de l'Accelerateur Lineaire (FR))
    27/03/2015, 10:15
    Computing & Batch Services
I propose to give a summary of the Condor workshop held at CERN in mid-December.
  81. Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))
    27/03/2015, 11:05
    Computing & Batch Services
    After running Torque/Maui for many years, the RAL Tier-1 migrated to HTCondor during 2013 in order to benefit from improved reliability, scalability and additional functionality unavailable in Torque. This talk will discuss the deployment of HTCondor at RAL, our experiences and the evolution of our pool over the past two years, as well as our future plans.
  82. Jerome Belleman (CERN)
    27/03/2015, 11:30
    Computing & Batch Services
While we are taking measures to address the limitations discussed earlier in our IBM/Platform LSF cluster, we have been working on setting up a new batch system based on HTCondor. There has been some progress with the pilot service which we described at the last HEPiX. We also continued investigating some of the more advanced functions which will lead up to the production state of the new CERN...
  83. Andrew David Lahiff (STFC - Rutherford Appleton Lab. (GB))
    27/03/2015, 11:55
    Computing & Batch Services
    With the increasing interest in HTCondor in Europe, an important question for sites considering migrating to HTCondor is how well it integrates with the standard grid middleware, in particular integration with the information system and APEL accounting. Also, with the increasing interest and usage of private clouds, how easily a batch system can be integrated with a private cloud is another...
  84. Stephen Jones (Liverpool University)
    27/03/2015, 12:20
    Computing & Batch Services
    This talk describes DrainBoss, which is a proportional integral (PI) controller with conditional logic that strives to maintain the correct ratio between single-core and multi-core jobs in an ARC/HTCondor cluster. DrainBoss can be used instead of the HTCondor DEFRAG Daemon.
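DrainBoss's actual code is not shown here, but the PI idea it names can be sketched in a few lines; the gains, sign convention and the toy "plant" below are illustrative assumptions, not the real controller:

```python
def pi_step(target_ratio, measured_ratio, integral, kp=1.0, ki=0.1):
    """One proportional-integral update: returns (control, integral).
    A positive control value would mean 'drain more slots toward
    multi-core' in a DrainBoss-like scheme; gains are illustrative."""
    error = target_ratio - measured_ratio
    integral += error                      # accumulate steady-state error
    control = kp * error + ki * integral
    return control, integral

# Drive a toy cluster model toward a 0.25 multi-core job fraction.
ratio, integral = 0.0, 0.0
for _ in range(50):
    control, integral = pi_step(0.25, ratio, integral)
    ratio += 0.2 * control                 # toy plant: ratio responds to control
print(f"final ratio: {ratio:.3f}")
```

The integral term is what lets the controller hold the ratio at the setpoint despite a constant disturbance (e.g. a steady influx of single-core jobs), which a purely proportional drain policy cannot do.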
  85. Mr Romain Wartel (CERN)
    27/03/2015, 12:45
    Miscellaneous
  86. Dr Helge Meinhard (CERN)
    27/03/2015, 12:50
    Miscellaneous
    Usual summary and conclusions