Shawn Mc Kee
(University of Michigan (US))
10/12/15, 9:35 AM
Site Reports
We will present an update on our site since the Fall 2014 report and cover our work with various storage technologies, ATLAS Muon Calibration and our use of the ELK stack for central syslogging. We will also report on our recent hardware purchases for 2015, as well as the status of our new networking configuration and 100G connection to the WAN. We conclude with a summary of what has worked...
Rennie Scott
(Fermilab)
10/12/15, 10:15 AM
Site Reports
News and updates from Fermilab since the Spring HEPiX Workshop.
Peter Gronbech
(University of Oxford (GB))
10/12/15, 11:00 AM
Site Reports
Update on Oxford University Particle Physics group computing setup, including short updates from the other member sites of SouthGrid.
Jose Flix Molina
(Centro de Investigaciones Energ. Medioambientales y Tecn. (ES))
10/12/15, 11:20 AM
Site Reports
We will review the status of PIC as of Fall 2015. News since the Oxford meeting will be reported.
Tony Quan
(LBL)
10/12/15, 11:40 AM
Site Reports
PDSF, the Parallel Distributed Systems Facility, has been in continuous operation since 1996 serving high energy physics research. It is currently a Tier-1 site for STAR, a Tier-2 site for ALICE and a Tier-3 site for ATLAS.
The PDSF cluster will move early next year from its current site at Oakland to a new building on the LBNL campus. Several racks have already been installed at the new...
Dr
Qiulan Huang
(Institute of High Energy Physics, Chinese Academy of Sciences)
10/12/15, 12:20 PM
Site Reports
News and updates from IHEP since the Spring HEPiX Workshop. In this talk we will present a brief status of the IHEP site, including the computing farm, Grid, data storage, network and so on.
Martin Lothar Purschke
(Brookhaven National Laboratory (US))
10/12/15, 2:00 PM
Grid, Cloud & Virtualisation
We report on a simulation effort using the Open Science Grid which
utilized a large fraction of the available OSG resources for about 13
weeks in the first half of 2015.
sPHENIX is a proposed upgrade of the PHENIX experiment at the
Relativistic Heavy Ion Collider. We have collected large data sets of
proton-proton collision data in 2012 and 2013, and plan to carry out a
similar study...
Bonnie King
(Fermilab)
10/12/15, 2:20 PM
End-User IT Services & Operating Systems
Scientific Linux status and news.
Arne Wiebalck
(CERN)
10/12/15, 2:40 PM
End-User IT Services & Operating Systems
In this talk we will present a brief status update on CERN's work on CentOS 7, the uptake by the various IT services, and the interaction with the upstream CentOS community.
We will talk about the status of the SIGs, new projects and work done over the last months, presenting a list of our contributions and feedback on the experience.
Borja Aparicio Cotarelo
(CERN)
10/12/15, 3:00 PM
End-User IT Services & Operating Systems
The Version Control and Issue Tracking services team at CERN is facing a transition and consolidation of the services it provides, in order to fulfil the needs of the CERN community while maintaining the current deployments in use.
Software and services development is a key activity at CERN that is currently widely carried out using Agile methodologies. Code hosting and review,...
Chris Brew
(STFC - Rutherford Appleton Lab. (GB))
10/12/15, 3:50 PM
End-User IT Services & Operating Systems
We present our work deploying an ownCloud gateway to our existing home file storage to provide access to mobile clients.
Tim Bell
(CERN)
10/12/15, 4:10 PM
End-User IT Services & Operating Systems
CERN has recently deployed a Mac self service portal to allow users to easily select software and perform standard configuration steps.
This talk will review the requirements, product selection and potential evolution for mobile device management.
Elizabeth Bautista
(Lawrence Berkeley National Lab)
10/12/15, 4:30 PM
IT Facilities & Business Continuity
The NERSC facility is transitioning to a new building on the LBNL main campus in the 2015 timeframe. This state-of-the-art facility is energy efficient, providing year-round free air and water cooling; is initially provisioned for 12.5 MW of power and capable of up to 42 MW; and has two office floors, a 20K square foot HPC floor with seismic isolation, and a mechanical floor. Substantial...
Ian Peter Collier
(STFC - Rutherford Appleton Lab. (GB))
10/12/15, 4:50 PM
Miscellaneous
Introducing a non-graduate recruitment path in scientific computing at the Rutherford Appleton Laboratory.
Recruiting and retaining high quality staff is an increasing challenge at STFC. We traditionally recruit people with relevant degrees and/or industry experience, but this is becoming increasingly difficult, as is recruiting to our graduate recruitment program. At the same time steep...
Michel Jouvin
(Laboratoire de l'Accelerateur Lineaire (FR))
10/12/15, 5:10 PM
Site Reports
Changes at LAL and GRIF in the last 18 months.
Sandy Philpott
10/13/15, 9:00 AM
Site Reports
Current high performance and experimental physics computing environment updates: core exchanges between USQCD and Experimental Physics clusters for load balancing, job efficiency, and 12GeV data challenges; Nvidia K80 GPU experiences and updated Intel MIC environment; update on locally developed workflow tools and write-through to tape cache filesystem; status of LTO6 integration into our MSS;...
Ajit Kumar Mohapatra
(University of Wisconsin (US))
10/13/15, 9:20 AM
Site Reports
The University of Wisconsin Madison CMS T2 is a major WLCG/OSG T2 site. It has consistently delivered highly reliable and productive services for CMS MC production/processing and large scale CMS physics analysis, using high throughput computing (HTCondor), a highly available storage system (Hadoop), efficient data access using xrootd/AAA, and scalable distributed software systems (CVMFS). The...
Lucien Philip Boland
(University of Melbourne (AU))
10/13/15, 9:40 AM
Site Reports
Update on activities at Australia's HEP Tier 2 grid facility.
Erik Mattias Wadenstein
(University of Umeå (SE))
10/13/15, 10:00 AM
Site Reports
Update on recent events in the Nordic countries
Szabolcs Hernath
(Hungarian Academy of Sciences (HU))
10/13/15, 10:50 AM
Site Reports
We give an update on the infrastructure, Tier-0 hosting services, Wigner Cloud and other recent developments at the Wigner Datacenter. We will also include a short summary on the Budapest WLCG Tier-2 site status as well.
Tomoaki Nakamura
(High Energy Accelerator Research Organization (JP))
10/13/15, 11:10 AM
Site Reports
The KEK computing research center supports various projects of accelerator-based science in Japan. The Hadron and Neutrino (T2K) experiments at J-PARC have restarted at a good rate after the recovery from the earthquake damage at Fukushima. The Belle II experiment is going to collect 100 PB of raw data within several years. In this talk, we would like to report the current status of our computing facility and...
Andreas Petzold
(KIT - Karlsruhe Institute of Technology (DE))
10/13/15, 11:30 AM
Site Reports
News about GridKa Tier-1 and other KIT IT projects and infrastructure.
Johan Henrik Guldmyr
(Helsinki Institute of Physics (FI))
10/13/15, 11:50 AM
Site Reports
- CSC
- Ansible and HPC
- Slurm as an interactive shell load balancer
Yemi Adesanya
10/13/15, 12:10 PM
Site Reports
An update on SLAC's central Unix services in support of Scientific Computing and Core Infrastructure. New funding model for FY15 identifies indirect vs direct-funded effort. Socializing the concept of service and service lifecycle. Sustainable business models to address hardware lifecycle replacement: Storage-as-a-Service with GPFS, OpenStack for dev/test environments and cluster provisioning.
Wolfgang Friebel
(Deutsches Elektronen-Synchrotron Hamburg and Zeuthen (DE))
10/13/15, 2:00 PM
Site Reports
Updates from DESY Zeuthen
Dr
Shigeki Misawa
(Brookhaven National Laboratory)
10/13/15, 2:20 PM
Site Reports
This site report will discuss the latest developments at the RHIC-ATLAS Computing Facility (RACF).
Dave Kelsey
(STFC - Rutherford Appleton Lab. (GB))
10/13/15, 2:40 PM
Security & Networking
This talk will present a status update from the IPv6 working group, including recent testing and the deployment of (some) dual-stack services and monitoring in WLCG.
Edgar Fajardo Hernandez
(Univ. of California San Diego (US))
10/13/15, 3:00 PM
Security & Networking
This talk will present the latest results from the IPv6 compatibility tests performed on the OSG Software stack.
Eric Sallaz
(CERN)
10/13/15, 3:50 PM
Security & Networking
With the evolution of transmission technologies, going above 10Gb Ethernet requires a complete renewal of the fibre infrastructure.
In the last year the CERN Datacentre has evolved to deal with the expansion of the physical infrastructure inside and outside the local site, to support higher speeds like 40GbE and 100GbE, to be ready for any other future requirement. We will explain the choice...
Shawn Mc Kee
(University of Michigan (US))
10/13/15, 4:10 PM
Security & Networking
It has been approximately one year since the WLCG Network and Transfer Metrics working group was initiated, and we would like to provide a summary of what has been achieved during this first year and discuss future activities planned for the group.
The working group as chartered had a number of objectives:
- Identify and make continuously available relevant network and
transfer...
Shawn Mc Kee
(University of Michigan (US))
10/13/15, 4:30 PM
Security & Networking
WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and
resolution of any network issues, including connection failures, congestion and traffic routing. The WLCG Network and Transfer Metrics working group
was established to ensure sites and experiments can better understand and fix networking...
Carles Kishimoto Bisbe
(CERN)
10/13/15, 4:50 PM
Security & Networking
With the virtualization of the data centre, there is a need to move virtual machines transparently across racks when the physical servers are being decommissioned. We will present the solution being tested at CERN using VPLS in an MPLS network.
Liviu Valsan
(CERN)
10/13/15, 5:10 PM
Security & Networking
This presentation provides an update on the global security landscape since the last HEPiX meeting. It describes the main vectors of compromises in the academic community including lessons learnt, presents interesting recent attacks while providing recommendations on how to best protect ourselves. It also covers security risks management in general, as well as the security aspects of the...
Liviu Valsan
(CERN)
10/13/15, 5:30 PM
Security & Networking
The HEP community is facing an ever increasing wave of computer security threats, with more and more recent attacks showing a very high level of complexity. Having a centralised Security Operations Centre (SOC) in place is paramount for the early detection and remediation of such threats. Key components and recommendations to build an appropriate monitoring and detection Security Operation...
Michel Jouvin
(Laboratoire de l'Accelerateur Lineaire (FR))
10/14/15, 8:50 AM
Grid, Cloud & Virtualisation
This presentation is a general introduction; it will also describe the opportunities for cooperation with HEPiX and how sites can participate in the GDB.
Arne Wiebalck
(CERN)
10/14/15, 9:00 AM
Grid, Cloud & Virtualisation
This presentation will provide an update of the activities in the CERN Cloud service since the Oxford workshop.
Arne Wiebalck
(CERN)
10/14/15, 9:20 AM
Grid, Cloud & Virtualisation
This talk will summarise our activities related to the optimisation of the virtualised compute resources in our OpenStack-based infrastructure. In particular, we will discuss some of the issues we've encountered and the various optimisations we've applied to bring the virtual resources as close as possible to bare-metal performance.
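Optimisations of this kind typically include pinning vCPUs to dedicated host cores and passing the host CPU model through to the guest. As an illustrative sketch only (not necessarily the configuration applied at CERN; core numbers and counts are arbitrary), a libvirt domain fragment enabling both might look like:

```xml
<domain type='kvm'>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <!-- pin each vCPU to its own host core to reduce scheduling jitter -->
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='3'/>
  </cputune>
  <cpu mode='host-passthrough'>
    <!-- expose the host CPU model and a matching topology to the guest -->
    <topology sockets='1' cores='4' threads='1'/>
  </cpu>
</domain>
```

Fragments like this are merged into the full domain XML; the gain comes from avoiding vCPU migration across NUMA nodes and from letting the guest use all host CPU features.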
Sean Crosby
(University of Melbourne (AU))
10/14/15, 9:40 AM
Grid, Cloud & Virtualisation
In the CERN Cloud Computing project, there is a need to ensure that the overall performance of hypervisors and virtual machines does not decrease due to configuration changes, or just because of the passage of time.
This talk will outline an automated performance framework currently being developed, which will allow performance of virtual machines and hypervisors to be graphed and linked.
Christopher Hollowell
(Brookhaven National Laboratory)
10/14/15, 10:00 AM
Grid, Cloud & Virtualisation
Application containers have become a competitive alternative to virtualized servers.
Containers allow applications to be written once, distributed across a heterogeneous
environment (i.e., cloud, remote data centers) and executed transparently on multiple platforms
without the performance overhead commonly found on virtual systems. We present an initial
evaluation of Docker, along with a...
Domenico Giordano
(CERN)
10/14/15, 10:50 AM
Grid, Cloud & Virtualisation
Performance measurements and monitoring are essential for the efficient use of computing resources as they allow selecting and validating the most effective resources for a given processing workflow. In a commercial cloud environment an exhaustive resource profiling has additional benefits due to the intrinsic variability of a virtualised environment. In this context resource profiling via...
Wataru Takase
(KEK)
10/14/15, 11:10 AM
Grid, Cloud & Virtualisation
Cloud computing enables a flexible provisioning of computing resources by utilizing virtual machines on demand, and can provide an elastic data analysis environment. At KEK we plan to integrate cloud-computing technology into our batch cluster system in order to provide heterogeneous clusters dynamically, to support various kinds of data analyses, and to enable elastic resource allocation...
Chander Sehgal
(Fermilab)
10/14/15, 11:30 AM
Grid, Cloud & Virtualisation
The Open Science Grid (OSG) ties together computing resources from a broad set of research communities, connecting their resources to create a large, robust computing grid. This computing infrastructure started with large HEP experiments such as ATLAS, CDF, CMS, and DZero, and has evolved so that it now also enables the scientific computation of many non-physics researchers. OSG has been...
Andrea Chierici
(INFN-CNAF)
10/14/15, 11:50 AM
Grid, Cloud & Virtualisation
INDIGO-DataCloud aims at developing a data and computing platform targeted at
scientific communities, integrating existing standards and open software
solutions. INDIGO proposes:
- to build up a PaaS solution leveraging existing resources and e-Infrastructures, since mere access to IaaS resources has been demonstrated not to be a realistic option for most research communities
-...
Andrew David Lahiff
(STFC - Rutherford Appleton Lab. (GB))
10/14/15, 12:10 PM
Grid, Cloud & Virtualisation
Running services in containers managed by a scheduler offers a number of potential benefits compared to traditional infrastructure, such as increased resource utilisation through multi-tenancy, the ability to have elastic services, and improved site availability due to self-healing. At the RAL Tier-1 we have been investigating migration of services to an Apache Mesos cluster running on bare...
Brian Lin
(University of Wisconsin - Madison)
10/14/15, 2:00 PM
Grid, Cloud & Virtualisation
HTCondor-CE is a special configuration of HTCondor designed to connect compute resources to the wider grid. Leveraging the power of HTCondor, HTCondor-CE is able to provide built-in security measures, end-to-end job tracking, and better integration with overlay job systems. This talk will present an overview of the HTCondor-CE software, its deployment in the Open Science Grid (OSG), and...
James Botts
(LBNL)
10/14/15, 2:20 PM
Grid, Cloud & Virtualisation
This presentation will describe work that has been done to
make NERSC Cray systems friendlier to data-intensive workflows
in anticipation of the availability of the NERSC-8 system this autumn.
Using Shifter, a Docker-like container technology developed at NERSC by
Doug Jacobsen and Shane Canon, the process of delivering
software stacks to Cray compute nodes has been greatly...
Dr
Qiulan Huang
(Institute of High Energy Physics, Chinese Academy of Sciences)
10/14/15, 2:40 PM
Grid, Cloud & Virtualisation
The report will introduce our plan for computing resource virtualization. We will discuss the progress of this project, such as the OpenStack-based infrastructure, virtual machine scheduling and measurement. And we will share experience from our test bed, deployed with OpenStack Icehouse.
Alberto Rodriguez Peon
(Universidad de Oviedo (ES))
10/14/15, 3:00 PM
Storage & Filesystems
CvmFS is a network file system based on HTTP and optimised to deliver experiment software in a fast, scalable, and reliable way. This presentation will review the status of the stratum 0 deployment at CERN, mentioning some of the challenges faced during its migration to ZFS as the underlying file system.
Alberto Pace
(CERN)
10/14/15, 3:50 PM
Storage & Filesystems
Several discussions are ongoing at CERN concerning the future of AFS and a possible replacement using the existing CERNBox service, based on ownCloud and the CERN disk storage solution developed for LHC computing (EOS).
The talk will mainly review the future plans for the CERNBox service and the various use cases that have been identified.
Alexandr Zaytsev
(Brookhaven National Laboratory (US))
10/14/15, 4:10 PM
Storage & Filesystems
We give a report on the status of the Ceph-based storage systems deployed in RACF, currently providing 1 PB of data storage capacity across the object store (with an Amazon S3 compliant RADOS Gateway front end), block storage (RBD), and shared file system (CephFS with a dCache front end) layers of the Ceph storage system. The hardware deployment and software upgrade procedures performed in the year of...
George Vasilakakos
(STFC)
10/14/15, 4:30 PM
Storage & Filesystems
RAL is currently developing storage services powered by a Ceph object storage system. We review the test results and experiences of the newly-installed 5 PB cluster at RAL, as well as our plans for it. Since the aim is to provide large scale storage for experimental data with minimal space overhead, we focus on testing a variety of erasure coding techniques and schemes. We look at...
Gerard Bernabeu Altayo
(Fermilab)
10/14/15, 4:50 PM
Storage & Filesystems
Fermilab stores more than 110 PB of data employing different technologies (dCache, EOS, BlueArc) to address a wide variety of use cases and application domains. This presentation captures the present state of data storage at Fermilab and maps out future directions in storage technology choices at the lab.
Mr
David Fellinger
(DataDirect Networks, Inc)
10/14/15, 5:10 PM
Storage & Filesystems
The acceleration of high performance computing applications in large clusters has primarily been achieved with a focus on the cluster itself. Lower latency interconnects, more efficient message passing structures, higher performance processors, and general purpose graphics processing units have been incorporated in recent cluster designs. There has also been a great deal of study regarding...
Mr
Amit Chattopadhyay
(WD)
10/15/15, 9:00 AM
Storage & Filesystems
Given the amount of inline data generated, large volume hard-drive manufacturing is an appropriate environment to employ contemporary 'Big Data' techniques. It can be used to generate a feedback loop, as well as a feed-forward path. The feedback loop helps fix what is broken. The feed-forward path can be used to predict drive health.
In this presentation, I will focus on some work we have...
Patrick Fuhrmann
(Deutsches Elektronen-Synchrotron Hamburg and Zeuthen (DE))
10/15/15, 9:20 AM
Storage & Filesystems
The pressure to provide cheap, reliable and unlimited cloud storage space in the commercial area has provided science with affordable storage hardware and open source storage solutions with low maintenance costs and tuneable performance and durability properties, resulting in different cost models per storage unit. Those models, already introduced by WLCG a decade ago (disk vs tape) are now...
Zhenping Liu
(Brookhaven National Laboratory (US))
10/15/15, 9:40 AM
Storage & Filesystems
As the US ATLAS Tier-1 computing facility, the RHIC and ATLAS Computing Facility (RACF) at Brookhaven National Lab has been operating a very large scale dCache disk storage system with a tape back-end to serve a geographically diverse, worldwide ATLAS scientific community. This talk will present the current state of the US ATLAS dCache storage system at BNL. It will describe its structure,...
Natalia Ratnikova
(Fermilab)
10/15/15, 10:00 AM
Storage & Filesystems
Prior to LHC Run 2, CMS collected over 100 PB of physics data on distributed storage facilities outside CERN, and the storage capacity will increase considerably in the next years. During Run 2 the amount of storage allocated to individual users' analysis data will reach up to 40% of the total space pledged by the CMS sites. The CMS Space Monitoring system is developed to give a...
Dave Rotheroe
(HP)
10/15/15, 10:20 AM
IT Facilities & Business Continuity
HP IT consolidated from 85 ancient data centers running old IT technology to six new mega data centers with modern IT running transformed applications in the late 2000s. That achievement resulted in (literally) billions saved through a meshing of significantly more efficient data centers, utilization of then-current IT technology, and application transformation. Since achieving that, HP IT...
Mr
Imran Latif
(Brookhaven National Laboratory)
10/15/15, 11:10 AM
IT Facilities & Business Continuity
The methods and techniques of scientific research at Brookhaven National Laboratory are increasingly dependent upon the ability to acquire, analyze, and store vast quantities of data. The needs for data processing equipment and supporting infrastructure are anticipated to grow significantly in the next few years, soon exceeding the capacity of existing data center resources, currently located...
Eric Bonfillou
(CERN)
10/15/15, 11:30 AM
IT Facilities & Business Continuity
This presentation will give an overview of the recent efforts to establish an accurate and consistent inventory and stock management for CERN data centre assets. The underlying tool, Infor EAM (http://www.infor.com/solutions/eam/), was selected because of its wider usage over many years in other areas at CERN. The presentation will focus on the structuring of the IT assets data and how it is...
Jose Flix Molina
(Centro de Investigaciones Energ. Medioambientales y Tecn. (ES))
10/15/15, 11:50 AM
IT Facilities & Business Continuity
Energy consumption is an increasing concern for data centers. This contribution summarizes recent energy efficiency upgrades at the Port d’Informació Científica (PIC) in Barcelona, Spain which have considerably lowered energy consumption. The upgrades were particularly challenging, as they involved modifying the already existing machine room, which is shared by PIC with the general IT services...
Mr
Michael Ross
(HP)
10/15/15, 12:10 PM
IT Facilities & Business Continuity
In today’s Federal Information Technology world you are faced with many challenges, to name a few: cyber-attacks; unfunded mandates to consolidate data centers; new IT acquisition laws requiring de-duplication with agency-level CIO oversight, implemented by the Federal Information Technology Acquisition Reform Act (FITARA); and aging equipment and data center infrastructure. Each of these...
Carlos Fernando Gamboa
(Brookhaven National Laboratory (US))
10/15/15, 2:00 PM
Basic IT Services
Within the High Energy Physics (HEP) software infrastructure, a diverse set of data storage and distribution software technologies is used. Despite their heterogeneity, they all provide a capability to trace application event data in order to troubleshoot problems related to the software stack or software usage. The resulting data is written once and stored, in the majority of cases, in a file,...
Cary Whitney
(LBNL)
10/15/15, 2:20 PM
Basic IT Services
The ELK (Elasticsearch, Logstash, Kibana) stack has been chosen as one of the key components of our new centerwide monitoring project. I'll discuss our overall philosophy on monitoring, how ELK fits in, and the current structure and how it is performing.
To define centerwide: everything in the data centers. All hosts, filesystems, most applications, power, cooling, water flow, temperature,...
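As a sketch of how syslog data commonly enters an ELK pipeline (illustrative only; the port and index names below are assumptions, not the actual site configuration), a minimal Logstash pipeline might be:

```
input {
  syslog {
    port => 5514                      # receive syslog from data-centre hosts
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "syslog-%{+YYYY.MM.dd}"  # daily indices simplify retention
  }
}
```

The syslog input parses standard severity/facility fields automatically, so Kibana dashboards can filter by host and subsystem without extra grok rules.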
Miguel Coelho dos Santos
(CERN)
10/15/15, 2:40 PM
Basic IT Services
This presentation will provide an update of the activities concerning IT Monitoring. This includes the monitoring of data centre infrastructure, hardware monitoring, host monitoring and application monitoring; as well as the tools being used or tested.
Alberto Rodriguez Peon
(Universidad de Oviedo (ES))
10/15/15, 3:00 PM
Basic IT Services
An update on CERN’s Configuration Service will be presented. This presentation will review the current status of the infrastructure and describe some of the ongoing work and future plans, with a particular focus on automation and continuous integration. Recent effort to scale and accommodate a higher number of puppet clients will also be mentioned.
Bonnie King
(Fermilab)
10/15/15, 3:50 PM
Basic IT Services
Fermilab has moved from the era of two large multi-decade experiments to hosting several smaller experiments with a shorter lifecycle. Improvements in microcontroller performance have moved computers closer to the experiment Data Acquisition systems, where custom electronics have previously been used. There are also efforts to standardize DAQ software into reusable products in alignment with...
James Adams
(STFC RAL)
10/15/15, 4:10 PM
Basic IT Services
An update of developments and activities in the Quattor community over the last six months.
Christopher Hollowell
(Brookhaven National Laboratory)
10/15/15, 4:30 PM
Computing & Batch Services
NVMe (Non-Volatile Memory express) is a leading-edge SSD technology
where drives are directly attached to the PCI-e bus. Typical
SAS/SATA controllers are optimized for use with traditional
rotating hard drives, and as such can increase latency, and
reduce the bandwidth available to attached SSDs. Since
NVMe drives bypass a SAS/SATA controller, they can help
minimize/eliminate...
Thomas Finnern
(DESY)
10/15/15, 4:50 PM
Computing & Batch Services
This presentation will provide information on the status of the batch systems at DESY Hamburg. This includes the clusters for GRID, HPC and local batch purposes showing the current state and the activities for upcoming enhancements.
Helge Meinhard
(CERN),
Michele Michelotto
(Universita e INFN, Padova (IT))
10/15/15, 5:10 PM
Computing & Batch Services
Status of the benchmarking working group and work going on in WLCG around benchmarking
William Edward Strecker-Kellogg
(Brookhaven National Laboratory (US)),
William Strecker-Kellogg
(Brookhaven National Lab)
10/16/15, 9:00 AM
Computing & Batch Services
Scheduling jobs with heterogeneous requirements to a heterogeneous pool
of computers is a challenging task. HTCondor does a great job supporting
such a general-purpose setup with features like Hierarchical Group
Quotas and Partitionable Slots. At BNL we have a model, configuration,
and software to handle the administration of such a pool, and in this
talk we will share our experience...
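For illustration, the two HTCondor features mentioned above are enabled through configuration along these lines (the group names and quota values are hypothetical, not BNL's actual settings):

```
# Hierarchical group quotas: divide the pool between two experiments
GROUP_NAMES = group_atlas, group_phenix
GROUP_QUOTA_group_atlas  = 600
GROUP_QUOTA_group_phenix = 400
GROUP_ACCEPT_SURPLUS = True       # groups may borrow unused quota

# Partitionable slots: advertise one big slot per machine,
# dynamically carved into slots sized to each job's request
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = True
```

Jobs are then charged against their accounting group's quota, while partitionable slots let a single machine serve jobs with heterogeneous CPU and memory requests.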
Vanessa HAMAR
(CC-IN2P3)
10/16/15, 9:20 AM
Computing & Batch Services
We have been using Univa Grid Engine as our batch scheduling system, to our satisfaction, for four years. We focus on the latest major version, 8.2.1, which was deployed at IN2P3-CC four months ago and provides further scalability improvements.
We support about 200 groups and experiments running up to 17,000 jobs simultaneously. The requirements, in terms of computing resources, storage or...
Todd Tannenbaum
(Univ of Wisconsin-Madison, Wisconsin, USA)
10/16/15, 9:40 AM
Computing & Batch Services
The goal of the HTCondor team is to develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing (HTC) on large collections of distributively owned computing resources. Increasingly, the work performed by the HTCondor developers is being driven by its partnership with the High Energy Physics (HEP) community.
This talk will present recent changes...
William Edward Strecker-Kellogg
(Brookhaven National Laboratory (US)),
William Strecker-Kellogg
(Brookhaven National Lab)
10/16/15, 10:00 AM
Computing & Batch Services
The RACF is a key component in BNL's new Computational Science Initiative (CSI). One
of CSI's goals is to leverage the RACF's expertise to shorten the time and effort
needed to archive, process and analyze data from non-traditional fields at BNL.
This presentation describes a concrete example of how the RACF has helped
non-traditional workloads run in the RACF computing environment, and...
Jerome Belleman
(CERN)
10/16/15, 10:50 AM
Computing & Batch Services
To have our Batch Service at CERN answer increasingly challenging scalability and flexibility needs, we have chosen to set up a new batch system based on HTCondor. We have set up a Grid-only pilot service and major LHC experiments have started trying it out. While the pilot is slowly becoming production-ready, we're laying out a plan for our next major milestone: to run local jobs too,...
Mizuki Karasawa
(BNL)
10/16/15, 11:10 AM
Basic IT Services
In a rapidly growing facility such as NSLS-II, we use Foreman as an automation tool that integrates with DNS, DHCP, TFTP and Puppet, which makes installation and provisioning processes much easier and helps bring the service/server components online in a timely manner. For those who use Puppet Enterprise as a paid ENC, Foreman can also substitute for it. This talk will present the detail...
Mizuki Karasawa
(BNL)
10/16/15, 11:30 AM
Basic IT Services
GitLab is an MIT-licensed open source tool that incorporates a rich set of features: managing Git repositories, code reviews, issue tracking, activity feeds and wikis. Its most powerful feature, CI for continuous integration, makes code development much more efficient and cost-saving; it is also a great tool to enhance communication and collaboration. At NSLS-II, we have a great number of...
Dmitry Nilsen
10/16/15, 11:50 AM
Basic IT Services
Host deployment and configuration technologies at SCC.
The Steinbuch Centre for Computing (SCC) at Karlsruhe Institute of Technology (KIT) serves a number of projects, including the WLCG Tier-1 GridKa, the Large Scale Data Facility (LSDF), and the Smart Data Innovation Lab (SDIL), using bare metal and virtual compute resources, and provides a variety of storage and computing services to the...
Andrew David Lahiff
(STFC - Rutherford Appleton Lab. (GB))
10/16/15, 12:10 PM
Basic IT Services
At RAL we have been considering InfluxDB and Grafana as a possible replacement for Ganglia, in particular for application-specific metrics. Here we present our experiences with setting up monitoring for services such as Ceph, FTS3 and HTCondor, and discuss the advantages and disadvantages of InfluxDB and Grafana over Ganglia.
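For context, application metrics typically reach InfluxDB as points in its line protocol, POSTed to the database's /write endpoint, with Grafana then querying the resulting series for dashboards. A minimal sketch of building such a point (the measurement, tag and field names are invented for illustration):

```python
import time

def to_line_protocol(measurement, tags, fields, ts_ns=None):
    """Render one data point in InfluxDB line protocol:
    measurement,tag1=v1,... field1=v1,... timestamp"""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    ts = ts_ns if ts_ns is not None else time.time_ns()
    return f"{measurement},{tag_str} {field_str} {ts}"

# e.g. a per-schedd HTCondor running-job count, ready to POST
# to a hypothetical http://influxdb:8086/write?db=metrics
line = to_line_protocol("condor_jobs",
                        {"schedd": "ce01", "state": "running"},
                        {"count": 123},
                        ts_ns=1445000000000000000)
```

Unlike Ganglia's fixed RRD layout, points like this carry arbitrary tags, which is what makes per-service metrics (Ceph, FTS3, HTCondor) easy to slice in Grafana.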
Helge Meinhard
(CERN),
Tony Wong
(Brookhaven National Laboratory)
10/16/15, 12:30 PM