20th Real Time Conference

Europe/Rome
Padova, Italy


<a href="https://goo.gl/maps/vWFxL">Centro Congressi A. Luciani Via Forcellini, 170/A Padova ITALY</a>
Adriano Luchetta (Consorzio RFX), Dora Merelli (IEEE), Gianluca Moro, Martin Grossmann, Martin Lothar Purschke (Brookhaven National Laboratory (US)), Patrick Le Dû, Rejean Fontaine (Université de Sherbrooke), Sascha Schmeling (CERN), Stefan Ritt (Paul Scherrer Institute), Zhen-An Liu (IHEP,Chinese Academy of Sciences (CN))
Description


Welcome to Padova!

Photo of RT2016 participants at Palazzo del Bo, UNIPD, 6 June 2016 (low resolution)


 


We invite you to the Centro Congressi “A. Luciani” in Padova for the 2016 Real Time Conference (RT2016). It will take place from Monday 6 through Friday 10 June 2016, with optional pre-conference tutorials on Sunday 5 June.

Like the previous editions, RT2016 will be a multidisciplinary conference devoted to the latest developments in real-time techniques in the fields of plasma and nuclear fusion, particle physics, nuclear physics and astrophysics, space science, accelerators, medical physics, nuclear power instrumentation and other radiation instrumentation.

 

 



Important Dates:

  • June 5, 2016 - Short Courses
  • June 6-10, 2016 - Conference

 

Sponsors

 

    • 09:30 12:30
      Short Course - Real-time data visualization and control using modern Web technologies Caffe Pedrocchi (Padova)


      • 09:30
        Real-time data visualization and control using modern Web technologies 1 1h 30m Caffe Pedrocchi (Padova, Italy)


        Via VIII Febbraio, 15 35122 Padova (PD)
        Speaker: Stefan Ritt (Paul Scherrer Institute)
      • 11:00
        Coffee Break 15m Caffe Pedrocchi (Padova)


      • 11:15
        Real-time data visualization and control using modern Web technologies 2 1h 15m Caffe Pedrocchi (Padova)

    • 09:35 19:00
      Registration 9h 25m Caffe Pedrocchi (Padova)


    • 12:30 14:00
      Break: Lunch
    • 14:00 17:00
      Short Course - Real-time data acquisition and processing applications using NIRIO FPGAs-based technology and NVIDIA GPUs Caffe Pedrocchi (Padova, Italy)

      • 14:00
        Real-time data acquisition and processing applications using NIRIO FPGAs-based technology and NVIDIA GPUs 1 1h 30m Caffe Pedrocchi (Padova, Italy)

        Via VIII Febbraio, 15 35122 Padova (PD)
        Speaker: Mariano Ruiz (Technical University of Madrid)
      • 15:30
        Coffee Break 15m Caffe Pedrocchi (Padova, Italy)


      • 15:45
        Real-time data acquisition and processing applications using NIRIO FPGAs-based technology and NVIDIA GPUs 2 1h 15m Caffe Pedrocchi (Padova, Italy)

    • 07:30 08:15
      Bus Transfer to Conference Venue


    • 08:15 09:00
      Registration Centro Congressi (Padova)


    • 09:00 10:35
      Opening Session 1 Centro Congressi (Padova)


      Conveners: Adriano Francesco Luchetta (Consorzio RFX), Martin Lothar Purschke (Brookhaven National Laboratory (US))
      • 09:00
        Welcome words and conference information 15m
        Speakers: Adriano Francesco Luchetta (Consorzio RFX), Martin Lothar Purschke (Brookhaven National Laboratory (US)), Rejean Fontaine (Université de Sherbrooke), Dr Sascha Schmeling (CERN)
      • 09:15
        ITER DIAGNOSTICS DEVELOPMENT 40m
        ITER is the largest and most technically advanced magnetic fusion device to date and is under construction in France. It is also a nuclear installation. As a result, monitoring and controlling this device using diagnostics is crucial for successful operation. Design, construction and planning for operation of these diagnostics are now well underway, with some buildings complete and several more under construction. A sufficient diagnostic set is needed to cover reliable routine operation, advanced operation and physics exploitation. This involves many boundary penetrations and, in general, interfaces with many of the major ITER components. These diagnostics also have to be fully operational in many diverse scenarios, with managed redundancy as needed in critical areas. Demonstration of the success of ITER will come through the diagnostics. To facilitate this, a set of 50 diagnostics will be deployed, each with its own specific requirements. These diagnostics are divided into categories including magnetics, neutrons, bolometers, optical, microwave and operational systems. The latter include pressure gauges, infrared systems and a range of observation systems for tritium and dust. Incorporating all these systems produces a very large matrix of interfaces across virtually the whole device, from inside to outside. These interfaces also include the control system. Managing them is a complex task, further complicated by the fact that more than 60 teams are working on the systems, stationed around the world in the partner and supplier laboratories as well as at the construction site. From the control and operations perspective, the systems will need to be tightly managed to ensure that the whole system is built up coherently and that all the hard and soft interfaces are integrated. The environment also has to cope with neutrons, activation, maintenance and ultra-high vacuum. Together, these constraints create a complex design path in which components, and hence the diagnostics, must be designed to be secure, cost-effective and reliable. This talk will focus on the approaches and challenges of implementing a full suite of diagnostics on ITER.
        Speaker: Dr Michael Walsh (ITER)
      • 09:55
        Real Time Control Of Suspended Test Masses In Advanced Virgo Laser Interferometer 40m

        The Virgo seismic isolation system is composed of 10 complex mechanical structures named “Superattenuators”, or simply “Suspensions”, that isolate the optical elements of the Virgo interferometer from seismic noise at frequencies above a few Hz. Each structure can be described by a model with 80 vibrational modes and is controlled by 24 coil-magnet pair actuators. The suspension status is observed using 20 local sensors plus 3 global sensors, which are available when the Virgo interferometer is locked, that is, when all optical lengths are controlled.

        From the very beginning we have made extensive use of digital control techniques implemented on custom Digital Signal Processor boards and software tools designed and developed within our group. With Advanced Virgo we are now at the third generation of Suspension Control Systems, and our data conditioning, conversion and processing boards are developed in accordance with a custom variation of MicroTCA.4.

        Results of our design and development efforts will be presented, focusing mainly on real-time issues and the key performance figures of the overall system.

        Speaker: Dr Alberto Gennai (INFN)
    • 10:35 10:55
      Break: Coffee
    • 10:55 11:35
      Opening Session 2 Centro Congressi (Padova)


      Conveners: David Abbott (Jefferson Lab), Stefan Ritt (Paul Scherrer Institut (CH))
      • 10:55
        Tide control on Venice Lagoon 40m

        The lagoon of Venice, facing the North Adriatic Sea, is the largest in the Mediterranean. It covers about 55 km² and has an average depth of 1.5 m. The maximum astronomical tidal excursion is about ±50 cm around mean sea level. Twice a day, tides flush water through the lagoon, carrying oxygenated water in and de-oxygenated water out. This “breath” is essential to biological life and to the associated ecosystem services. However, under some meteorological conditions the sea level can be substantially higher (the maximum recorded was +2 m above the mean value, in 1966). On these occasions the historic city of Venice, located in the centre of the lagoon, is flooded. Climate change and local anthropogenic pressures have intensified flooding events over the last century. They can occur more than 20 times a year, with different levels of severity. Meteorological conditions can be forecast, with a precision that increases in the few hours before the event. The municipality operates an alert system and provides services such as raised walkways.
        However, the only way to protect Venice from any flooding, including the most severe events, is to temporarily close off the entry of sea water into the lagoon when needed and for as long as necessary, i.e. until the sea level has returned to a “safe” level.
        For this reason, a complex system of mobile barriers at the lagoon inlets (MOSE) is being constructed, funded by the Italian State and controlled by the Ministry of Infrastructure and Transport. Work began in 2003 and is continuing in parallel at the Lido, Malamocco and Chioggia inlets. The worksites are now in the final stage. The first barrier (North Lido) has been completed with the installation of the housing caissons and of the 21 gates. At the other barriers (South Lido, Malamocco and Chioggia), work will be completed by 2018. At that point, the lagoon of Venice will become the first “regulated lagoon” in the world.

        Speaker: Dr Pierpaolo Campostrini (MOSE)
    • 11:35 12:15
      RT simulation and RT safety and security Centro Congressi (Padova)


      Conveners: Martin Lothar Purschke (Brookhaven National Laboratory (US)), Satoru Yamada (KEK)
      • 11:35
        Design and Implementation of EAST Data Visualization in VEAST System 20m
        The Experimental Advanced Superconducting Tokamak (EAST) device began operation in 2006. EAST’s inner structure is very complicated and contains many subsystems with a variety of functions. To facilitate understanding of the device and of experimental information, and to promote the development of the experiment, the virtual EAST (VEAST) system has established an EAST virtual reality scene in which the user can roam and access information by interacting with the system. However, experiment-related parameter information, diagnostic information and magnetic measurement information are displayed in the form of charts, figures, tables and two-dimensional graphics. To express the experimental results more directly, three-dimensional data visualizations are created using computer graphics technology. Data visualization is the process of visualizing data based on the characteristics of the data, the selection of an appropriate data structure and the proper sequence of the visualization pipeline. We use the Visualization Toolkit (VTK) to implement data visualization in the VEAST system and give the detailed steps for visualizing the plasma column, the electron cyclotron emission diagnostic and the plasma magnetic field. In addition, a general format is defined so that users can organize their data and visualize it in our system.
        Speaker: Prof. Xiao Bingjia (Hefei Institutes of Physical Science, Chinese Academy of Sciences)
      • 11:55
        Model based fast protection system for High Power RF tube amplifiers used at European XFEL accelerator 20m
        The driving engines of the superconducting accelerator of the European X-ray Free-Electron Laser (XFEL) are 27 Radio Frequency (RF) stations. Each underground RF station consists of a multi-beam horizontal klystron which can provide up to 10 MW of power at 1.3 GHz. Klystrons are sensitive devices with limited lifetime and high mean time between failures. In real operation the lifetime of the tube can be significantly reduced by failures. To minimize the influence of service conditions on the klystron's lifetime, a special fast protection system named the Klystron Lifetime Management System (KLM) has been developed. The main task of this system is to detect, as quickly as possible, all events which can destroy the tube and to switch off the driving signal. Detection of events is based on comparing a model of the high-power RF amplifier with real signals. All algorithms are implemented in a Field Programmable Gate Array (FPGA). For the XFEL, the implementation of KLM is based on the standard Low Level RF (LLRF) MicroTCA technology (MTCA.4 or xTCA). This article focuses on the klystron model estimation for the protection system and the implementation of KLM in an FPGA on the MTCA.4 architecture.
        Speaker: Łukasz Butkowski (DESY)
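The model-comparison idea described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the FPGA implementation: the linear gain model, the relative threshold and the names `klystron_model` and `first_fault_index` are assumptions made for the example.

```python
def klystron_model(drive_amplitude, gain=100.0):
    """Hypothetical linear amplifier model: expected output for a given drive."""
    return gain * drive_amplitude


def first_fault_index(drive, measured, threshold=0.1, gain=100.0):
    """Return the index of the first sample whose measured output deviates
    from the model prediction by more than `threshold` (relative), or None."""
    for i, (d, m) in enumerate(zip(drive, measured)):
        expected = klystron_model(d, gain)
        # Guard against division by zero when the model predicts no output.
        scale = max(abs(expected), 1e-12)
        if abs(m - expected) / scale > threshold:
            return i  # a real system would switch off the drive signal here
    return None


# Assumed example data: the third sample drops far below the model prediction.
drive = [0.10, 0.10, 0.10, 0.10]
measured = [10.0, 10.2, 6.0, 10.1]
fault_at = first_fault_index(drive, measured)
print("first fault at sample:", fault_at)
```

A real klystron model would be nonlinear and estimated from measurements, as the abstract indicates; the sketch only shows the compare-and-trip structure.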
    • 12:15 13:30
      Break: Lunch
    • 13:30 14:05
      Bus Transfer to Palazzo Bo


    • 14:05 14:15
      Welcome words 10m
    • 14:15 15:15
      Upgrades 1 Palazzo Bo (Padova)


      Conveners: Martin Grossmann, Dr Sascha Schmeling (CERN)
      • 14:15
        The ALICE C-RORC GBT card, a prototype read-out solution for the ALICE upgrade. 30m
        ALICE (A Large Ion Collider Experiment) is the detector system at the LHC (Large Hadron Collider) optimized for the study of heavy-ion collisions at interaction rates up to 50 kHz and data rates beyond 1 TB/s. Its main aim is to study the behavior of strongly interacting matter and the quark-gluon plasma. ALICE is preparing a major upgrade and, starting from 2021, it will collect data with several upgraded sub-detectors (TPC, ITS, Muon Tracker and Chamber, TRD and TOF). The ALICE DAQ read-out system will be upgraded as well, with a new read-out link called GBT (GigaBit Transceiver) with a maximum speed of 4.48 Gb/s and a new PCIe Gen3 x16 interface card called CRU (Common Read-out Unit). Several test beams have been scheduled for the test and characterization of the prototypes or parts of new detectors. The test beams usually last for a short period of one or two weeks, so it is very important to use a stable read-out system to make the most of the data-taking period and collect as much statistics as possible. The ALICE DAQ and CRU teams proposed a data acquisition chain based on the current ALICE DAQ framework in order to provide a reliable read-out system. The new GBT link, transferring data from the front-end electronics, will be directly connected to the C-RORC, the current read-out PCIe card used in the ALICE experiment. The ALICE DATE software is a stable solution that has been in production for more than 10 years. Moreover, most of the ALICE detector developers are already familiar with the software and its different analysis tools. This setup will allow the detector teams to focus on testing their detectors and electronics without worrying about the stability of the data acquisition system. An additional development has been carried out with a C-RORC-based Detector Data Generator (DDG). The DDG has been designed to be a realistic data source for the GBT. It generates simulated events in continuous mode and sends them to the DAQ system through the optical fibers, at a maximum of 4.48 Gb/s per GBT link. This hardware tool will be used to test and verify the correct behavior of the new DAQ read-out card, the CRU, once it becomes available to the developers. Indeed, the CRU team will not have real detector electronics with which to perform communication and performance tests, so it is vital during the test and commissioning phase to have a data generator able to simulate the behavior of the front-end electronics (FEE). This contribution will describe the firmware and software features of the proposed read-out system, and it will explain how the read-out chain will be used in future tests and how it can help the development of the new ALICE DAQ software.
        Speaker: Filippo Costa (CERN)
      • 14:45
        The new Global Muon Trigger of the CMS experiment 30m
        For the 2016 physics data runs, the L1 trigger system of the CMS experiment is undergoing a major upgrade to cope with the increasing instantaneous luminosity of the CERN LHC whilst maintaining a high event selection efficiency for the CMS physics program. Most subsystem-specific trigger processor boards are being replaced with powerful general-purpose processor boards conforming to the MicroTCA standard, whose tasks are performed by firmware on an FPGA of the Xilinx Virtex 7 family. Furthermore, the muon trigger system is changing from a subsystem-centred approach, where each of the three muon detector systems provides muon candidates to the global muon trigger, to a region-based system, where muon track finders combine information from the subsystems to generate muon candidates in three detector regions that are then sent to the upgraded global muon trigger. The upgraded global muon trigger receives up to 108 muons from the sector processors of the muon track finders in the barrel, overlap and endcap detector regions. The muons are sorted and duplicates are identified for removal in two steps. The first step treats muons from different sector processors of a track finder in one detector region. Muons from track finders in different detector regions are compared in the second step. Using energy sums from the calorimeter trigger, an isolation variable is calculated and added to each muon before the best eight are sent to the upgraded global trigger, where the final trigger decision is taken. The upgraded global muon trigger algorithm is implemented on one of the general-purpose processor boards, which uses about 70 optical links at 10 Gb/s to receive the input data from the muon track finders and the calorimeter energy sums, and to send the selected muon candidates to the upgraded global trigger. The design of the upgraded global muon trigger in the context of the CMS L1 trigger upgrade, and experience from commissioning and data taking with the new system, are presented.
        Speaker: Thomas Reis (CERN)
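The sort, duplicate-removal and best-eight selection described in this abstract can be illustrated with a simplified Python sketch. It is not the CMS firmware algorithm: the `Muon` fields, the ranking by (quality, pt), and the fixed eta/phi matching windows are assumptions chosen for the example, and the phi wrap-around is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Muon:
    pt: float      # transverse momentum (arbitrary units)
    eta: float     # pseudorapidity
    phi: float     # azimuthal angle (rad)
    quality: int   # hypothetical track-quality code, higher is better


def select_muons(muons, max_out=8, d_eta=0.1, d_phi=0.1):
    """Rank candidates, drop any that fall within the matching window of an
    already kept (better-ranked) muon, and return at most `max_out` muons."""
    ranked = sorted(muons, key=lambda m: (m.quality, m.pt), reverse=True)
    kept = []
    for mu in ranked:
        is_duplicate = any(
            abs(mu.eta - k.eta) < d_eta and abs(mu.phi - k.phi) < d_phi
            for k in kept
        )
        if not is_duplicate:
            kept.append(mu)
        if len(kept) == max_out:
            break
    return kept


# Assumed example: the second candidate is a lower-quality copy of the first.
candidates = [
    Muon(pt=20.0, eta=0.50, phi=1.00, quality=12),
    Muon(pt=18.0, eta=0.52, phi=1.03, quality=8),
    Muon(pt=5.0, eta=-1.20, phi=2.00, quality=12),
]
selected = select_muons(candidates)
print(len(selected), "muons kept")
```

The abstract describes two duplicate-removal steps (within one track finder, then across detector regions); the sketch collapses them into a single pass for brevity.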
    • 15:15 15:30
      Break: Coffee
    • 15:30 16:40
      Upgrades 2 Palazzo Bo (Padova)


      Conveners: Masaharu Nomachi, Wolfgang Kuehn (Justus-Liebig-Universitaet Giessen (DE))
      • 15:30
        The development of the Global Feature Extractor for the LHC Run-3 upgrade of the ATLAS L1 Calorimeter trigger system 30m
        The Global Feature Extractor (gFEX) is one of several modules in the LHC Run-3 upgrade of the Level 1 Calorimeter (L1Calo) trigger system of the ATLAS experiment. It is a single Advanced Telecommunications Computing Architecture (ATCA) module for large-area jet identification, with three Xilinx UltraScale FPGAs for data processing and a system-on-chip (SoC) FPGA for control and monitoring. A pre-prototype board has been designed to verify all functionalities, and its performance has been tested and evaluated. As a major achievement, the high-speed links in the FPGAs are stable at 12.8 Gb/s with a Bit Error Ratio (BER) < 10^-15 (no error detected). The low-latency parallel GPIO (General Purpose I/O) buses for communication between FPGAs are stable at 960 Mb/s. In addition, the peripheral components of the SoC FPGA have been verified. After laboratory tests, a link-speed test with the LAr (Liquid Argon Calorimeter) Digital Processing Blade (LDPB) AMC card was carried out at CERN to determine the link speed to be used between the LAr and L1Calo systems. The links from the LDPB AMC card to gFEX run properly at both 6.4 Gb/s and 11.2 Gb/s. Test results from the pre-prototype board validate the gFEX technologies and architecture. The prototype board design with three UltraScale FPGAs is now under way, and the status of its development will be presented.
        Speaker: Weihao Wu (Brookhaven National Laboratory (US))
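To give a feeling for what a BER limit of 10^-15 with no detected errors implies, the following Python sketch estimates how many bits must be received error-free and how long that takes at 12.8 Gb/s. It uses the standard zero-error relation N = -ln(1 - CL) / BER; the 95% confidence level is an assumption, since the abstract does not state one.

```python
import math


def bits_for_ber_limit(ber_limit, confidence=0.95):
    """Number of error-free bits needed to claim BER < ber_limit at the given
    confidence level, from P(no errors in N bits) ~ exp(-N * BER)."""
    return -math.log(1.0 - confidence) / ber_limit


LINE_RATE_BPS = 12.8e9  # link speed quoted in the abstract

bits_needed = bits_for_ber_limit(1e-15)            # assumed 95% confidence
hours_needed = bits_needed / LINE_RATE_BPS / 3600.0

print(f"bits needed: {bits_needed:.2e}")
print(f"test time per link: {hours_needed:.0f} h")
```

Under these assumptions a single link has to run error-free for roughly two and a half days; running several links in parallel shortens the wall-clock time accordingly.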
      • 16:00
        SWATCH: Common software for controlling and monitoring the upgraded CMS Level-1 trigger 20m
        The Large Hadron Collider at CERN restarted in 2015 with a higher centre-of-mass energy of 13 TeV. The instantaneous luminosity is expected to increase significantly in the coming years. An upgraded Level-1 trigger system is being deployed in the CMS experiment in order to maintain the same efficiencies for searches and precision measurements as those achieved in the previous run. This system must be controlled and monitored coherently through software, with high operational efficiency. The legacy system is composed of approximately 4000 data processor boards of several custom application-specific designs. These boards have been controlled and monitored by a medium-sized distributed system of over 40 computers and 200 processes. The legacy trigger was organised into several subsystems; each subsystem received data from different detector systems (calorimeters, barrel/endcap muon detectors), or with differing granularity. Only a small fraction of the control and monitoring software was common between the different subsystems; the configuration data was stored in a database, with a different schema for each subsystem. This large proportion of subsystem-specific software resulted in high long-term maintenance costs and a high risk of losing critical knowledge through the turnover of software developers in the Level-1 trigger project. The upgraded system is composed of a set of general-purpose boards that follow the MicroTCA specification and transmit data over optical links, resulting in a more homogeneous system. This system will contain on the order of 100 boards connected by 3000 optical links, which must be controlled and monitored coherently. The associated software is based on generic C++ classes corresponding to the firmware blocks that are shared across different cards, regardless of the role that the card plays in the system. A common database schema will also be used to describe the hardware composition and configuration data. Whilst providing a generic description of the upgrade hardware, its monitoring data and its control interface, this software framework (SWATCH) must also have the flexibility to allow each subsystem to specify different configuration sequences and monitoring data depending on its role. By increasing the proportion of common software, the upgraded system's software will require less manpower for development and maintenance. By defining a generic hardware description of significantly finer granularity, the SWATCH framework will be able to provide a more uniform graphical interface across the different subsystems compared with the legacy system, simplifying the training of the shift crew, on-call experts and other operations personnel. We present the design of the control software for the upgraded Level-1 trigger, and experience from using this software to commission the upgraded system.
        Speaker: Tom Williams (STFC - Rutherford Appleton Lab. (GB))
      • 16:20
        Unified Communication Framework (UCF) 20m
        UCF is a unified network protocol and FPGA firmware for high-speed serial interfaces employed in data acquisition systems. It provides up to 64 different communication channels via a single serial link. One channel is reserved for timing and trigger information, whereas the other channels can be used for slow control interfaces and data transmission. All channels are bidirectional and share network bandwidth according to assigned priority. The timing channel distributes messages with fixed and deterministic latency in one direction; from this point of view the protocol implementation is asymmetrical. The precision of the timing channel is defined by the jitter of the recovered clock and is typically on the order of 10-20 ps RMS. The timing channel has the highest priority, and a slow control interface should use the second-highest priority channel in order to avoid long delays due to high traffic on other channels. The framework supports point-to-point connections and star-like 1:n topologies, the latter only for optical networks with a passive splitter. It always employs one of the connection parties as a master and the others as slaves. The star-like topology can be used for front-ends with low data rates or pure time distribution systems. In this case the master broadcasts information according to assigned priority, whereas the slaves communicate with the master in a time-sharing manner. Within the OSI model, UCF spans layers one to three: the physical, data link and network layers. The framework can be used in trigger and fast data processes as well as slow data or monitoring processes. It can be applied to every kind of experiment, be it the neutron lifetime experiment PENeLOPE in Munich, the Belle II experiment at KEK or the COMPASS experiment at CERN. In all these experiments UCF is currently being implemented and tested. The project is supported by the Maier-Leibnitz-Laboratorium (Garching), the Deutsche Forschungsgemeinschaft and the Excellence Cluster "Origin and Structure of the Universe".
        Speaker: Dominic Maximilian Gaisbauer (Technische Universitaet Muenchen (DE))
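How several channels might share one serial link by priority can be sketched as follows. This is a toy Python model, not the UCF firmware: strict-priority arbitration, the convention that a lower channel number means higher priority, and the function name `arbitrate` are all assumptions; the abstract only states that bandwidth is shared according to assigned priority.

```python
from collections import deque


def arbitrate(queues):
    """Drain per-channel queues onto one link, one word per slot.
    In every slot the non-empty channel with the lowest number (assumed to be
    the highest priority) is served. Returns the (channel, word) send order."""
    order = []
    while any(queues.values()):
        for channel in sorted(queues):
            if queues[channel]:
                order.append((channel, queues[channel].popleft()))
                break  # one word per link slot, then re-evaluate priorities
    return order


# Assumed example: channel 0 = timing, 1 = slow control, 2 = data.
queues = {
    0: deque(["T0"]),
    1: deque(["SC0", "SC1"]),
    2: deque(["D0"]),
}
sent = arbitrate(queues)
print(sent)
```

With strict priority a busy high-priority channel can starve the others, which is consistent with the advice in the abstract to put slow control on the second-highest priority channel.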
    • 16:40 18:00
      Fast data Transfer links and networks Palazzo Bo (Padova)


      Conveners: Stefan Ritt (Paul Scherrer Institut (CH)), Prof. Zhen-An Liu (IHEP,Chinese Academy of Sciences (CN))
      • 16:40
        MicroTCA.4 based data acquisition system for KSTAR Tokamak 20m
        The Korea Superconducting Tokamak Advanced Research (KSTAR) control system comprises various heterogeneous hardware platforms. This diversity of platforms raises a maintenance issue. In addition, the limited data throughput rate and the need for higher-quality data drive us to find the next-generation control platform. Investigation shows that many leading experiments in the field of high energy physics are seriously pursuing modern, high-performance, open-architecture modular designs. MTCA.4, an extension of the Micro Telecommunications Computing Architecture (MTCA) initiated by the physics community, provides a modular structure of high-speed links and allows flexible reconfiguration of system functionality. For the systematic standardization of real-time control at KSTAR, we developed a new functional digital controller based on the MTCA.4 standard. The KSTAR Multifunction Control Unit (KMCU, K-Z35) is realized using the Xilinx System-on-Chip (SoC) architecture. The KMCU development is the result of a successful international collaboration. The KMCU features a Xilinx Zynq-7000 SoC with an ARM processor, FPGA fabric with multi-gigabit transceivers, 1 GB of DDR3 memory, and a single VITA 57 FMC site reserved for future functional expansion. The KMCU is matched with a dedicated Rear Transition Module (RTM) with sites for two FMC-like analog Data Acquisition (DAQ) modules. The RTM pinout is compatible with the DESY standard Zone-3 interface, D1.0. The first DAQ system to be implemented is the Motional Stark Effect (MSE) diagnostic. The MSE DAQ system uses two analog modules, each with 16-channel simultaneous ADC sampling at 2 MSPS with programmable gain. By using a single RTM with many compatible high-performance analog modules, the programmable and reconfigurable KMCU takes advantage of the modular design concept. An internal EPICS IOC easily manages the selected hardware configuration. Another novel function of the KMCU is simultaneous two-point streaming data transmission. Some parts of the Plasma Control System (PCS) at KSTAR demand data from diagnostics for 3D reconstruction or specific data processing. The KMCU duplicates input data from a rear module and simultaneously transmits it to the CPU and to an external host system through the MTCA backplane and the front SFP+ interface. We are now developing a new, functionally optimized KMCU series. The paper presents the complete data acquisition system and commissioning results of the MSE diagnostic based on the MTCA.4 standard. We also introduce a conceptual design for a real-time processing node for the plasma control system based on the KMCU.
        Speaker: Dr Woong-ryol Lee (National Fusion Research Institute)
      • 17:00
        The readout system upgrade for the LHCb experiment 20m
        The LHCb experiment is designed to study differences between particles and anti-particles as well as very rare decays in the charm and beauty sector at the LHC. The detector will be upgraded in 2019, and a new trigger-less readout system has to be implemented in order to significantly increase its efficiency. In the new scheme, event building and event selection are carried out in software, and the event filter farm receives all data from every LHC bunch crossing. Another feature of the system is that data coming from the front-end electronics is delivered directly into the event builder's memory through a specially designed PCIe card called PCIe40. The PCIe40 board handles the data acquisition flow as well as the distribution of fast and slow controls to the detector front-end electronics. It embeds one of the most powerful FPGAs currently available on the market, with 1.2 million logic cells. The board has a bandwidth of up to 490 Gbit/s in both input and output over optical links and up to 100 Gbit/s over the PCI Express bus to the CPU. We will present how data flows through the board and to its associated server during event building. We will focus on specific issues regarding the design of the different firmware versions being developed for the FPGA, showing how to manage flows of 100 Gbit/s, and on the techniques put in place when different firmware versions are developed by distributed teams of sub-detector experts.
        Speaker: Paolo Durante (CERN)
      • 17:20
        Performance of the new DAQ system of the CMS experiment for run-2. 20m
        The data acquisition system (DAQ) of the CMS experiment at the CERN Large Hadron Collider assembles events at a rate of 100 kHz, transporting event data at an aggregate throughput of 100 GByte/s to the high-level trigger (HLT) farm. The HLT farm selects and classifies interesting events for storage and offline analysis at a rate of around 1 kHz. The DAQ system was redesigned during the accelerator shutdown in 2013/14. The motivation was twofold. Firstly, the compute nodes, networking and storage infrastructure had reached the end of their lifetime. Secondly, in order to handle higher LHC luminosities and event pileup, a number of sub-detectors are being upgraded, increasing the number of readout channels and replacing the off-detector readout electronics with a μTCA implementation. The new DAQ architecture takes advantage of the latest developments in the computing industry. For data concentration, 10/40 Gbit Ethernet technologies are used, as well as an implementation of a reduced TCP/IP in FPGA for reliable transport between custom electronics and commercial computing hardware. A 56 Gbps Infiniband FDR CLOS network has been chosen for the event builder, with a throughput of ~4 Tbps. The HLT processing is entirely file-based. This allows the DAQ and HLT systems to be independent, and allows the same framework to be used for the HLT as for the offline processing. The fully built events are sent to the HLT over 1/10/40 Gbit Ethernet via network file systems. A hierarchical collection of HLT-accepted events and monitoring metadata is stored in a global file system. The monitoring of the HLT farm is done with the Elasticsearch analytics tool. This paper presents the requirements, implementation and performance of the system. Experience is reported from the first year of operation in the LHC pp runs as well as in the heavy-ion Pb-Pb runs in 2015.
        Speaker: Jeroen Hegeman (CERN)
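The headline numbers in this abstract imply a mean event size and an HLT reduction factor that are easy to work out. The short Python calculation below assumes decimal units (1 GByte = 10^9 bytes) and treats the quoted rates as exact, which the abstract does not claim.

```python
# Figures quoted in the abstract
L1_ACCEPT_RATE_HZ = 100e3        # event-building rate: 100 kHz
EVENT_BUILDER_BYTES_PER_S = 100e9  # aggregate throughput: 100 GByte/s
HLT_OUTPUT_RATE_HZ = 1e3         # events kept for storage: about 1 kHz

# Mean size of one built event (assumption: decimal GByte)
mean_event_size_bytes = EVENT_BUILDER_BYTES_PER_S / L1_ACCEPT_RATE_HZ

# Factor by which the HLT reduces the event rate
hlt_reduction_factor = L1_ACCEPT_RATE_HZ / HLT_OUTPUT_RATE_HZ

print(f"mean event size: {mean_event_size_bytes / 1e6:.1f} MB")
print(f"HLT rate reduction: {hlt_reduction_factor:.0f}x")
```

So each built event averages about 1 MB, and the HLT keeps roughly one event in a hundred.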
      • 17:40
        RTM RF Backplane Extensions for MicroTCA.4 Crates – Concept and Performance Measurements 20m
        The idea of the Rear Transition Module (RTM) Backplane originated from the need to simplify cable management of a MicroTCA.4-based LLRF control system for the European XFEL project. The first RTM backplane (called an RF Backplane) was designed to distribute about a dozen precise RF and clock signals to µRTM cards. It quickly became clear that this backplane offers very powerful extension possibilities for the MTCA.4 standard and can be used more widely than for RF applications only. Nowadays, the RTM Backplane is compliant with the PICMG standard and is an optional crate extension. The RTM Backplane provides multiple links for high-precision clock and RF signals (DC to 6 GHz) to analog µRTM cards, together with distribution of a low-noise managed power supply and data transmission to RTM cards. In addition, the RTM backplane offers the possibility of adding so-called extended RTMs (eRTM) and RTM Power Modules (RTM-PM) to a 12-slot MicroTCA crate. Up to three 6 HE wide eRTMs and two RTM-PMs can be installed behind the front PM and MCH modules. An eRTM attached to the MCH via the Zone 3 connector is used for analog signal management on the RTM backplane. This eRTM also allows a powerful CPU to be installed to extend the processing capacity of the MTCA.4 crate. Three additional eRTMs provide significant space extensions of the MTCA.4 crate that can be used, e.g., for analog electronics designed to supply RF signals to the µRTMs. The RTM-PMs deliver a managed low-noise (separated from front crate PMs) analog bipolar power supply (+VV, -VV) for the µRTMs and a unipolar power supply for the eRTMs. This extends the functionality of the MicroTCA.4 crate and offers a unique performance improvement for analog front-end electronics. This paper covers the new concept of the RTM Backplane, a new implementation for the real-time LLRF control system and the performance evaluation of the designed prototype, including precise measurements of RF loss, impedance matching and crosstalk.
        Speaker: Krzysztof Czuba (Warsaw University of Technology)
    • 19:00 21:00
      Welcome Reception: Caffé Pedrocchi, in front of Palazzo Bo - University Caffe Pedrocchi (Padova)

    • 07:45 08:30
      Bus Transfer to Conference Venue

    • 08:30 10:20
      DAQ 1 / Front End Electronics Centro Congressi (Padova)

      Conveners: Denis Calvet (CEA/IRFU,Centre d'etude de Saclay Gif-sur-Yvette (FR)), Dr Marco Bellato (INFN - Padova)
      • 08:30
        Register-Like Block RAM: Implementation, Testing in FPGA and Applications for High Energy Physics Trigger Systems 30m
        In high energy physics experiment trigger systems, block memories are used for various purposes, especially in indexed searching algorithms. It is often necessary to globally reset all memory locations between events, a feature not supported by regular block memories. Another common requirement is to update the contents of any memory location in a single clock cycle. Both requirements can be fulfilled with registers, but the cost of using registers for a large memory is unaffordable. In this paper, a register-like block memory design scheme is presented, which allows updating memory locations in a single clock cycle and effectively resetting the entire memory within a single clock cycle. The implementation and test results are also discussed.
        Speaker: Jinyuan Wu (Fermi National Accelerator Lab. (US))
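
        The abstract above does not say how the single-cycle global reset is obtained. One widely used way to get the same effect is to store an epoch tag next to every word and to treat any word whose tag differs from the current epoch as cleared. The sketch below is a behavioural model of that idea only; it is not necessarily the scheme presented in the talk, and the class and method names are hypothetical.

```python
class EpochTaggedMemory:
    """Behavioural model of a memory with an O(1) global reset.

    Assumption: every word carries an epoch tag. A real FPGA design would
    use a finite tag width and must handle tag wrap-around, which this
    sketch ignores by using an unbounded Python integer.
    """

    def __init__(self, depth):
        self.data = [0] * depth
        self.tags = [0] * depth
        self.epoch = 1  # tags start at 0, so every word reads as cleared

    def reset(self):
        # A single counter increment invalidates every stored word at once.
        self.epoch += 1

    def write(self, addr, value):
        # One write updates the data word and marks it valid for this epoch.
        self.data[addr] = value
        self.tags[addr] = self.epoch

    def read(self, addr):
        # Words written in an earlier epoch look like freshly reset words.
        return self.data[addr] if self.tags[addr] == self.epoch else 0
```

        The cost of this approach is the extra tag bits per word and a comparator on the read path, which is usually far cheaper than building the whole memory from registers.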
      • 09:00
        A 3.9 ps RMS Resolution Time-to-Digital Convertor Using Dual-sampling Method on Kintex UltraScale FPGA 20m
        The principle of tapped-delay line (TDL) style field programmable gate array (FPGA)-based time-to-digital converters (TDC) requires finer delay granularity for higher time resolution. Given a tapped delay line constructed with carry chains in an FPGA, it is desirable to find a way of subdividing the intrinsic delay elements further, so that the TDC can achieve a time resolution beyond its cell delay. In this paper, after exploring the available logic resources in the Xilinx Kintex UltraScale FPGA, we propose a dual-sampling method in which the TDL status is sampled twice. The effect of the new method is equivalent to doubling the number of taps in the delay line, so a significant improvement in time resolution is expected. Two TDC channels have been implemented in a Kintex UltraScale FPGA and the effectiveness of the new method is investigated. For fixed time intervals in the range from 0 to 440 ns, the average time resolutions measured by the two TDC channels are 3.9 ps with the dual-sampling method and 5.8 ps with the conventional single-sampling method. In addition, the TDC design maintains the advantages of multichannel capability and high measurement throughput of our previous design. Every part of the TDC, including dual-sampling, code conversion and on-line calibration, can run at a 500 MHz clock frequency.
        Speakers: Dr Chong Liu (University of Science and Technology of China), Prof. Yonggang Wang (University of Science and Technology of China)
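
        The abstract mentions on-line calibration without giving details. A common calibration for tapped-delay-line TDCs is the code density test: hits that are uncorrelated with the clock fill each bin in proportion to its width. The sketch below illustrates that general technique, assuming the delay line spans one 2000 ps period of the 500 MHz clock; the function names are hypothetical and this is not claimed to be the authors' implementation.

```python
def bin_widths_ps(hit_counts, clock_period_ps=2000.0):
    """Estimate each TDL bin width from a code density histogram.

    hit_counts[i] is the number of random (clock-uncorrelated) hits that
    landed in bin i. A bin's width is its share of the clock period.
    """
    total = sum(hit_counts)
    return [clock_period_ps * count / total for count in hit_counts]


def bin_centers_ps(widths):
    """Map each bin number to the time at the centre of that bin."""
    centers = []
    left_edge = 0.0
    for width in widths:
        centers.append(left_edge + width / 2.0)
        left_edge += width
    return centers
```

        The centre table acts as a lookup from raw bin code to calibrated time, which absorbs the uneven bin widths typical of FPGA carry chains.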
      • 09:20
        Timing distribution and Data Flow for the ATLAS Tile Calorimeter Phase II Upgrade 20m
        The Hadronic Tile Calorimeter (TileCal) detector is one of the several subsystems composing the ATLAS experiment at the Large Hadron Collider (LHC). The LHC upgrade program plans an increase to about five times the nominal LHC instantaneous luminosity, culminating in the High Luminosity LHC (HL-LHC). In order to adapt the detector to the new HL-LHC parameters, the TileCal read-out electronics is being redesigned, introducing a new read-out strategy with a fully digital trigger system. In the new read-out architecture, the front-end electronics comprises the MainBoards and the DaughterBoards. The MainBoard digitizes the analog signals coming from the PhotoMultiplier Tubes (PMTs), provides integrated data for minimum bias monitoring and includes electronics for PMT calibration. The DaughterBoard receives and distributes Detector Control System (DCS) commands, clock and timing commands to the rest of the elements of the front-end electronics, and collects and transmits the digitized data to the back-end electronics at the LHC bunch-crossing frequency (every ~25 ns). The TileCal PreProcessor (TilePPr) is the first element of the back-end electronics. It receives and stores the digitized data from the DaughterBoards in pipeline memories to cope with the latencies and rates specified in the new ATLAS DAQ architecture. The TilePPr interfaces between the data acquisition, trigger and control systems and the front-end electronics. In addition, the TilePPr distributes the clock and timing commands to the front-end electronics for synchronization with the LHC clock with fixed and deterministic latency. The complete new read-out architecture is being evaluated in a Demonstrator system in several Test Beam campaigns during 2015 and 2016. At the end of this year, a complete TileCal module with the upgraded electronics will be inserted in the ATLAS detector.
        This contribution gives a detailed description of the timing distribution and data flow in the new read-out architecture for the TileCal Phase II Upgrade and presents the status of the hardware and firmware developments of the upgraded front-end and back-end electronics, together with preliminary results of the TileCal demonstrator program.
        Speaker: Fernando Carrio Argos (Instituto de Fisica Corpuscular (ES))
      • 09:40
        The TOTEM precision clock distribution system. 20m
        To further extend the measurement potential of the experiment at luminosities where pile-up and multiple tracks in the proton detectors make it difficult to identify and disentangle real diffractive events from other event topologies, TOTEM has proposed to add a timing measurement capability to measure the time-of-flight difference between the two outgoing protons. Such precise timing measurements require a clock distribution system that delivers time information at spatially separated points with picosecond-range precision. For the clock distribution task, TOTEM will adopt an adaptation of the Universal Picosecond Timing System, developed for the FAIR (Facility for Antiproton and Ion Research) facility at GSI and currently installed as the BUTIS system. In this system an optical network, using the dense wavelength division multiplexing (DWDM) technique, is used to transmit two reference clock signals from the counting room to a grid of receivers in the tunnel. To these clocks another signal is added that is reflected back and used to continuously measure the delays of every optical transmission line; these delay measurements will be used to correct the time information generated at the detector location. The use of DWDM makes it possible to transmit multiple signals, generated at different wavelengths, over a common single-mode fiber. Moreover, it allows the use of standard telecommunication modules conforming to international standards such as those of the ITU (International Telecommunication Union). The prototype of this system showed that the influence of the transmission system on the jitter is negligible and that the total jitter of the clock transmission is practically due to the inherent jitter of the clock sources and the end-user electronics. By the time of the Conference, the system will be commissioned in the interaction point 5 (IP5) TOTEM control room. Details on the system design, tests and characterization will be given in this contribution.
        Speakers: Francesco Cafagna (Universita e INFN, Bari (IT)), Michele Quinto (Universita e INFN-Bari (IT))
      • 10:00
        A Programmable Read-out chain for Multichannel Analog front-end ASICs 20m
        In this contribution we introduce an innovative multiplexed ASIC read-out system based on 32 analog channels sampling at 40 MHz with 12-bit resolution and 96 digital I/Os with a selectable voltage standard ranging from differential signaling to 1.8 or 3.3 V CMOS. The ADC and ASIC read-out is managed by a Kintex-7 FPGA, and the communication with the host computer relies on the fast USB 3 communication protocol. The main feature of this system is a new approach to FPGA firmware development based on an easy-to-use graphical interface, which can carry out all the functions required of an ASIC read-out system (for instance state machines, triggers, counters, time delays, and so on) without the need to write any hardware description code. Furthermore, we also introduce a cloud compiling service, which spares the user from installing the FPGA development environment in order to create a measurement setup based on this read-out system.
        Speaker: Mr Francesco Caponio (Nuclear Instruments)
    • 10:20 10:40
      Break: Coffee
    • 10:40 12:10
      Upgrades 3 Centro Congressi (Padova)

      Conveners: Christian Bohm (Stockholm University (SE)), Lorne Levinson (Weizmann Institute of Science (IL))
      • 10:40
        FELIX: the new detector readout system for the ATLAS experiment 20m
        From the ATLAS Phase-I upgrade onward, new or upgraded detectors and trigger systems will be interfaced to the data acquisition, detector control and timing (TTC) systems by the Front-End Link eXchange (FELIX). FELIX is the core of the new ATLAS Trigger/DAQ architecture. Functioning as a router between custom serial links and a commodity network, FELIX is implemented by server PCs with commodity network interfaces and PCIe cards with large FPGAs and many high-speed serial fiber transceivers. By separating data transport from data manipulation, the latter can be done by software in commodity servers attached to the network. By replacing traditional point-to-point links between Front-end components and the DAQ system with a switched network, FELIX provides scaling, flexibility, uniformity and upgradability. Different Front-end data types or different data sources can be routed to different network endpoints that handle that data type or source: e.g. event data, configuration, calibration, detector control, monitoring, etc. This reduces the diversity of custom hardware solutions in favour of software. Front-end connections can be either high-bandwidth serial connections from FPGAs (e.g. 10 Gb/s) or those from the radiation-tolerant CERN GBTx ASIC, which aggregates many slower serial links onto one 5 Gb/s high-speed link. Already in the Phase-I Upgrade there will be about 2000 fiber connections. In addition to connections to a commodity network and Front-ends, FELIX receives Timing, Trigger and Control (TTC) information and distributes it with fixed latency to the GBTx connections. As part of the FELIX implementation, the firmware and Linux software for a high-efficiency PCIe DMA engine have been developed. The system architecture of FELIX will be described; the results of the demonstrator program and the first prototype, along with future plans, will be presented.
        Speaker: Julia Narevicius (Weizmann Institute of Science (IL))
      • 11:00
        Upgrade of the TOTEM data acquisition system for the LHC's Run Two 20m
        The TOTEM (TOTal cross section, Elastic scattering and diffraction dissociation Measurement at the LHC) experiment at the LHC has been designed to measure the total proton-proton cross-section with a luminosity-independent method, based on the optical theorem, and to study elastic and diffractive scattering at the LHC energy. To cope with the increased intensity of the LHC Run 2 phase, and the increase in statistics required by the extension of the TOTEM physics program approved for the 2016 run campaign, the previous VME-based DAQ has been replaced by a new one based on the Scalable Readout System (SRS). The system is composed of 16 SRS-FECs and one SRS-SRU; it features a throughput of ~120 MB/s, saturating the SRS-FEC 1 Gb/s link, for an overall 2 GB/s data transfer rate into the online PC farm. This guarantees a baseline maximum trigger rate of ~24 kHz, to be compared with the 1 kHz of the previous VME-based system. This trigger rate will be further improved, up to 100 kHz, by implementing a second-level trigger algorithm in the SRS-SRU. The new system design fulfills the requirements for increased efficiency, providing higher bandwidth and increasing the purity of the recorded data by supporting both a zero-suppression algorithm and a second-level trigger based on pattern recognition algorithms implemented in hardware. Moreover, full compatibility with the legacy front-end hardware has been guaranteed, as well as the interface with the CMS experiment DAQ and the LHC Timing, Trigger and Control (TTC) system. A complete redesign of the firmware, leveraging industrial-strength firmware technologies, has been undertaken to provide a set of common interfaces and services between the standard system modules and the specific ones of the user's application.
        This allows efficient development and easier insertion of different zero-suppression and second-level trigger algorithms, and the sharing of firmware blocks between different SRS components. Furthermore, to avoid packet losses and improve the reliability of the UDP data transmission, a solution has been adopted that uses Ethernet flow control and a New API (NAPI) mode driver, featuring a ticketing algorithm at the application layer. In this contribution we will describe in detail the full system and its performance during the commissioning phase at the LHC Interaction Point 5 (IP5).
        Speaker: Michele Quinto (Universita e INFN-Bari (IT))
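
        The throughput figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The per-event size at the end is a derived estimate that assumes the links are saturated at the quoted ~24 kHz trigger rate; it is not a number given in the abstract, and all variable names are illustrative.

```python
FEC_COUNT = 16                # SRS-FECs in the system
FEC_THROUGHPUT_MB_S = 120.0   # quoted per-FEC throughput
LINK_GBIT_S = 1.0             # SRS-FEC Ethernet link speed
TRIGGER_RATE_HZ = 24_000.0    # quoted baseline maximum trigger rate

# Raw capacity of a 1 Gb/s link expressed in MB/s (8 bits per byte).
link_capacity_mb_s = LINK_GBIT_S * 1000.0 / 8.0

# Fraction of the raw link capacity used by the quoted throughput.
link_utilisation = FEC_THROUGHPUT_MB_S / link_capacity_mb_s

# Aggregate rate into the PC farm, close to the quoted ~2 GB/s.
total_mb_s = FEC_COUNT * FEC_THROUGHPUT_MB_S

# Derived estimate (assumption: links saturated at the maximum rate).
bytes_per_fec_per_event = FEC_THROUGHPUT_MB_S * 1e6 / TRIGGER_RATE_HZ

print(f"link utilisation: {link_utilisation:.0%}")
print(f"aggregate: {total_mb_s / 1000.0:.2f} GB/s")
print(f"per-FEC event fragment: {bytes_per_fec_per_event:.0f} bytes")
```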
      • 11:20
        Phase-I Trigger Readout Electronics Upgrade for the ATLAS Liquid-Argon Calorimeters 20m
        For the Phase-I luminosity upgrade of the LHC, a higher-granularity trigger readout of the ATLAS LAr Calorimeters is foreseen in order to enhance the trigger feature extraction and background rejection. The new readout system digitizes the detector signals, which are grouped into 34000 so-called Super Cells, with 12-bit precision at 40 MHz. The data is transferred via optical links to a digital processing system which extracts the Super Cell energies. A demonstrator version of the complete system has now been installed and operated on the ATLAS detector. The talk will give an overview of the Phase-I Upgrade of the ATLAS LAr Calorimeter readout and present the custom-developed hardware, including its role in real-time data processing and fast data transfer. This contribution will also report on the performance of the newly developed ASICs, including their radiation tolerance, and on the performance of the prototype boards in the demonstrator system based on various measurements with the 13 TeV collision data. Results of the high-speed link test with the prototype of the LAr Digital Processing Boards will also be reported.
        Speaker: Nicolas Chevillot (Centre National de la Recherche Scientifique (FR))
      • 11:40
        Large-scale DAQ tests for the LHCb upgrade 20m
        The Data Acquisition (DAQ) of the LHCb experiment will be upgraded in 2020 to a high-bandwidth triggerless readout system. In the new DAQ, event fragments will be forwarded to the Event Builder (EB) computing farm at 40 MHz. The front-end boards will therefore be connected directly to the EB farm through optical links and PCI Express based interface cards. The EB is required to provide a total network capacity of 32 Tb/s, using about 500 nodes. In order to reach the required network capacity we are testing various technologies and network protocols on large-scale clusters. For this purpose we developed an Event Builder implementation designed for an InfiniBand interconnect infrastructure. We present the results of the measurements performed to evaluate throughput and scalability on HPC-scale facilities.
        Speaker: Antonio Falabella (Universita e INFN, Bologna (IT))
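
        To make the scale of the requirement above concrete, the following sketch derives the per-node bandwidth and the implied average event size from the quoted 32 Tb/s, about 500 nodes and 40 MHz. It assumes, purely for illustration, that the full capacity is spread evenly over the nodes and carries only event data; the abstract states neither, and the function names are hypothetical.

```python
def per_node_gbit_s(total_tbit_s, nodes):
    """Average bandwidth each node must sustain, in Gb/s."""
    return total_tbit_s * 1000.0 / nodes


def implied_event_size_kb(total_tbit_s, event_rate_mhz):
    """Average event size in kB if the capacity carried only event data."""
    bits_per_event = (total_tbit_s * 1e12) / (event_rate_mhz * 1e6)
    return bits_per_event / 8.0 / 1000.0


print(per_node_gbit_s(32.0, 500))          # Gb/s per Event Builder node
print(implied_event_size_kb(32.0, 40.0))   # kB per event
```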
    • 12:10 12:25
      Conference Photo
    • 12:25 13:25
      Break: Lunch Centro Congressi (Padova)

    • 13:25 14:45
      Mini Oral 1 Centro Congressi (Padova)

      Conveners: Christian Bohm (Stockholm University (SE)), Rejean Fontaine (Université de Sherbrooke)
    • 14:45 15:00
      Break: Coffee
    • 15:00 16:30
      Poster session 1 Centro Congressi (Padova)

      • 15:00
        A Calculation Software Based on Pipe-and-Filter Architecture for the 4πβ-γ Digital Coincidence Counting Equipment 1h 30m
        The 4πβ-γ coincidence efficiency extrapolation method is the most popular method for the absolute determination of radioactivity. In this paper, calculation software based on a pipe-and-filter architecture for the 4πβ-γ digital coincidence counting (DCC) equipment is presented. The equipment can handle four 500 MSPS channels with 8-bit resolution and four 62.5 MSPS channels with 16-bit resolution, with ±1 ns synchronization accuracy. However, digitizing pulse trains at high speed and high resolution makes control and processing challenging. At the same time, it allows the DCC system to digitize and process the pre-amplifier pulses themselves, which avoids the loss of information inherent in the operation of pulse-shaping amplifiers. To meet these new challenges and demands, the software is designed in a pipe-and-filter architecture, which supports reuse and concurrent execution. Results indicate that this design is effective and easy to implement and extend for real-time acquisition control and off-line processing.
        Speaker: Zhiguo Ding (University of Science and Technology of China, Department of Modern Physics)
      • 15:00
        A coprocessor for the Fast Tracker Simulation 1h 30m
        The Fast Tracker (FTK) executes real-time tracking for online event selection in the ATLAS experiment. Data processing speed is achieved by exploiting pipelining and parallel processing. Track reconstruction is executed on a 2-level pipelined architecture. The first stage, implemented on custom ASICs called Associative Memory (AM) Chips, performs Track Candidate (road) recognition in low resolution. The second stage, implemented on FPGAs (Field Programmable Gate Arrays), builds on the track candidate recognition, performing Track Fitting in full resolution. The use of such parallelized architectures for real-time event selection creates a huge new computing problem related to the analysis of the acquired samples. For each type of implemented trigger, millions of events have to be simulated to determine, within a small statistical margin of error, the efficiency and the bias of that trigger. The AM chip emulation is a particularly complicated task. This paper proposes the use of a hardware co-processor, in place of its simulation, to solve the problem. We report on the implementation and performance of all the functions complementary to the pattern matching in a modern, compact embedded system for track reconstruction. That system is a miniaturization of the complex FTK processing unit, and is also well suited for powering applications outside the realm of High Energy Physics.
        Speaker: Christos Gentsos (Aristotle Univ. of Thessaloniki (GR))
      • 15:00
        A Digital On-line Implementation of a Pulse-Shape Analysis Algorithm for Neutron-gamma Discrimination in the NEDA Detector 1h 30m
        Modern nuclear physics experiments involving fusion-evaporation reactions frequently require the detection of particles (alphas, protons, neutrons) which provide crucial information about the nucleus under study. Some reaction channels involving neutron detection have very low cross sections and require the use of large scintillator arrays which are also sensitive to gamma rays, so neutron-gamma discrimination (NGD) techniques must be applied. In addition, because of the high counting rates at which the experiments are carried out and the need to use digital electronics, the NGD is expected to be implemented digitally in the early stages in order to decrease the total data throughput, which would otherwise be produced mostly by the gamma rays. NGD has been widely used in a wide assortment of neutron detectors (e.g. the Neutron Wall) using analog electronics employing pulse-shape analysis (PSA) techniques such as the zero cross-over (ZCO) and charge-comparison (CC) methods. Because of the inherent limitations of analog electronics, an effort is being made to move these PSA methods for NGD to the digital domain using programmable devices such as FPGAs, so that a higher degree of flexibility and integration can be achieved without losing discrimination performance. In this paper we analyze the performance, complexity and resource usage of two widely used PSA algorithms (Zero-CrossOver and Charge Comparison) in order to implement them using digital electronics based on FPGAs. The chosen algorithm will be used in the new digital electronics of the NEDA (Neutron Detector Array) detector, currently in a development stage. Employing this algorithm in an FPGA is expected to provide a simple mechanism to discard a large number of gamma rays while preserving the flexibility and robustness that digital systems offer.
        Speaker: Francisco Javier Egea Canet
      • 15:00
        A hardware implementation of the Levinson routine in a radio detector of cosmic rays to improve a suppression of the non-stationary RFI 1h 30m
        The radio detector of ultra-high-energy cosmic rays in the Pierre Auger Observatory operates in the frequency range 30-80 MHz, which is often contaminated by human-made RFI. Several filters have been used to suppress the RFI: a filter based on the FFT, an IIR notch filter and an FIR filter based on linear prediction. The last refreshes its FIR coefficients, which are calculated either in an external ARM processor, in an internal soft-core NIOS processor implemented inside the FPGA, or in the hard-core embedded processor (HPS) that is a silicon part of the FPGA chip. Refresh times depend significantly on the type of calculation process used. For stationary RFI the FIR coefficients can be refreshed every minute or less often. However, efficient suppression of non-stationary short-term contamination requires a much faster response. Calculating the FIR coefficients takes several seconds on an external ARM and on the order of hundreds of milliseconds on the NIOS. The HPS allows the refresh time to be reduced to ~20 ms (for a 32-stage FIR filter). This is still not too long. The symmetry of the covariance matrix allows the much faster Levinson procedure to be used instead of the typical Gauss routine for solving a set of linear equations. Even when computed in the HPS, the Levinson procedure takes a relatively long time. A hardware implementation of this procedure inside the FPGA fabric, as a specialized microprocessor, requires only ~40 000 clock cycles. With a 200 MHz ADC and global FPGA clock, this corresponds to ~200 µs, two orders of magnitude less than for the HPS. We tested this algorithm in practice on the radio-detector Front-End Board and compared it with the previous approaches: FFT, IIR, NIOS and HPS. The Butterfly antenna with the LNA used in the Auger Engineering Radio Array served as the signal source. The code has been implemented in several different chips to compare speed and resource occupancy; the target, however, is the Cyclone V E FPGA 5CEFA9F31I7 used in the Front-End Board for the Pierre Auger radio detector.
        The FIR filter should operate on the fly, that is, with the same clock as the ADCs. In order to avoid aliasing, according to the Nyquist rule the sampling frequency should be at least twice the highest frequency in the signal spectrum. The spectrum is limited by the band-pass filters to 30-80 MHz. The sampling frequency selected for the radio detectors is 200 MHz. The hardware Levinson procedure does not need to operate with the same clock as the ADCs; doing so is recommended, however, in order to avoid temporary memories. The 200 MHz speed has been achieved in Stratix III FPGAs (speed grade -2). Cyclone V (speed grade -7) needs some more optimization and probably additional pipeline stages. Nevertheless, the algorithm can also be used for the FIR filter when operating with a clock lower than 200 MHz. The 180 MHz obtained at present in Cyclone V increases the refresh time by only ~10%. We plan to test the algorithm in real radio stations in the Argentinean pampas.
        Speaker: Zbigniew Szadkowski (University of Lodz)
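
        For readers unfamiliar with the Levinson procedure mentioned in the abstract above, here is a minimal software sketch of the textbook Levinson-Durbin recursion. It solves the Toeplitz normal equations of a linear predictor in O(n²) operations, where Gaussian elimination needs O(n³). It is a plain Python illustration under that textbook formulation, not the FPGA implementation described in the talk; the helper `refresh_time_us` simply redoes the clock-cycle arithmetic quoted in the abstract.

```python
def levinson_durbin(r, order):
    """Return (a, err) for the predictor x[n] ~ sum_k a[k] * x[n-1-k].

    r is the autocorrelation sequence r[0..order]. The coefficients a
    satisfy the Toeplitz system sum_k a[k] * r[|i-k|] = r[i+1].
    """
    a = [0.0] * order
    err = r[0]  # prediction error power of the order-0 predictor
    for m in range(order):
        # Reflection coefficient for growing the predictor to order m+1.
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        k_m = acc / err
        new_a = a[:]
        new_a[m] = k_m
        for k in range(m):
            new_a[k] = a[k] - k_m * a[m - 1 - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a, err


def refresh_time_us(clock_cycles, clock_mhz):
    """Time needed for a given number of clock cycles, in microseconds."""
    return clock_cycles / clock_mhz
```

        With the quoted ~40 000 cycles, `refresh_time_us(40_000, 200)` gives 200 µs, and the same cycle count at 180 MHz gives about 222 µs, roughly 11% longer, in line with the ~10% stated above.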
      • 15:00
        A High Frame Rate Test System for The HEPS-BPIX based on NI-sbRIO Board 1h 30m
        HEPS-BPIX is a pixel detector designed for the High Energy Photon Source (HEPS) in China. As a hybrid pixel detector, it consists of a silicon sensor and a readout chip which is bump-bonded to the sensor with indium. The detector contains an array of 104×72 pixels, each measuring 150 μm×150 μm. Each pixel of the readout chip comprises a preamplifier, a discriminator and a counter. Aimed at X-ray imaging, HEPS-BPIX works in single-photon counting mode; the counting depth of every pixel is 20 bits. The test system of the detector, which implements all the control, calibration, readout and real-time imaging, has been developed based on the NI-sbRIO board (sbRIO-9626). The field programmable gate array (FPGA) of the NI-sbRIO board deserializes the data from the pixel array and translates the clock as well as the serial configuration data to the detector. The FPGA firmware and the simple data acquisition (DAQ) system have been designed in the LabVIEW environment in order to shorten the development time. Through the LabVIEW-programmed DAQ software, the test system can control the signal generator over Ethernet to calibrate the detector automatically. It can also monitor the real-time image and change the configuration data, which makes debugging much easier. The test system has been used for the X-ray test and the beam line test of the detector. A series of X-ray images has been taken and a high frame rate of 1.2 kHz has been achieved. This paper will give the details of the test system and present results on the performance of the HEPS-BPIX.
        Speakers: Jie Zhang (Institute of High Energy Physics, Chinese Academy of Sciences), Jingzi Gu (Institute of High Energy Physics, Chinese Academy of Sciences)
      • 15:00
        A Hybrid Analog-digital Integrator for EAST Device 1h 30m
        A hybrid analog-digital integrator has been developed to be compatible with the long-pulse plasma discharges on the Experimental Advanced Superconducting Tokamak (EAST). A pair of analog integrators is used to integrate the input signal in turn, to reduce the error caused by the leakage of the integration capacitors, and the outputs of the two integrators are combined into a continuous integration signal by a Field Programmable Gate Array (FPGA) built into the digitizer. The integration drift is almost linear and stable at a controlled temperature, so a period of typically 50 s is used to determine the effective drift slope, which is used to correct the integration signal in real time. The data integrated in the internal FPGA can be transferred directly into the reflective memory installed in the same PCI eXtensions for Instrumentation (PXI) chassis. The test results show that the processed integration drift is less than 200 µV·s during 1000 s of integration, which will meet the accuracy requirements of magnetic diagnostics in EAST experimental campaigns.
        Speaker: Dr Yong Wang (Institute of Plasma Physics, Chinese Academy of Sciences)
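
        The drift correction described in the abstract above (measure the slope of the nearly linear drift over a reference period, then subtract it) can be sketched as follows. The least-squares fit and the function names are assumptions chosen for the illustration; the abstract does not state how the slope is actually computed in the FPGA.

```python
def drift_slope(times, values):
    """Least-squares slope of the integrator output over a reference period.

    Intended for data taken while the true input is zero, so that any
    trend in `values` is integrator drift (for example a 50 s window).
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def remove_drift(times, values, slope, t0=0.0):
    """Subtract the linear drift accumulated since time t0."""
    return [v - slope * (t - t0) for t, v in zip(times, values)]
```

        For example, a reference window whose output rises by 2 µV·s per second yields a slope of 2e-6, and subtracting `slope * t` from later samples restores a flat baseline.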
      • 15:00
        A new electronic board to drive the Laser calibration system of the ATLAS hadron calorimeter 1h 30m
        The LASER calibration system of the ATLAS hadron calorimeter aims at monitoring the ~10000 PMTs of the TileCal. The LASER light injected into the PMTs is measured by sets of photodiodes at several stages of the optical path. The monitoring of the photodiodes is performed by a redundant internal calibration system using an LED, a radioactive source, and a charge injection system. The LASer Calibration Rod (LASCAR) electronics card is a major component of the LASER calibration scheme. Housed in a VME crate, its main components include a charge ADC, a TTCRx, a HOLA part, an interface to control the LASER, and a charge injection system. The 13-bit ADC is a 2000 pC full-scale converter that processes up to 16 signals stemming from 11 photodiodes, 2 PMTs, and 3 charge injection channels. Two gains are used (x1 and x4) to increase the dynamic range and avoid saturation of the LASER signal at high intensities. The TTCRx chip (designed by CERN) retrieves LHC signals to synchronize the LASCAR card with the collider. The HOLA mezzanine (also designed by CERN) transmits LASER data fragments (e.g. digitized signals from the photodiodes) to the DAQ of ATLAS. The interface part is used during the pp collisions, when the LASER is flashed in empty bunch-crossings. A time correction may then be performed, depending on the LASER intensity requested. The charge injection part aims at monitoring the linearity of the photodiode preamplifiers by injecting a 5 V max signal with a 16-bit dynamic range. All these features are managed with a field-programmable gate array (FPGA Cyclone V) and a microcontroller (Microchip PIC32) equipped with an Ethernet interface to the Detector Control System (DCS) of ATLAS.
        Speaker: Philippe Gris (Univ. Blaise Pascal Clermont-Fe. II (FR))
      • 15:00
        A Time-to-Digital Converter Based on a Digitally Controlled Oscillator 1h 30m
        Time measurements play a crucial role in trigger and data acquisition systems (TDAQ) of High Energy Physics (HEP) experiments, where calibration, synchronization between signals and phase-measurement accuracy are often required. Although the various elements of a time measurement system are typically designed using a classical mixed-signal approach, state-of-the-art research is also focusing on all-digital architectures. The mixed-signal approach has the advantage of reaching better performance, but it requires more development time than a fully digital design. Moreover, porting analog IP blocks to a new technology typically requires a significant design effort compared with digital IPs. In this work, we present a fully digital TDC application, based on a synthesizable DCO, where the TDC measures the phase relationship between a timing signal and a 40 MHz reference clock. The DCO design is technology-independent: it is described by means of a hardware description language and it can be placed and routed with automatic tools. We present the TDC architecture, the DCO performance and the results of a preliminary implementation on a 130 nm ASIC prototype, in terms of output jitter, power consumption, frequency range, resolution, linearity and differential non-linearity. The TDC will be used in the new readout chip that is under development for the muon detector electronics upgrade in the LHCb experiment at CERN. The TDC presented in this paper has the fundamental task of measuring the phase difference between the 40 MHz LHC machine clock and a digital signal coming from the muon detector, in order to allow the phase shift of the detector signal according to the required resolution of the experiment.
        Speaker: Luigi Casu (Universita e INFN (IT))
      • 15:00
        Adaptive IIR-notch filter for RFI suppression in a radio detection of cosmic rays 1h 30m
        Radio stations can observe radio signals caused by coherent emission from geomagnetic radiation and charge-excess processes. The Auger Engineering Radio Array (AERA) observes the frequency band from 30 to 80 MHz. This range is highly contaminated by human-made RFI, so RFI filters are used in AERA to suppress this contamination and improve the signal-to-noise ratio. AERA currently uses IIR notch filters operating with fixed parameters and suppressing four narrow bands. They do not respond to new sources of RFI such as walkie-talkies, mobile communicators and other human-made transmitters. To increase the efficiency of the self-trigger, the signal should be cleaned of RFI. One source of RFI is narrow-band transmitters. This type of RFI can be significantly suppressed by digital filters after signal digitization in the ADCs. IIR filters are potentially unstable due to feedback; however, they are much shorter and more power-efficient than FIR filters. We implemented a NIOS virtual processor that calculates a new set of IIR filter coefficients, which are reloaded dynamically on the fly. The spectrum analysis of the 30-80 MHz band is supported by the Altera FFT IP Core. The NIOS adjusts the new coefficients so that the poles of the filter lie inside the unit circle in the complex plane (a condition of stability), and it also tunes the width of the notch. The practical implementation was tested in the laboratory with signal and pattern generators as well as with the LPDA antenna with LNA, the set used in real AERA radio stations.
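As an illustration of the stability condition mentioned in this abstract, here is a minimal Python sketch of a second-order IIR notch whose poles sit at radius r inside the unit circle. It is not the NIOS firmware: the 200 MHz sampling rate, the 55 MHz carrier and the function names are assumptions made for the example only.

```python
import cmath
import math


def notch_coefficients(f_notch_hz, f_sample_hz, r):
    """Second-order notch: zeros on the unit circle, poles at radius r.

    r must be below 1 for stability; r closer to 1 gives a narrower notch.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("pole radius must satisfy 0 <= r < 1")
    w0 = 2.0 * math.pi * f_notch_hz / f_sample_hz
    b = [1.0, -2.0 * math.cos(w0), 1.0]          # feed-forward part
    a = [1.0, -2.0 * r * math.cos(w0), r * r]    # feedback part
    return b, a


def pole_radius(a):
    """Largest pole magnitude of 1 + a[1] z^-1 + a[2] z^-2."""
    root = cmath.sqrt(complex(a[1] * a[1] - 4.0 * a[2]))
    return max(abs((-a[1] + root) / 2.0), abs((-a[1] - root) / 2.0))


def iir_filter(b, a, samples):
    """Direct-form I biquad applied to a list of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


if __name__ == "__main__":
    fs, f_rfi = 200e6, 55e6                      # assumed values
    b, a = notch_coefficients(f_rfi, fs, r=0.98)
    tone = [math.sin(2 * math.pi * f_rfi / fs * n) for n in range(4000)]
    residual = max(abs(v) for v in iir_filter(b, a, tone)[-500:])
    print("pole radius:", pole_radius(a), "residual carrier:", residual)
```

Checking `pole_radius(a) < 1` before loading a new coefficient set is one simple way to reject an unstable filter.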
        Speaker: Zbigniew Szadkowski (University of Lodz)
      • 15:00
        An Extensible Induced Position Encoding Readout Method for Micro-pattern Gas Detectors 1h 30m
        The large number of electronics channels required has become an obstacle to further applications of Micro-pattern Gas Detectors (MPGDs), and poses a big challenge for integration, power consumption, cooling and cost. The induced position encoding readout technique provides an attractive way to significantly reduce the number of readout channels. In this paper, we present an extensible induced position encoding readout method for MPGDs. The method is demonstrated using the Eulerian path of graph theory. A standard encoding rule is provided, and a general formula for encoding and decoding with n channels is derived. Based on this method, a one-dimensional induced position encoding readout prototype board is designed on a 5×5 cm² Thick Gas Electron Multiplier (THGEM), where 47 anode strips are read out by 15 encoded multiplexing channels. Verification tests are carried out with an 8 keV Cu X-ray source and a 100 μm slit. The test results demonstrate the feasibility of the method and show good spatial resolution and linearity in its position response. The method can dramatically reduce the number of readout channels, has the potential to build large-area detectors, and can be easily adapted to other detectors similar to MPGDs.
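The abstract does not spell out the encoding rule, so the following Python sketch shows only one plausible reading of the Eulerian-path idea. Each strip is wired to one channel so that every pair of adjacent strips maps to a distinct unordered channel pair, and a hit shared by two neighbouring strips can then be decoded from the two channels that fire. The function names and the use of Hierholzer's algorithm on the complete graph are assumptions of this sketch.

```python
def eulerian_circuit_complete_graph(n_channels):
    """Hierholzer's algorithm on the complete graph K_n (n odd, n >= 3).

    Returns a vertex sequence that uses every channel pair exactly once.
    """
    if n_channels < 3 or n_channels % 2 == 0:
        raise ValueError("an Eulerian circuit on K_n needs odd n >= 3")
    unused = {v: set(range(n_channels)) - {v} for v in range(n_channels)}
    stack, circuit = [0], []
    while stack:
        v = stack[-1]
        if unused[v]:
            u = min(unused[v])        # deterministic choice of next edge
            unused[v].discard(u)
            unused[u].discard(v)
            stack.append(u)
        else:
            circuit.append(stack.pop())
    return circuit


def build_encoding(n_strips, n_channels):
    """Assign a channel to each strip and build the pair -> position table."""
    circuit = eulerian_circuit_complete_graph(n_channels)
    if n_strips > len(circuit):
        raise ValueError("too many strips for this number of channels")
    strip_to_channel = circuit[:n_strips]
    pair_to_position = {
        frozenset(pair): i
        for i, pair in enumerate(zip(strip_to_channel, strip_to_channel[1:]))
    }
    return strip_to_channel, pair_to_position


def decode(pair_to_position, channel_a, channel_b):
    """Index i such that the hit lies between strips i and i + 1, or None."""
    return pair_to_position.get(frozenset((channel_a, channel_b)))


if __name__ == "__main__":
    strips, table = build_encoding(n_strips=47, n_channels=15)
    print(strips)
    print(decode(table, strips[10], strips[11]))
```

With 15 channels the full circuit contains 15·14/2 = 105 distinct pairs, so up to 106 strips could be encoded this way. A shorter prefix, such as the 47 strips used here, is not guaranteed to use every channel.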
        Speakers: Mr Guangyuan YUAN (State Key Laboratory of Particle Detection and Electronics, University of Science and Technology of China, Hefei 230026, China;Department of Modern Physics, University of Science and Technology of China, Hefei 230026, China), Mr Siyuan MA (State Key Laboratory of Particle Detection and Electronics, University of Science and Technology of China, Hefei 230026, China;Department of Modern Physics, University of Science and Technology of China, Hefei 230026, China)
      • 15:00
        An I/O Controller for Real Time Distributed Tasks in Particle Accelerators 1h 30m
        SPES is a second generation ISOL radioactive ion beam facility under construction at the INFN National Laboratories of Legnaro (LNL). Its distributed control system embeds custom control in almost all instruments or clusters of homogeneous devices. Nevertheless, standardization is an important issue that concerns modularity and long-term maintainability for a facility that has a life span of at least twenty years. In this context, the research project presented in this paper focuses on the design of a custom IOC (Input Output Controller) which acts as a local intelligent node in the distributed control network and is generic enough to perform several different tasks, ranging from security and surveillance operations to beam diagnostics, data acquisition and data logging, real-time processing and trigger generation. The IOC exploits the COM (Computer On Module) Express standard, which is available in different form factors and with different processors, fulfilling the computational power requirements of varied applications. The Intel x86-64 architecture makes software development straightforward and eases portability. The result is a custom motherboard with several application-specific features and generic PC functionalities. The design is modular to a certain extent, thanks to a hardware abstraction layer, and allows the development of soft and hard real-time applications by means of a real-time operating system and of an on-board FPGA closely coupled to the CPU. Three PCIe slots, an FPGA Mezzanine Card (FMC) connector and several general-purpose digital/analog inputs/outputs enable functionality extensions. An optical fiber link connected to the FPGA is a high-speed interface for high-throughput data acquisition or timing-sensitive applications. The power distribution complies with the AT standard and the whole board can be supplied via Power over Ethernet (PoE+, IEEE 802.3at). Networking and device-to-cloud connectivity are guaranteed via a Gigabit Ethernet link. 
The design, performance of the prototypes and intended usage will be presented.
        Speakers: Dr Davide Pedretti (Universita e INFN, Legnaro (IT)), Dr Stefano Pavinato (INFN - National Institute for Nuclear Physics)
      • 15:00
        An α/γ Discrimination Method for BaF2 Detector by FPGA-based Linear Neural Network 1h 30m
        A pulse shape discrimination (PSD) method based on a linear neural network is proposed for the separation of α- and γ-induced events in a BaF2 crystal.
        An artificial linear neural network was designed to identify α- and γ-induced events in the BaF2 crystal from several pulse features, including pulse pedestal, amplitude, gradient, long/short amplitude integral, and the number of samples over threshold. The neural network output, a two-element vector (Y1, Y2), indicates which of three types the input pulse belongs to: α-induced, γ-induced or noise. The linear neural network is trained in Matlab using 40000 BaF2 detector pulses, and the desired optimality is achieved. We then implement the linear neural network in a Spartan-6 FPGA using the weight matrix of the neurons.
        We build a signal digitization and real-time discrimination system based on this method. A 1 GSPS ADC is used for BaF2 detector signal sampling. Once triggered, a 2K-sample pulse sequence is sent to a data processing module, and the pulse features are extracted in the FPGA as the inputs of the linear neural network. The output vector is then transferred to a PC, and a graphical analysis shows that the α-induced events, γ-induced events and noise are well separated.
        To evaluate the performance of this system, a coincidence test is performed. We use a LaBr3 detector as another input of the digitization system, and a collimated 22Na γ-source is placed between the BaF2 and LaBr3 crystals. Owing to the two-photon radiation of 22Na, self-triggered events in the BaF2 detector in coincidence with γ-rays in the LaBr3 crystal should also be γ-induced events, apart from accidental coincidences. Among 5000 two-detector coincidence events, 28 BaF2 events are identified as α-induced by the system; the false coverage rate is below 0.5% considering the chance of accidental coincidence. The test verifies the effectiveness and feasibility of this system.
        Speaker: Mr Chenfei Yang (1. State Key Laboratory of Particle Detection and Electronics, University of Science and Technology of China, Hefei 230026, China; 2. Department of Modern Physics, University of Science and Technology of China, Hefei 230026, China)
      • 15:00
        Assessment of General Purpose GPU systems in real-time control 1h 30m
        Recent advances in GPU technology offer great prospects in computation. Originally developed for graphical applications, general purpose GPUs (GPGPUs) have been extensively used for massively parallel computation. The penetration of GPU technology in real-time control has been somewhat limited for two main reasons: 1) control algorithms for real-time applications involving highly parallel computation are not very common in practice; 2) the excellent computational performance of GPUs comes at the price of a penalty in memory transfer. As a consequence, GPU applications for real-time control often suffer from unacceptable latency. There are nevertheless some real-time applications in fusion research that may benefit from GPGPUs, such as state-space-based control with a very large number of states. In the typical real-time control cycle (data sample acquisition, elaboration, transfer of the resulting data to actuators), the high computational throughput of GPUs is counterbalanced by poor memory-transfer performance, which increases latency. A precise assessment of latency versus throughput is therefore very useful when designing real-time control systems for potentially parallel applications, especially when choosing between GPUs and multi-threaded CPU applications. We designed a code (for GPU and CPU) to test latency and jitter. The test load consisted of dense matrix-vector multiplications and memory transfers, in order to mimic a large state-space-based control algorithm. We compared the results to see where GPU computation excels and where it falls behind, in order to give useful hints to designers facing the choice between a multi-threaded, multicore CPU application and a GPGPU.
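The measurement idea can be sketched on the CPU side in a few lines of Python. This is not the authors' benchmark: it only shows how a per-cycle latency and its jitter might be collected for a dense matrix-vector load of the form x <- A x + u. The matrix values, the function names and the choice of statistics are assumptions of the sketch.

```python
import statistics
import time


def matvec(matrix, vector):
    """Dense matrix-vector product, the kind of test load named above."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def measure_cycle_latency(n_states, n_cycles):
    """Time n_cycles of x <- A x + u and summarise latency and jitter."""
    # Hypothetical stable system: 0.5 on the diagonal plus weak coupling.
    a = [[0.5 if i == j else 0.01 / n_states for j in range(n_states)]
         for i in range(n_states)]
    x = [0.0] * n_states
    u = [1.0] * n_states
    latencies_us = []
    for _ in range(n_cycles):
        start = time.perf_counter_ns()
        ax = matvec(a, x)
        x = [p + q for p, q in zip(ax, u)]
        latencies_us.append((time.perf_counter_ns() - start) / 1000.0)
    return {
        "cycles": len(latencies_us),
        "mean_us": statistics.mean(latencies_us),
        "jitter_std_us": statistics.pstdev(latencies_us),
        "min_us": min(latencies_us),
        "max_us": max(latencies_us),
    }


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, measure_cycle_latency(n, n_cycles=200))
```

A GPU variant would add the host-to-device and device-to-host transfers inside the timed region, since the abstract identifies those transfers as the main latency penalty.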
        Speaker: Tautvydas Jeronimas Maceina (Consorzio RFX)
      • 15:00
        Automated Testing of MicroTCA.4 Modules 1h 30m
        The Low Level Radio Frequency (LLRF) control system of the European X-Ray Free Electron Laser is designed using the MicroTCA.4 standard. The real-time control system is composed of a few Advanced Mezzanine Cards (AMCs): timing, digitizer, digital controller and vector modulator modules. The DAMC-TCK7 digital controller module was developed as a high-performance low-latency data processing device. The DAMC-TCK7 card, based on a Xilinx Kintex 7 FPGA device, provides all resources required to implement the digital LLRF controller, i.e. processing power, DRAM memory, flexible timing distribution and a Rear Transition Module (RTM) controller. The module is equipped with various high-speed serial interfaces available on the AMC, Zone 3 connectors and front panel, each capable of transferring data at up to 12.5 Gbps. More than 60 DAMC-TCK7 modules were fabricated for the XFEL accelerator. The manufactured modules must be carefully tested before they are installed in the XFEL accelerator tunnel. A dedicated framework for automated testing of the DAMC-TCK7 modules was developed to simplify and accelerate the test procedure. This paper presents details of the automated test framework design and the results of testing 60 DAMC-TCK7 modules. The framework is composed of FPGA firmware, a Linux driver and software that allow testing of all key components of the digital AMC module, such as the power supply, FPGA, memory, clock distribution, high-speed interfaces, IPMI controller and its sensors. The framework uses the MMC (Module Management Controller) to verify the proper operation of the power supply module and of AMC and RTM management. The other modules are tested using the FPGA with dedicated firmware. As a result, a report in PDF format is generated.
        Speaker: Dr Dariusz Makowski (Lodz University of Technology, Department of Microelectronics and Computer Science)
      • 15:00
        Automation and Control of a plasma experiment using EPICS 1h 30m
        The IST-BPlasma device is a compact setup used for the execution of plasma physics experimental protocols. It creates a low-temperature plasma resulting from the interaction of an electron beam with noble gases at low pressure and is equipped with Radio-Frequency (RF) based diagnostics, namely (i) a resonant cavity and (ii) electrostatic probes. These allow the plasma density to be measured and the propagation of electrostatic waves to be studied. However, its control system could not maintain stable operating parameters, such as gas pressure or accurate probe positioning, throughout the experiments, and the output from the diagnostics was not recorded in digital data files because the acquisition system was fully analog. Therefore, a set of two dsPICnode V3.0 boards, fitted with Microchip dsPIC30F microcontrollers, was selected, and their firmware and expansion cards were developed for this specific purpose. The board used for hardware control was connected to actuators and sensors installed on the setup, including valves, pressure gauges via RS-485, vacuum pumps and a position encoder, among others. The firmware also includes a gas injection PID algorithm. The second board was used to acquire the output signals of the RF diagnostics crystal detectors. Both boards were connected via RS-232 to a local host computer running an EPICS I/O Controller (IOC). The interface was achieved using the StreamDevice generic device support for EPICS, along with a communication protocol to send configuration commands and receive operation and diagnostics data. Finally, a Graphical User Interface (GUI) was developed using Control System Studio (CS-Studio), giving the operator supervisory control and live access to operation parameters as well as to the diagnostics data. The client application ran on a separate computer and was connected to the local host IOC using EPICS ChannelAccess. 
This approach presented operation flexibility since it allowed both local and remote access. Operation trials conducted showed that by using the presented solution it was possible to control the fundamental parameters of the apparatus and correctly retrieve experimental data. Moreover, this configuration created the possibility of introducing additional features in future modifications. The designed control and acquisition solution improved reliability, reproducibility of experimental conditions and user experience. Given the successful implementation of this solution, it is foreseeable that it can be easily ported and implemented in other similar devices.
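The gas injection PID loop mentioned in this abstract is not described in detail, so the following Python sketch shows only a generic discrete PID controller with output clamping, driving a toy first-order pressure model. The gains, the valve limits, the plant model and all names are assumptions for illustration and do not reflect the dsPIC firmware.

```python
class PID:
    """Discrete PID with output clamping and conditional integration."""

    def __init__(self, kp, ki, kd, dt, out_min, out_max):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured):
        error = setpoint - measured
        if self.prev_error is None:
            derivative = 0.0          # no derivative kick on the first call
        else:
            derivative = (error - self.prev_error) / self.dt
        candidate = self.integral + error * self.dt
        output = self.kp * error + self.ki * candidate + self.kd * derivative
        # Anti-windup: keep the new integral only if the output is in range.
        if self.out_min <= output <= self.out_max:
            self.integral = candidate
        self.prev_error = error
        return min(max(output, self.out_min), self.out_max)


def simulate_pressure(pid, setpoint, steps, tau=1.0):
    """Toy plant: dp/dt = valve - p / tau, integrated with Euler steps."""
    pressure = 0.0
    for _ in range(steps):
        valve = pid.update(setpoint, pressure)
        pressure += pid.dt * (valve - pressure / tau)
    return pressure


if __name__ == "__main__":
    controller = PID(kp=2.0, ki=1.0, kd=0.0, dt=0.1, out_min=0.0, out_max=10.0)
    print(simulate_pressure(controller, setpoint=1.0, steps=500))
```

In this toy setup the integral term removes the steady-state error, so the simulated pressure settles at the setpoint.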
        Speaker: Mr Pedro Lourenço (Instituto de Plasmas e Fusão Nuclear)
      • 15:00
        Beam Test Performance of the Prototype Trigger-less Data Acquisition for the PANDA Experiment 1h 30m
        We present the first FPGA-based version of a Prototype Trigger-less Data Acquisition (PTDAQ) for the PANDA experiment. The PANDA experiment will operate in a trigger-less environment at a 20 MHz interaction rate, with a peak rate of up to 50 MHz, and a design luminosity of 2 x 10³² cm⁻² s⁻¹. The event size will be a few kB, producing data rates of several hundred GB/s. A reduction of up to three orders of magnitude will be accomplished by event filtering based on a full reconstruction of the events in real time, including tracking and particle identification. An additional complication arises from overlapping events occurring at high event rates. Thus, event assembly is a highly non-trivial process and is accomplished by combining data packets from the freely streaming subsystems using precision time stamps. The PTDAQ system consists of xTCA-based FPGA Processor (xFP) cards, equipped with a Xilinx Virtex-5 FPGA and 4 GB DDR2 RAM. The system is scalable. The xFP cards are hosted either by a microTCA shelf or by an AdvancedTCA carrier board. This hardware platform is a development of IHEP Beijing in cooperation with JLU Giessen. Featuring functionality similar to that of the final DAQ of PANDA, the PTDAQ receives data from freely streaming front-end electronics (FEE) synchronized using the Synchronization of Data Acquisition Network (SODANET). We have implemented data input and output interfaces and basic features such as zero suppression and low-level event building in VHDL. For Gigabit Ethernet I/O we use FPGA implementations of the TCP and UDP protocols from collaborating institutes. A first in-beam test at the MAMI facility, reading out a PANDA electromagnetic calorimeter prototype as well as the Glasgow Tagged Photon Spectrometer, was performed in November 2015. This is the first test of the full DAQ chain including SODANET and the PTDAQ. In this contribution we present the overall architecture of the PTDAQ system as well as results from the test.
        Speaker: Wolfgang Kühn (JLU Giessen)
      • 15:00
        Benchmarking message queue libraries and network technologies to transport large data volume in the ALICE O2 system 1h 30m
        ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN LHC (Large Hadron Collider). ALICE has been successfully collecting Run 2 physics data since spring 2015. In parallel, preparations are being made for a major upgrade, called O2 (Online-Offline) and scheduled for the Long Shutdown 2 in 2019-2020. One of the major requirements is the capacity to transport data between the so-called FLPs (First Level Processors), equipped with readout cards, and the EPNs (Event Processing Nodes), which perform data aggregation, frame building and partial reconstruction. It is foreseen to have 268 FLPs dispatching data to 1500 EPNs with an average output of 20 Gb/s each. Overall, the O2 processing system will operate at terabits per second of throughput while handling millions of concurrent connections. To meet these requirements, the software and hardware layers of the new system need to be fully evaluated. In order to achieve a high performance-to-cost ratio, three networking technologies (Ethernet, InfiniBand and Omni-Path) were benchmarked on Intel and IBM platforms. The core of the new transport layer will be based on a message queue library that supports push-pull and request-reply communication patterns and multipart messages. ZeroMQ and nanomsg are being evaluated as candidates and were tested in detail over the selected network technologies. This paper describes the benchmark programs and setups that were used during the tests, the significance of tuned kernel parameters, the configuration of the network driver and the tuning of multi-core, multi-CPU and NUMA (Non-Uniform Memory Access) architectures. It presents, compares and comments on the final results. Finally, it indicates the most efficient pairing of network technology and message queue library and provides an evaluation of the CPU and memory resources needed to handle the foreseen traffic.
        Speaker: Adam Tadeusz Wegrzynek (Warsaw University of Technology (PL))
      • 15:00
        Brain Emulation for Image Processing 1h 30m
        We present an innovative and high performance embedded system for real-time pattern matching. The design uses Field Programmable Gate Arrays (FPGAs) and the powerful Associative Memory chip (an ASIC) to achieve real-time performance. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain.
        Speaker: Pierluigi Luciano (UNICLAM and INFN)
      • 15:00
        Charged particle track reconstruction in CMS using fast algorithms implemented in hardware: an overview of the proposed implementations to be used for the HL-LHC and the current efforts to demonstrate their operation 1h 30m
        The CMS detector will be upgraded in preparation for the high luminosity operation of the Large Hadron Collider, which is due to start in 2025. The CMS collaboration plans to reconstruct the trajectories of charged particles produced in the LHC collisions, using advanced electronics processing data from its tracking detector. The resulting tracks must be available within a few microseconds, such that they can be used as input to the level-1 trigger. This talk introduces the proposed system implementations and the current efforts to demonstrate charged particle tracking in hardware.
        Speaker: Kristian Hahn (Northwestern University (US))
      • 15:00
        Concentrator for the Readout of the PANDA Micro Vertex Detector based on MicroTCA 1h 30m
        The Micro Vertex Detector (MVD) will be used as the central tracking detector in the PANDA (AntiProton Annihilation at Darmstadt) detector system, which is under development for the future accelerator facility FAIR in Darmstadt, Germany. The design of the MVD is based on silicon strip detectors in the outer layers and on silicon pixel detectors in the inner layers. Data from the readout ASICs in the front end will be sent via GBT optical links to a multiplexing layer that aggregates them onto 10 Gbit/s optical uplinks to the Level-1 Trigger network. The multiplexing layer will be based on MTCA.4 using the HGF-AMC, a versatile MTCA.4 module developed by DESY in cooperation with KIT. In order to extend the multiplexing capabilities of the HGF-AMC, a Rear Transition Module (RTM) with 8 optical links has been designed.
        Speaker: Mr Harald Kleines (Forschungszentrum Juelich)
      • 15:00
        Control system optimization techniques for real-time applications in fusion plasmas: the RFX-mod experience 1h 30m
        Magnetic confinement of fusion relevant plasmas is the target of many devices that are nowadays working towards the achievement of electricity from controlled fusion reactions. Such plasmas constitute a particularly harsh nuclear environment in which violent instabilities can arise [1], causing confinement losses and possible damage to structural materials. Effective control of such instabilities is therefore compulsory for controlled fusion experiments, with active control playing an important role. The RFX-mod experiment is a medium size (R = 2 m, a = 0.459 m) toroidal device that has been operating since 2004. It is equipped with a state-of-the-art system for active control of magneto-hydro-dynamic (MHD) instabilities. Such a system, operating with a cycle time T = 200 μs, is composed of 192 independently fed actuators (saddle coils) and over 600 inputs (magnetic sensors) [2]. The high degree of flexibility of the control system allows virtually switching on or off each single coil, thus different control schemes can be easily implemented [3]. In order to improve the efficiency and effectiveness of the active control system, a series of efforts have been made in optimizing the produced magnetic fields, for example by minimizing the harmonic distortion due to the toroidal geometry [4], or adapting the control scheme to real experimental conditions, such as the failure of single coils [5]. The techniques used to achieve these results, which will be illustrated in the present work, have been tested both in simulations and in experiments. A dynamic simulator [6] has been developed for the purpose of testing optimization strategies. It consists of a detailed three-dimensional description of the conducting structures coupled to a two-dimensional plasma magnetohydrodynamic (MHD) model (resulting in the CarMa code [7]) and integrated by a complete representation of the real time control system. 
The implementation of simple, linear algebra based, real time optimization methods will be described along with proof of the sought beneficial effects. The work focuses on a spurious harmonics reduction technique based on the decoupling of sensors and actuators; a description of its derivation will be given together with its implementation in the control loop. The similar procedure for the compensation of faulty actuators will also be mentioned. [1] Freidberg, J. P. Ideal magnetohydrodynamics. New York London: Plenum Press (1987). [2] P. Sonato et al, Fusion Eng. Des. 66–68, 161 (2003) [3] M. Baruzzo et al, Nucl. Fusion 52, 103001 (2012) [4] L. Pigatto et al, 41st EPS Conference on Plasma Physics, P5.080, Berlin (2014) [5] L. Pigatto, et al., Fusion Eng. Des. 96-97 (2015) 690-693 [6] G. Marchiori et al, Nucl. Fusion 52, 023020 (2012) [7] R. Albanese et al, IEEE Trans. on Mag. 44, 1654 (2008)
        Speaker: Mr Leonardo Pigatto (Consorzio RFX, Corso Stati Uniti, 4 35127, Padova, Italy)
      • 15:00
        Data Acquisition and Protection System for a Multi-MHz Neutron Detector 1h 30m
        On the "KWS2" small angle scattering instrument at the "FRM-2" neutron source at Garching, Germany a new 3He neutron detector was installed and commissioned in 2015. It is built of 18 "8-pack" modules from GE Power / Reuter-Stokes. Each of these modules has its own data acquisition and slow control processor, using only Gigabit Ethernet as connection to the outside world. We show how data acquisition, time synchronization and interaction with the slow control system are laid out, and some first results and performance data.
        Speaker: Matthias Drochner (FZJ)
      • 15:00
        DCT trigger in a high-resolution test platform for a detection of very inclined showers in the Pierre Auger surface detectors 1h 30m
        This paper presents the first results from the trigger based on the Discrete Cosine Transform (DCT) operating in the new Front-End Boards with a Cyclone V E FPGA deployed in 7 test surface detectors in the Pierre Auger Engineering Array. The patterns of the ADC traces generated by very inclined showers were obtained from the Auger database and from the CORSIKA simulation package, followed by the Auger OffLine reconstruction platform, which gives predicted digitized signal profiles. Simulations for many variants of the initial shower angle, initialization depth in the atmosphere, particle type and initial energy gave the boundaries of the DCT coefficients, which are then used for on-line pattern recognition in the FPGA. Preliminary results confirm that the approach is sound. We registered several showers triggered by the DCT at 120 MSps and 160 MSps. Very inclined showers generated by hadrons and starting their development early in the atmosphere produce a relatively thin muon pancake (~1 m thickness) at detection level. Ultra-relativistic charged particles traversing the water in a surface detector generate Cherenkov light, which is then detected in photomultipliers (PMTs). Direct light gives a peak with a very short rise time and fast exponential attenuation. The DCT trigger allows recognition of ADC traces with specific shapes. The standard trigger requires 3-fold coincidences in a single time bin. The present sampling frequency in the surface detectors is 40 MHz. The new Front-End Board developed for the Auger-Beyond-2015 surface detector upgrade allows sampling at up to 250 MSps (120 MSps and 160 MSps were used in tests). Neutrinos can generate showers starting their development deep in the atmosphere, known as "young" showers. They contain a significant electromagnetic component, usually preceded by a muon bump. Simulations show that the muon bump is often fully separated from the EM fraction, so the 16-point DCT algorithm can also be used. 
The probability of 3-fold coincidences of direct light corresponding to a standard Auger trigger is relatively low. 2-fold coincidences of direct light are much more probable. The third PMT is then hit by reflected light, but with some delay. With fast sampling (120-160 MSps) this delay puts the signal in later time bins. The standard T1 trigger therefore no longer gives a sufficient rate for horizontal and very inclined showers: the rate drops below an acceptable level. We had to modify the T1 trigger to recover approximately the standard trigger rate.
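The shape-recognition idea can be illustrated with a short Python sketch. It slides a 16-point DCT over an ADC trace and fires when the coefficients, scaled by the DC term, fall inside a set of bounds. The DCT-II convention, the number of coefficients checked, the tolerance, the `min_dc` cut and the reference pulse are all assumptions of this sketch; in the work described above the bounds come from Auger data and simulations.

```python
import math

N = 16  # window length of the 16-point DCT named in the abstract

# Precomputed DCT-II cosine table (unnormalised).
_COS = [[math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
        for k in range(N)]


def dct16(window):
    """Unnormalised 16-point DCT-II of one window of samples."""
    if len(window) != N:
        raise ValueError("window must contain exactly 16 samples")
    return [sum(c * x for c, x in zip(row, window)) for row in _COS]


def bounds_from_reference(pulse, n_coeffs, tolerance):
    """Hypothetical bounds: reference ratios X_k / X_0 plus or minus tolerance."""
    ref = dct16(pulse)
    ratios = [c / ref[0] for c in ref[1:n_coeffs + 1]]
    return [r - tolerance for r in ratios], [r + tolerance for r in ratios]


def dct_trigger(trace, lower, upper, min_dc=10.0):
    """First window start whose scaled coefficients fit the bounds, or None."""
    for start in range(len(trace) - N + 1):
        coeffs = dct16(trace[start:start + N])
        dc = coeffs[0]
        if dc < min_dc:
            continue                  # too little signal in this window
        scaled = [c / dc for c in coeffs[1:len(lower) + 1]]
        if all(lo <= s <= hi for s, lo, hi in zip(scaled, lower, upper)):
            return start
    return None


if __name__ == "__main__":
    # Toy pulse: sharp rise followed by exponential attenuation.
    pulse = [100.0 * math.exp(-n / 4.0) for n in range(N)]
    lower, upper = bounds_from_reference(pulse, n_coeffs=4, tolerance=0.02)
    trace = [0.0] * 20 + pulse + [0.0] * 20
    print(dct_trigger(trace, lower, upper))
```

Scaling by the DC coefficient makes the test depend on pulse shape rather than amplitude, which is the property a shape-based trigger needs.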
        Speaker: Zbigniew Szadkowski (University of Lodz)
      • 15:00
        Design and development of a real-time readout electronics system to retrieve data from a square multi-anode photomultiplier tube for neutron gamma pulse shape discrimination 1h 30m
        Pulse Shape Discrimination (PSD) algorithms can reliably separate neutrons and gamma-ray photons interacting in a scintillation detector. When implemented in the digital domain, PSD algorithms allow real-time discrimination of neutron sources from gamma sources. This paper presents the design of a readout electronics system to retrieve data from a multi-anode photomultiplier tube (MAPMT) for a scintillator-based coded-aperture neutron imager. The scintillator was coupled to a Hamamatsu H9500, a square MAPMT, where each anode of the MAPMT was linked to a resistor network to infer the position of incidence of radiation within the scintillant. Additionally, the resistor network output signals were filtered through a novel noise reduction circuit to preserve the data corresponding to each pulse. Localised pulses were digitised using a 12-bit 500 MSPS Analogue to Digital Converter (ADC). Sampled signals were temporarily stored in a local ping-pong buffer before being processed by the customised application developed on a field programmable gate array (FPGA). Initial results suggest that the high ADC sampling rate provides sufficient information for neutron gamma source discrimination using PSD. Parallel real-time signal processing, implemented on the FPGA, enables multi-channel operation to generate an array of interactions within the scintillator in terms of gamma rays and neutrons.
        Speaker: Mr Michal Cieslak (Lancaster University)
      • 15:00
        Design and evaluation of a FPGA online feature extraction data pre-processing stage for the CBM-TRD experiment 1h 30m
        Feature extraction is a data pre-processing stage of the proposed data-acquisition chain (DAQ) for the CBM-TRD experiment at FAIR, aiming to deliver event-filtered and bandwidth-reduced data to the First Level Event Selection (FLES). A data rate of about 1 TB/s and a high event rate of approximately 100 kHz are expected for the final experiment. The TRD detector of the CBM experiment will be composed of about 24,000 SPADIC 1.0 chips. The SPADIC 1.0 can deliver full time-bin signals plus useful metadata. The presented firmware implements multiple algorithms in order to find and extract regions of interest within time-bin signals. Algorithms such as peak finding, charge integration, center of gravity and time over threshold were implemented for online analysis. In addition, a local clustering algorithm makes it possible to find cluster members and to implement further data reduction algorithms. Feature extraction is a common problem in data acquisition of high-energy particle experiments. However, hardware description language (HDL) based designs tend to be written to solve very specific problems in the data-acquisition chain. Constraints such as data format, front-end electronics and data containers make reusing and maintaining HDL designs a difficult task for a firmware designer. To address this problem, a previously developed feature extraction framework has been used to generate an alternative FPGA firmware that implements processing algorithms similar to those of the original hand-written VHDL design. This framework allows the creation of FPGA firmware without the need to write HDL code such as VHDL or Verilog. This is achieved by using a domain-specific language (DSL) with which the designer can focus on the mathematical operations to be applied to the time-bin signals while leaving the low-level code generation to the DSL compiler. 
The results presented in this work compare and analyze the design architecture of the hand-written FPGA firmware against the firmware generated by the feature extraction framework. Furthermore, performance results of the feature extraction stage used in the TRD data-acquisition chain during a beam-test campaign performed at the CERN-SPS hall in 2015 will be presented and discussed.
        Speaker: Cruz De Jesus Garcia Chavez (Johann-Wolfgang-Goethe Univ. (DE))
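The four features named in the abstract above (peak finding, charge integration, center of gravity and time-over-threshold) can be illustrated in software. The following is only a minimal sketch, assuming a baseline-subtracted signal given as a list of samples, one per time bin; the function name, the dictionary keys and the threshold convention are hypothetical and do not reproduce the authors' firmware or DSL.

```python
from typing import Dict, List


def extract_features(samples: List[float], threshold: float) -> Dict[str, float]:
    """Compute simple features of one baseline-subtracted time-bin signal.

    Hypothetical software sketch; the real design runs in FPGA firmware.
    """
    if not samples:
        raise ValueError("empty signal")
    # Peak finding: time bin holding the largest sample.
    peak_bin = max(range(len(samples)), key=lambda i: samples[i])
    # Charge integration: plain sum over all time bins.
    charge = sum(samples)
    # Center of gravity: amplitude-weighted mean time bin.
    if charge != 0:
        cog = sum(i * s for i, s in enumerate(samples)) / charge
    else:
        cog = float("nan")
    # Time over threshold: number of time bins strictly above the threshold.
    tot = sum(1 for s in samples if s > threshold)
    return {
        "peak_bin": peak_bin,
        "peak_amplitude": samples[peak_bin],
        "charge": charge,
        "center_of_gravity": cog,
        "time_over_threshold": tot,
    }


if __name__ == "__main__":
    print(extract_features([0, 1, 4, 9, 4, 1, 0, 0], threshold=2.0))
```

Each feature needs only one pass over the samples, which is one reason such operations lend themselves to a streaming hardware implementation.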
      • 15:00
        Design and test of a GBTx based board for the upgrade of the ALICE TOF readout electronics 1h 30m
        During CERN LHC Long Shutdown 2, foreseen in 2018-2019, the ALICE Time of Flight (TOF) readout electronics is going to be upgraded in order to cope with the target trigger rate of 200 kHz with proton-proton collisions and 50 kHz with lead-lead collisions. For this reason, the Digital Readout Module (DRM) board is currently being redesigned with improved features and more up-to-date technologies. As a first step towards the new DRM board, an intermediate test board with all the newer features has been designed, realized and tested. This communication will focus on the design of the test board and the test results obtained. Since the board is going to work in a moderately hostile environment (0.13 krads expected in 10 years of data taking), a Microsemi Igloo2 FPGA has been chosen as the heart of the board. Being Flash memory based, it is basically immune to Single Event Upsets (SEUs) as far as the configuration memory is concerned. For the logic and memory, a triple modular redundancy scheme will be implemented in order to reduce the SEU rate. The GBTx radiation-hard ASIC and a Versatile Link module VTRx (a rad-hard optical transceiver) from CERN will be used to implement the interfaces towards both the Data Acquisition (DAQ) and the trigger system. The GBTx implements a bidirectional 4.8 Gbps link between the detector area where the DRM sits and the counting room, used for sending detector data in the uplink direction and for getting triggers and trigger information in the downlink direction. For configuring the GBTx registers and the front-end cards, the test board also hosts an ARM processor based piggy-back card, which provides a TCP/IP connection to an external computer. In order to fit the existing slow control environment currently in place with the DRM ALICE TOF cards, a CAEN proprietary CONET2 optical link protocol was implemented with an SFP+ transceiver connected to the Igloo2 internal SERDES. 
The configuration of the GBTx registers can also be done via a National Instruments USB controller driving the GBTx I2C port via a dedicated Virtual Instrument. The quality of the GBTx link has been tested in three ways: with an optical loopback, by connecting two different test boards together, and by means of a Xilinx KC705 evaluation board with a Kintex FPGA featuring a GBTx core provided by CERN. The Xilinx board was used to send triggers to the test board, which then sends back predefined data patterns stored in the static RAM. These configurations have been used to evaluate the BER of the GBTx optical connection both in the uplink and in the downlink directions: a BER as low as 10e(-14) has been measured with an 80-meter-long multi-mode optical fiber. The capability to pair a Microsemi FPGA with the GBTx ASIC was extensively tested.
        Speaker: Davide Falchieri (Universita e INFN, Bologna (IT))
      • 15:00
        Design and Testing of the Bunch-by-Bunch Beam Transverse Feedback Electronics for SSRF 1h 30m
        Shanghai Synchrotron Radiation Facility (SSRF) is one of the third-generation high-beam current (3.5 GeV) synchrotron light sources. In the storage ring of SSRF, multi-bunch instabilities would increase beam emittance and energy spread, which degrade beam quality and can even cause beam loss. To address these issues, a Transverse Feedback System is indispensable for SSRF, in which the key component is the bunch-by-bunch transverse feedback electronics. The whole feedback system consists of five main parts: BPM, RF front-end, signal processor, RF amplifier, and vertical/horizontal transverse kickers. This paper focuses on the signal processor, which is the main part of the feedback electronics. The RF front-end takes the signals from the BPM, filters them to a bandwidth below 250 MHz, and sends them to the signal processor. A 12-bit 500 Msps ADC (Analogue to Digital Converter) then samples the signal, and the digitized data are transferred to an FPGA (Field Programmable Gate Array) for Digital Signal Processing (DSP). The input data are first deserialized into four 125 Msps data streams, and then the information of each bunch is extracted from the data stream using shift registers and processed by FIR filters to obtain the feedback coefficient of each beam. This coefficient can also be adjusted with different gains, which can be controlled via a remote PC. To make sure that the kicker acts on the correct bunch, the algorithm also contains a delay function with a step size of 4 ns; combined with the external delay line chips, a fine delay step size of 10 ps and a range of 2 μs can be achieved. The outputs of the FPGA are then converted to analog voltages by a 500 Msps DAC (Digital to Analogue Converter). These voltages are then amplified and used as the input of the kickers to tune the beam into the optimum orbit. We also conducted initial testing of the signal processor to evaluate its performance and function. 
The test results indicate that the ENOB of the Analog-to-Digital conversion circuits is better than 9.5 bits in the frequency range up to 300 MHz, which is good enough for the application. The system also functions as expected.
        Speaker: Mr Jinxin Liu (University of Science and Technology of China)
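The per-bunch processing described in the abstract above (one sample history per bunch, filtered by an FIR filter across turns and scaled by an adjustable gain) can be sketched in software as follows. This is a hypothetical model, not the SSRF firmware: the class name, the number of bunches and the filter taps are assumptions chosen for illustration.

```python
from collections import deque
from typing import Deque, List, Sequence


class BunchByBunchFIR:
    """Independent turn-by-turn FIR filter for every bunch (illustrative model)."""

    def __init__(self, n_bunches: int, taps: Sequence[float], gain: float = 1.0):
        self.taps = list(taps)
        self.gain = gain
        # One history buffer per bunch; the newest sample sits at index 0.
        self.history: List[Deque[float]] = [
            deque([0.0] * len(self.taps), maxlen=len(self.taps))
            for _ in range(n_bunches)
        ]

    def process_turn(self, positions: Sequence[float]) -> List[float]:
        """Take one position sample per bunch and return one kick per bunch."""
        if len(positions) != len(self.history):
            raise ValueError("expected one sample per bunch")
        kicks = []
        for hist, x in zip(self.history, positions):
            # With maxlen set, appendleft drops the oldest sample automatically.
            hist.appendleft(x)
            kicks.append(self.gain * sum(c * v for c, v in zip(self.taps, hist)))
        return kicks


if __name__ == "__main__":
    # A two-tap difference filter rejects the constant (closed-orbit) offset.
    fir = BunchByBunchFIR(n_bunches=2, taps=[1.0, -1.0])
    print(fir.process_turn([1.0, 5.0]))
    print(fir.process_turn([3.0, 5.0]))
```

Keeping a separate history per bunch is what makes the feedback bunch-by-bunch: the kick for a given bunch depends only on that bunch's own previous turns.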
      • 15:00
        Design of a Compact Hough Transform for a new L1 Trigger Primitives Generator for the upgrade of the CMS Drift Tubes muon detector at the HL-LHC 1h 30m
        The operation of the CMS Drift Tubes muon detectors at the High Luminosity LHC will be possible only with an upgrade of the current readout and trigger electronics, which are based on very old technology and exposed to direct radiation, hence particularly sensitive to ageing. The current Trigger Primitives Generator (TPG) is designed around a synchronous device measuring at once the muon track parameters and the parent bunch crossing by means of a time-dependent fitting algorithm. The device is strictly coupled to the readout electronics, sharing the same input signals, and placed in dedicated on-detector crates. The readout will be upgraded, becoming sparsified and asynchronous to benefit from the recent fast developments in telecommunications technology and increasing the bandwidth using fast optical drivers such as the Gigabit Bidirectional Transceiver; therefore the trigger electronics must undergo its own upgrade. We present a proposal for a novel L1 Trigger Primitives Generator for the CMS barrel muon detectors, able to operate on the asynchronous charge collection time measurements done by the new foreseen TDCs. This new L1 TPG is being designed around the implementation in state-of-the-art FPGA devices of an original development, a Compact Hough Transform (CHT) algorithm, which identifies the track segment parameters, combined with a Majority Mean-Timer, used to identify the muon parent bunch crossing. The requirement for the algorithm is to fit inside FPGAs and to have O(2) μs latency for decision taking, with efficiency and resolution equal to or higher than the current ones, given O(10) measured drift times immersed in a non-negligible background. The major challenges are parallelization of the algorithm, fast readout of the CHT parameter matrix, the capability of handling data from a large array of Drift Tubes in a minimal number of FPGAs, and coping with the latency requirements. These issues will be addressed and possible solutions proposed.
        Speaker: Nicola Pozzobon (Universita e INFN, Padova (IT))
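For readers unfamiliar with the technique, the sketch below shows a generic Hough transform for straight track segments: every hit votes for all (slope, intercept) cells compatible with it, and the most populated cell gives the segment parameters. It is not the Compact Hough Transform of the abstract; the parametrization x = x0 + slope * z, the binning and the function name are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def hough_line(hits: Sequence[Tuple[float, float]],
               slopes: Sequence[float],
               x0_bin_width: float) -> Tuple[float, float, int]:
    """Find the straight line x = x0 + slope * z supported by most hits.

    hits is a sequence of (x, z) points. Returns (slope, x0, votes) for the
    fullest accumulator cell. Generic textbook version, not the CHT.
    """
    if not hits or not slopes:
        raise ValueError("need at least one hit and one slope hypothesis")
    accumulator: Counter = Counter()
    for x, z in hits:
        for i, slope in enumerate(slopes):
            # Intercept this hit would imply under the slope hypothesis.
            x0 = x - slope * z
            accumulator[(i, round(x0 / x0_bin_width))] += 1
    (i_best, j_best), votes = max(accumulator.items(), key=lambda kv: kv[1])
    return slopes[i_best], j_best * x0_bin_width, votes


if __name__ == "__main__":
    # Four hits on x = 1.0 + 0.5 * z plus one background hit.
    hits: List[Tuple[float, float]] = [
        (1.0, 0.0), (1.5, 1.0), (2.0, 2.0), (2.5, 3.0), (7.0, 1.0),
    ]
    print(hough_line(hits, [-1.0, -0.5, 0.0, 0.5, 1.0], 0.25))
```

Background hits spread their votes over many cells, so the method tolerates noise; the cost is the size of the parameter matrix, which is the readout challenge the abstract mentions.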
      • 15:00
        Design of a concentrator for CMS trigger upgrade 1h 30m
        The CMS trigger system after the Phase I upgrade will be working with a 10 Gbps line rate asynchronously between modules, but the data output rate from the Muon Endcap RPC link board is 1.6 Gbps, so concentration from 1.6 Gbps to 10 Gbps with a fan-out function is needed. A so-called Concentration Pre-Processing and Fan-out (CPPF) module has been designed for this purpose, with processing and fan-out functionalities. This paper describes the design of this module, which is implemented as a double-width, single-height MTCA-compliant AMC module. The module tests and the joint tests with the RPC-LB and MTF7 will be presented.
        Speakers: Mr Chunjie Wang (IHEP Beijing), Prof. Zhen-An Liu (IHEP Beijing)
      • 15:00
        Design of Adaptive and Fast Readout System Based on Wire Scanner 1h 30m
        A new adaptive and fast readout system has been designed for the front-end signal of a wire scanner, which is used to measure the beam profile and emittance. The system is capable of handling a constantly changing current signal, with a count rate of up to 1000 counts/s and an input range from 1 nA to 1 mA. This new adaptive and fast front-end readout system, applied to beam diagnostics, plays a crucial role in improving the accuracy of beam diagnostics, shortening the time needed for beam adjustment, and improving the efficiency of accelerators. At present, the system is used in the beam diagnostics of Injector II of the accelerator driven sub-critical system (ADS). It can also read out detectors that output wide-range current signals, which are widely used in nuclear physics experiments and accelerator systems.
        Speaker: Mr hong su (Institute of Modern Physics, Chinese Academy of Sciences)
      • 15:00
        Design of the Readout Electronics Prototype for LHAASO WCDA 1h 30m
        The Large High Altitude Air Shower Observatory (LHAASO) is proposed to be built at an altitude of more than 4000 m, and aims at a very high energy gamma source survey above 30 TeV. The Water Cherenkov Detector Array (WCDA) is one of the major components of LHAASO. The WCDA electronics are responsible for the readout of 3600 Photomultiplier Tubes (PMTs), and a total of 400 Front End Electronics (FEE) modules are required. The main challenges in the WCDA readout electronics design include: 1) Precise time and charge measurements are both required over a large dynamic input amplitude range from 1 Photo Electron (P.E.) to 4000 P.E. 2) The 3600 PMTs are scattered over an area of 90000 m², and since high precision time measurement is required, high-quality clock distribution over a long distance is necessary, and automatic clock phase compensation is expected with varying ambient temperature. 3) Due to the requirement of a “triggerless” architecture, all data from the FEEs need to be read out based on 1000 Mbps Ethernet and the TCP/IP standard. To simplify the system architecture considering the large scale of detector node distribution, clock, data, and commands are mixed together and transmitted through the same fiber for each FEE. In this paper, we present the prototype design of the readout electronics for the LHAASO WCDA, and key techniques are discussed. We also conducted tests on the prototype electronics to evaluate the performance. The results indicate that a charge resolution better than 15% at 1 P.E. and 2% at 4000 P.E., and a time resolution better than 0.3 ns RMS, are achieved over the whole dynamic range; the clock phase compensation precision is better than 100 ps in the temperature range from -10 ºC to 60 ºC, beyond the application requirement. Detailed information is included in the attached supporting material document.
        Speaker: Mr Cong Ma (University of Science and Technology of China (USTC))
      • 15:00
        Development of ATLAS Liquid Argon Calorimeter Readout Electronics for the HL-LHC 1h 30m
        The high-luminosity phase of the Large Hadron Collider will provide 5-7 times greater instantaneous and total luminosities than assumed in the original design of the ATLAS Liquid Argon Calorimeters and their readout system. An improved trigger system with a higher acceptance rate of 1 MHz and a longer latency of up to 60 micro-seconds together with a better radiation tolerance require an upgrade of the readout electronics. Concepts for the future readout of the 182,500 calorimeter channels at 40-80 MHz and 16 bit dynamic range, and the development of low-noise, low-power and high-bandwidth electronic components will be presented. These include ASIC developments towards radiation-tolerant low-noise pre-amplifiers, analog-to-digital converters up to 14 bits and low-power optical links providing transfer rates of at least 10 Gb/s per fiber.
        Speaker: Kai Chen (Brookhaven National Laboratory (US))
      • 15:00
        Development of data acquisition and control system (DACS) for long pulse operations of Indian test facility of ITER Diagnostics Neutral Beam 1h 30m
        The Indian Test Facility (INTF) is a negative hydrogen ion based 100 kV, 60 A, 5 Hz modulated NBI system with a 3 s ON/20 s OFF duty cycle. The prime objective of the facility is to characterize the ITER Diagnostic Neutral Beam (DNB) with full specifications, prior to shipment and installation in ITER. The automated and safe operation of the system will require a reliable and rugged instrumentation and control system which provides control, data acquisition (DAQ), safety and interlock functions, referred to as INTF-DACS. The INTF-DACS has been designed based on the ITER CODAC architecture and ITER-PCDH (plant control design handbook) guidelines, with the aim of developing the technical understanding of the CODAC framework to be utilized for the development of the plant system Instrumentation & Control for the DNB. The hardware has been selected from the ITER slow and fast controller catalog. For high speed diagnostics, non-NI high speed digitizers have been selected. In the area of software, CODAC core software for the control application and NI LabVIEW for the DAQ application have been finalized. There are around 300 control channels and 500 acquisition channels, consisting of thermal, optical, current and voltage measurements. The DACS has the mandate to operate INTF for pulse lengths up to 3600 s by integrating 11 different plant systems, including the power supply plant system under a separate controller. The corresponding development poses many technical challenges. The estimated file size of a single experimental pulse is of the order of gigabytes, for which the HDF5 format suggested by ITER has been selected. The timing distribution is another challenge due to the different resolutions required in the fast controller, slow controller and high speed diagnostics in a distributed area. Long pulse data acquisition and monitoring is another challenge. Data exchange across the software platforms, based on EPICS and LabVIEW, is also required for integration. 
Presently the control and data acquisition hardware has been integrated and the development phase has been initiated on the actual hardware platforms. This paper describes the various developmental activities undertaken to solve the technical challenges in the above areas and the integration of the various components of the DACS towards realizing the fully functional INTF DACS.
        Speaker: Mr Himanshu Tyagi (ITER-India,IPR)
      • 15:00
        Development of front-end readout electronics for CsI (Tl) gamma detection array at ETF of CSR 1h 30m
        Front-end readout electronics have been developed for the CsI (Tl) Gamma Array with 1024 large area avalanche photodiodes (APDs) at the External Target Facility (ETF) of the Cooler Storage Ring (CSR) at the Institute of Modern Physics. The full read-out electronics consists of 32 identical analog boards, 8 Acquisition and Control boards (ACBs) and a PXI chassis. On the analog board, the Application Specific Integrated Circuit (ASIC) ATHED (Asic for Time & High Energy Deposit) is used to realize multi-channel energy and time measurements. The ACB implements analog output signal conversion, slow control, generation of fast timing control signals and data acquisition. The read-out of the system is based on PXI data acquisition, which can satisfy the requirements of a high counting rate and a large number of readout channels. The test results show that with a 60Co source an energy resolution of 7.2% has been achieved.
        Speaker: Xinzhe Wang
      • 15:00
        Development of Integrated Response Time Evaluation Methodology for the Plant Protection System 1h 30m
        Studies on setpoint determination methodologies for the plant protection system (PPS) of a nuclear power plant have been actively performed. The objective of determining a trip setpoint for the PPS is to meet the requirement of the analytical limit assumed in the safety analyses for a nuclear power plant. However, the PPS instrumentation channel, which contains a transmitter, a signal conditioning processor, a protection system cabinet, and a final actuator, should also meet the response time requirement assumed in the safety analysis. The response time is another critical factor in ensuring that the PPS satisfies the crucial assumptions of the safety analysis. Research on the response time of the PPS has so far covered only individual components or systems, using either analysis or test methods. Furthermore, although the response time evaluation considers the whole instrumentation channel on the trip signal path, the evaluation tasks, such as analysis and test, have been performed separately. In other words, the response time evaluation for the PPS has not covered the whole design process, which comprises safety analyses, system designs, response time analyses, and response time tests. Additionally, a definite relationship between the results of the analysis and the test has not been considered. In this case, the safety of a nuclear power plant cannot be guaranteed, since the related process variable could exceed the analytical response time (ART) confirmed by the results of the safety analysis. In order to solve the problems regarding the response time evaluation for the PPS, this paper proposes an integrated response time evaluation methodology that ensures the PPS meets the critical requirement of the ART. 
The proposed methodology has been applied to the PPS instrumentation channels for the advanced power reactor 1400 (APR1400) and the optimized power reactor 1000 (OPR1000) to fully verify the satisfaction of the ARTs for the low steam generator trip parameter. The two applications indicate the appropriateness of the proposed methodology regardless of the type and size of the nuclear power plant. The whole design process, covering the safety analysis, the system design, the response time analysis, and the response time test, is addressed in the proposed methodology. The respective outputs of these design stages are the ART, the designed response time (DRT), the estimated response time (ERT), and the measured response time (MRT). The proposed methodology is composed of three steps for evaluating the response time of the PPS. The first, second, and third steps, respectively, are to demonstrate that the DRT is less than the ART, the ERT is less than the DRT, and the MRT is less than the ERT. Since the three steps were sequentially satisfied for the APR1400 and OPR1000, it can be guaranteed that the plant’s process variable does not exceed the safety limit during and after design basis events. Therefore, the safety of a nuclear power plant can be enhanced using the proposed methodology, because the integrated response time evaluation methodology fully guarantees the safety analysis response time.
        Speaker: Dr CHANG JAE LEE (KEPCO E&C)
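The three-step acceptance chain stated in the abstract above (DRT less than ART, ERT less than DRT, MRT less than ERT) can be written down as a small check. This is only an illustrative sketch: the function name and the numerical values are hypothetical and are not taken from the APR1400 or OPR1000 evaluations.

```python
from typing import List


def check_response_time_chain(art: float, drt: float,
                              ert: float, mrt: float) -> List[str]:
    """Return the names of the failed steps; an empty list means all passed.

    art: analytical response time (from the safety analysis)
    drt: designed response time (from the system design)
    ert: estimated response time (from the response time analysis)
    mrt: measured response time (from the response time test)
    All four values must use the same time unit.
    """
    steps = [
        ("DRT < ART", drt < art),
        ("ERT < DRT", ert < drt),
        ("MRT < ERT", mrt < ert),
    ]
    return [name for name, passed in steps if not passed]


if __name__ == "__main__":
    # Hypothetical values in seconds, chosen only to exercise the check.
    print(check_response_time_chain(art=1.2, drt=1.0, ert=0.8, mrt=0.6))
    print(check_response_time_chain(art=1.2, drt=1.0, ert=1.1, mrt=0.6))
```

Because the inequalities are chained, passing all three steps implies that the measured response time is below the analytical response time.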
      • 15:00
        DEVELOPMENT, IMPLEMENTATION AND COMMISSIONING OF DATA ACQUISITION & CONTROL SYSTEM FOR TWIN SOURCE 1h 30m
        Twin Source (TS), an inductively coupled, two-RF-driver-based 180 kW, 1 MHz negative ion source experiment, is set up at IPR, Gandhinagar, with the objective of understanding the physics and technology of multi-driver coupling. The data acquisition and control system (DACS) for TS experiments involves the development of the control core program, control GUI, acquisition program and front-end signal conditioning electronics, as well as testing, implementation and commissioning for safe, reliable and successful operation. The TS control architecture is similar to the ITER CODAC Core system, with some technical features from the ROBIN DACS. The control system consists of the following parts: (i) the master control system (S7 400 PLC), (ii) remote I/O (ET200S) for vacuum and cryo, (iii) remote I/O (ET200M) for the water cooling system and the extraction and acceleration power supplies, and (iv) an S7 300 PLC for RF generator control. Optical PROFINET and PROFIBUS are used between the master control system and the remote I/O stations and the S7 300 PLC, respectively. For the development of the control core program, Siemens Step 7 software is used, whereas CODAC core system 4.0 is used for the SCADA function. For data acquisition, a National Instruments (NI) PXIe system and NI 6259 digitizer cards have been considered, following the ITER fast controller catalogue. The ITER PCDH (plant control design handbook) guidelines are not fully followed in the TS-DACS. In the CODAC core system, the fast acquisition function is not user-friendly. The LabVIEW real-time software has been used for the real-time data acquisition application, though it is not part of the PCDH fast controller catalogue. There are approximately 200 control channels and 152 acquisition channels to perform complete control of the system. All signals coming from sub-systems floating at high potential (~50 kV), such as the ion source, are connected to the TS-DACS through fiber optic (FO) links developed in-house, which provide electrical isolation and better noise immunity. 
The 180 kW RF generator has been commissioned through the TS-DACS. This paper discusses the design, software development, implementation strategy and commissioning of the TS-DACS, along with some operational test results of the subsystems linked with the twin source experiment.
        Speaker: Mr ratnakar kumar yadav (ITER-India)
      • 15:00
        Emulation of a prototype FPGA track finder for the CMS Phase-2 upgrade with the CIDAF emulation framework 1h 30m
        The CMS collaboration is preparing a major upgrade of its detector, so it can operate during the high luminosity run of the LHC (HL-LHC) from 2025. The upgraded tracker electronics will reconstruct the trajectories of charged particles within a latency of a few microseconds, so that they can be used by the level-1 trigger. An emulation framework, CIDAF, has been developed to provide a reference to a proposed FPGA-based implementation of this track finder, which employs a Time-Multiplexed (TM) technique for data processing.
        Speaker: Luigi Calligaris (STFC - Rutherford Appleton Lab. (GB))
      • 15:00
        Enabling real time reconstruction for high resolution SPECT systems 1h 30m
        Single Photon Emission Tomography (SPECT) is mainly limited by the trade-off between spatial resolution and sensitivity. In this context, CdZnTe detectors enable higher spatial resolution compared to previously used scintillators, using sub-pixel positioning and DOI. Consequently, the size of the numerical detector representation increases and SPECT imaging tends to face some of the issues of PET imaging. Detection data for SPECT are currently computed by bins. This approach implies browsing all detection bins allowed by the system, and storing a huge matrix to link every combination of detection parameters to each voxel in the object. With the improvement of spatial resolution, the growth of the detection space has made this method impracticable. The aim of the present study is to propose new approaches to deal with this massive amount of information faster than by binning detection events, using algorithms based on Maximum Likelihood Expectation Maximization (MLEM), which is the most common algorithm in SPECT imaging. The first improvement consists in computing the matrix linking detection events to the object to be imaged on the fly, instead of storing it in a huge matrix for every pair of detector bin and image bin. Moreover, this matrix is sparse; consequently most pairs do not need to be computed, which saves memory and time. Another improvement is achieved by storing input data in list-mode instead of bin-mode. First, it makes the calculation faster, because there are fewer events in a SPECT acquisition than possible combinations of detection parameters. Second, partial updates can be made from groups of acquired events, without waiting for the end of the acquisition. In this way, events go into a pipeline, and it is not necessary to store them, since MLEM works on previous updates and there is no iteration over all the events of the acquisition at the same time. 
Since events can be processed independently, the algorithm is parallelizable and the computation can be performed in real time. Real-time processing makes it possible to dynamically adapt the acquisition parameters, to make the system more suitable for the particular characteristics of the examination and the patient, and thus to improve the result with the same number of detected photons.
        Speaker: Mélanie Bernard
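The two ideas in the abstract above, list-mode input and system-matrix rows computed on the fly, can be illustrated with a toy one-dimensional MLEM reconstruction. The sketch below is not the authors' algorithm: the three-voxel response model in `system_row`, the image size and all names are hypothetical, and for simplicity it iterates over the full event list rather than performing the partial, pipelined updates the abstract describes.

```python
from typing import Dict, List, Sequence

Row = Dict[int, float]  # sparse system-matrix row: voxel index -> weight


def system_row(detector_bin: int, n_voxels: int) -> Row:
    """Sparse response of one detection event, computed on demand.

    Toy model (assumption): an event in bin d sees voxels d-1, d, d+1.
    """
    row: Row = {}
    for j, w in ((detector_bin - 1, 0.25), (detector_bin, 0.5),
                 (detector_bin + 1, 0.25)):
        if 0 <= j < n_voxels:
            row[j] = w
    return row


def sensitivity(n_voxels: int) -> List[float]:
    """Probability that an emission from each voxel is detected at all."""
    sens = [0.0] * n_voxels
    for d in range(n_voxels):  # toy model: as many detector bins as voxels
        for j, w in system_row(d, n_voxels).items():
            sens[j] += w
    return sens


def listmode_mlem(events: Sequence[int], n_voxels: int,
                  n_iter: int = 20) -> List[float]:
    """List-mode MLEM: loop over recorded events, never over empty bins."""
    sens = sensitivity(n_voxels)
    image = [1.0] * n_voxels
    for _ in range(n_iter):
        back = [0.0] * n_voxels
        for d in events:
            row = system_row(d, n_voxels)  # no stored system matrix
            forward = sum(w * image[j] for j, w in row.items())
            if forward > 0.0:
                for j, w in row.items():
                    back[j] += w / forward
        image = [image[j] * back[j] / sens[j] for j in range(n_voxels)]
    return image


if __name__ == "__main__":
    print(listmode_mlem([2] * 10 + [1] * 3, n_voxels=5))
```

The contribution of each event to `back` is independent of the others, which is the property that makes the event loop parallelizable.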
      • 15:00
        Evaluation of 100 Gb/s LAN networks for the LHCb DAQ upgrade 1h 30m
        The LHCb experiment is preparing a major upgrade in 2020, resulting in the need for a high-end network for the data acquisition system. Its capacity will grow up to a target speed of 40 Tb/s, aggregated by 500 nodes. This can only be achieved reasonably by using links capable of coping with 100 Gigabit/s line rates. The constantly increasing need for more bandwidth has initiated the development of several 100 Gigabit/s networks, mostly for the HPC field. There are 3 candidates on the horizon which need to be considered: Intel® Omni-Path, 100G Ethernet and EDR InfiniBand. We present test results with such links, using both standard benchmarks (e.g. iperf) and a custom-built benchmark called LHCB-DAQPIPE. DAQPIPE can emulate various classical event-building protocols (push, pull, barrel-shifter, etc.) on multiple technologies. It is particularly well suited to run on supercomputing sites, which are the only possibility to test systems which have the same size as the required DAQ networks. Such systems are simply not affordable for lab tests. The key benefit of these measurements is that we can gain detailed insight into the behaviour of the system without the need to build a system to scale. This makes it possible to find the limitations of the different network components and how they relate to the protocols.
        Speakers: Balazs Voneki (CERN), Sebastien Valat (CERN)
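Of the event-building protocols listed in the abstract above, the barrel shifter is the easiest to show in a few lines: in each phase every source sends to a different destination, and the assignment rotates by one node per phase. The sketch below shows this generic traffic pattern only; it is not taken from DAQPIPE, and the function name is hypothetical.

```python
from typing import List


def barrel_shifter_schedule(n_nodes: int) -> List[List[int]]:
    """Return schedule[phase][source] = destination for a barrel shifter.

    In phase p, source s sends to node (s + p) mod n_nodes, so no two
    sources target the same destination within a phase.
    """
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return [[(src + phase) % n_nodes for src in range(n_nodes)]
            for phase in range(n_nodes)]


if __name__ == "__main__":
    for phase, destinations in enumerate(barrel_shifter_schedule(4)):
        print(phase, destinations)
```

Because each phase is a permutation of the nodes, no destination receives from two sources at once, which avoids the output-buffer contention that simple push protocols can cause.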
      • 15:00
        Exploring RapidIO technology within a DAQ system event building network 1h 30m
        RapidIO (http://rapidio.org/) technology is a packet-switched high-performance fabric, which has been under active development since 1997. The technology is used in all 4G/LTE base stations worldwide. RapidIO is often used in embedded systems that require high reliability, low latency and deterministic operation in a heterogeneous environment. RapidIO has several offloading features in hardware, thereby relieving the CPUs of time-consuming work. Most importantly, it allows for remote DMA and thus zero-copy data transfer. In addition, it lends itself readily to integration with FPGAs. In this paper we investigate RapidIO as a technology for high-speed DAQ networks, in particular the DAQ system of an LHC experiment. In addition to basic measurements of network throughput and server utilization, we present measurements using a generic, multi-protocol event-building emulation tool which was developed for the LHCb experiment. Event building using a local area network, such as the one foreseen for the future LHCb DAQ, places heavy requirements on the underlying network, as all data sources from the collider will want to send to the same destinations at the same time. This leads to an instantaneous overcommitment of the output buffers of the switches. We test how the congestion control in RapidIO can cope with these conditions. We will present results from implementing an event-building cluster based on the RapidIO interconnect, focusing on the bandwidth capabilities of the technology as well as the scalability.
        Speaker: Simaolhoda Baymani (CERN)
      • 15:00
        Fabrication of Fiber Optics Spectrometer using SiPM for Radiation Waste Measurement 1h 30m
        In this study, an optical fiber detector was constructed using a YSO scintillator, an optical fiber, and a silicon photomultiplier (SiPM); an MCU module and an MCA were used for signal processing and algorithm development. The single crystal size of the scintillator was set to 3 mm (diameter) × 20 mm after simulating the absorption rate of gamma rays in the scintillator using the MCNPX code. The constructed detector was used with the standard gamma ray sources Cs-137 and Ba-133 to measure radiation and analyze the spectral characteristics of gamma rays. The resulting trend curve showed excellent linearity with an R-squared value of 0.98, and the detector characteristics were found to vary by 5% or less with distance, based on comparison with the MCNPX value. Furthermore, the spectroscopic analysis of the gamma ray energy from the single-ray and mixed-ray sources showed that Cs-137 had its peak energy at 662 keV and Ba-133 at 356 keV.
        Speaker: Prof. Koansik Joo (myongji university)
      • 15:00
        Fast and efficient algorithms for computational electromagnetics on GPU’s architecture 1h 30m
        Integral formulations can be more convenient than 3D finite-element-method (FEM) codes for the numerical solution of quasi-magnetostatic (eddy current) problems in large and complex domains consisting of many interconnected parts or components (e.g. magnetic confinement fusion devices), since they do not require the discretisation of non-conducting subdomains. Good accuracy is often achieved with a relatively coarse discretization, thus reducing the required memory and computing time. Moreover, suitable techniques (e.g. the Fast Multipole Method (FMM) [1] or the Adaptive Cross Approximation (ACA) coupled with hierarchical matrix (H-matrix) arithmetics [2]) can be used to overcome the impractical memory and computational time requirements which arise in very large scale models (integral formulations require the storage of dense matrices: the matrix size scales quadratically with the number of degrees of freedom n and its inversion has a computational cost of the order of n^3 for both direct and iterative solvers). However, by following an integral approach, a specific post-processing tool is needed to evaluate the magnetic flux density and the magnetic vector potential components produced in 3D space by known current density distributions over elementary geometric entities associated with the mesh elements (uniform polyhedral sources for 3D, or uniform polygonal sources for 2D) or with the sources themselves (2D axisymmetric massive or filamentary coils, 3D coils modeled by means of uniform polyhedra, polygons or current sticks). Several analytic expressions for the calculation of the magnetic flux density and the magnetic vector potential produced by a polyhedron [3], a polygon [4] or a current stick [5] with a uniform current density have been published by many authors. 
The aim of this paper is to present fast and efficient algorithms for the computation of the magnetic field and magnetic vector potential and their implementation on GPUs to benefit from their Single Instruction stream Multiple Data stream (SIMD) architecture, by programming each thread to compute the contribution to the magnetic field (or magnetic vector potential) of a single elementary source at a single field point [6]. A critical review of the results will be presented for some test cases, together with an overview of the pros and cons of GPU vs CPU implementations. Their applicability in Real Time (RT) applications in fusion technology is also discussed.
1. Greengard, A fast algorithm for particle simulations, JCP, 1987, 73 (1), 325–348
2. Hackbusch, A sparse matrix arithmetic based on H-matrices. part I: Introduction to H-matrices, Computing, 1999, 62, 89–108
3. Fabbri, Magnetic Flux Density and Vector Potential of Uniform Polyhedral Sources, IEEE Transactions on Magnetics, 2008, 44 (1)
4. Collie, Magnetic fields and potentials of linearly varying currents or magnetization in a plane bounded region, in Proc. Compumag, Oxford, UK, 1976, 76, 86–95
5. Hanson, Compact expressions for the Biot-Savart fields of a filamentary segment, Phys. Plasmas, 2002, 9, 4410–4412
6. Chiariello, Fast magnetic field computation in fusion technology using GPU technology, FED, 2013, 88, 1635–1639
        Speaker: Tautvydas Maceina (Università di Padova)
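The "one elementary source at one field point" decomposition described in the abstract above can be illustrated with a minimal CPU sketch for current sticks. It assumes the compact Biot-Savart expression for a straight filament from reference [5]; the function names `stick_field` and `field_at_points` are illustrative and are not taken from the authors' code. Every (point, stick) pair in the double loop is independent, which is the property a one-thread-per-pair GPU mapping would exploit.

```python
import math

MU0_OVER_4PI = 1e-7  # T*m/A


def stick_field(point, start, end, current):
    """Flux density (T) at `point` from one straight filament carrying `current` (A).

    The expression is singular for field points lying on the filament itself.
    """
    ri = [p - s for p, s in zip(point, start)]  # from segment start to field point
    rf = [p - e for p, e in zip(point, end)]    # from segment end to field point
    seg = [e - s for s, e in zip(start, end)]
    length = math.sqrt(sum(c * c for c in seg))
    unit = [c / length for c in seg]
    ri_n = math.sqrt(sum(c * c for c in ri))
    rf_n = math.sqrt(sum(c * c for c in rf))
    # cross product: unit x ri
    cross = [unit[1] * ri[2] - unit[2] * ri[1],
             unit[2] * ri[0] - unit[0] * ri[2],
             unit[0] * ri[1] - unit[1] * ri[0]]
    s = ri_n + rf_n
    scale = (MU0_OVER_4PI * current * 2.0 * length * s
             / (ri_n * rf_n * (s * s - length * length)))
    return [scale * c for c in cross]


def field_at_points(points, sticks, current):
    """Sum the contributions of all sticks at every field point."""
    result = []
    for p in points:
        b = [0.0, 0.0, 0.0]
        for start, end in sticks:
            db = stick_field(p, start, end, current)
            b = [x + y for x, y in zip(b, db)]
        result.append(b)
    return result
```

For a stick running from z = -1 m to z = +1 m and a field point at 1 m from its midpoint, the result agrees with the textbook finite-wire formula, and splitting the stick into two halves gives the same total.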
      • 15:00
        Fast Intra Bunch Train Charge Feedback for FELs based on Photo Injector Laser Pulse Modulation 1h 30m
        Bunch charge variations in Free Electron Lasers such as the Free Electron Laser in Hamburg (FLASH) or the European X-Ray Free Electron Laser (E-XFEL) affect the longitudinal phase space distribution of the electrons, resulting in different bunch peak currents, pulse durations and pulse shapes. The electron bunches are generated by short ultraviolet laser pulses impinging onto a photocathode inside a radio frequency (RF) accelerating cavity. At FLASH, bursts of up to 800 bunches with an intra-train repetition rate of 1 MHz are used, and even higher repetition rates (up to 4.5 MHz) are planned for the E-XFEL. Charge variations along these bunch trains can be caused by variations of the laser pulse energies, instabilities of the accelerating field in the RF cavity and time-dependent effects in the photoemission process. To improve the intra bunch-train charge flatness and to compensate train-to-train fluctuations, a dedicated digital control system based on the Micro Telecommunication Architecture (MicroTCA.4) standard was designed and implemented at FLASH. The system consists of a bunch charge detection module, which analyzes data from the toroid system and provides the input signal for the controller, which in turn drives a fast UV Pockels cell installed in the optical path of the photocathode laser. The Pockels cell alters the laser polarization and thus the transmission through a polarizer. Modulating the UV laser pulse energy with an iterative learning feed-forward, which minimizes repetitive errors from bunch train to bunch train, and with a fast feedback algorithm implemented in a Field Programmable Gate Array (FPGA) allows fast tuning of the bunch charge inside the bunch train. In this paper a detailed description of the system and first measurement results are presented.
        Speaker: Tomasz KOZAK (Deutsches Elektronen-Synchrotron)
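The iterative learning feed-forward mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest (P-type) learning law, in which each bunch slot is corrected by a fraction of the error seen in the same slot of the previous train, and a toy plant in which the charge is the drive plus a repetitive disturbance. The learning gain, the plant model and all function names are assumptions for illustration, not the FLASH implementation.

```python
def ilc_update(feedforward, error, learning_gain=0.5):
    """P-type ILC step: per-slot correction from the previous train's error."""
    return [u + learning_gain * e for u, e in zip(feedforward, error)]


def simulate_train(feedforward, disturbance, plant_gain=1.0):
    """Toy plant: charge per bunch = gain * drive + repetitive disturbance."""
    return [plant_gain * u + d for u, d in zip(feedforward, disturbance)]


def run_ilc(target, disturbance, n_trains, learning_gain=0.5):
    """Run several trains; return the learned table and the worst error per train."""
    feedforward = [0.0] * len(disturbance)
    worst_error = []
    for _ in range(n_trains):
        charge = simulate_train(feedforward, disturbance)
        error = [target - q for q in charge]
        worst_error.append(max(abs(e) for e in error))
        feedforward = ilc_update(feedforward, error, learning_gain)
    return feedforward, worst_error
```

With a unit plant gain and a learning gain of 0.5, the repetitive error in this toy model halves from one train to the next. Non-repetitive fluctuations are not removed by such a scheme, which is why the abstract pairs it with a fast intra-train feedback.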
      • 15:00
        Feasibility of software-based real-time calibration of multi-gigabit PET data 1h 30m
        Positron Emission Tomography (PET) imaging of the breast has the potential to play a role in the detection, diagnosis, staging, guiding surgical resection, and monitoring of therapy for breast cancer. Of these potential roles, producing images in or near real time is especially important for guiding surgical resection and biopsy. This task becomes more difficult in systems with large numbers of detectors and channels. We are constructing a two-panel clinical PET system dedicated to imaging the breast that has 294,912 LYSO crystals read out by 4608 Position-Sensitive Avalanche Photodiodes (PSAPD). The system will read out data using UDP over six gigabit Ethernet ports with a maximum predicted data rate of 456 MBps for clinical settings. We discuss software considerations for receiving data, since UDP does not guarantee delivery. We see a consistent loss greater than 20% when using the Linux networking stack. Using the packet capture library libpcap removes this baseline loss. We implement a dual-threaded design in which one thread receives and the other processes raw data from the system. This model shows 0.037±0.004% data loss at 240 MBps. This rate is the maximum for the current setup with two gigabit Ethernet cables. We extend the dual-threaded model by adding further processing of raw data in the second thread and test its data loss. The processing of raw data produces calibrated data with an accurate timestamp, energy, and position in real time. We show negligible (< 0.0001%) loss at or below 60 MBps. There is, however, a steady increase in loss with increasing data rate, up to 45.9±0.6% loss at 240 MBps. We conclude that, barring upgrades to our current data acquisition computer, we need to produce calibrated data from saved raw data after the scan, which can be done quickly without the constraint of minimizing data loss.
        Speaker: Mr David Freese (Stanford University Department of Electrical Engineering)
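The dual-threaded receive-then-process design described in the abstract above can be sketched with a queue between two threads. This is an illustration only: the packet source is a plain iterable standing in for a libpcap or socket capture, and `handle` is a hypothetical placeholder for the calibration step.

```python
import queue
import threading


def receiver(source, raw_queue):
    """Thread 1: only move packets from the source into the queue."""
    for packet in source:
        raw_queue.put(packet)
    raw_queue.put(None)  # sentinel marking the end of the scan


def processor(raw_queue, handle, results):
    """Thread 2: take raw packets from the queue and process them."""
    while True:
        packet = raw_queue.get()
        if packet is None:
            break
        results.append(handle(packet))


def run_pipeline(source, handle):
    """Run both threads to completion and return the processed results."""
    raw_queue = queue.Queue()
    results = []
    t_proc = threading.Thread(target=processor, args=(raw_queue, handle, results))
    t_rx = threading.Thread(target=receiver, args=(source, raw_queue))
    t_proc.start()
    t_rx.start()
    t_rx.join()
    t_proc.join()
    return results
```

The queue here is unbounded, so nothing is dropped and memory grows if `handle` is slower than the source. A real capture path has finite buffers, which is where the loss reported in the abstract appears once per-packet processing gets heavier.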
      • 15:00
        Field Waveform Digitizer for BaF2 Detector Array at CSNS-WNS 1h 30m
        In CSNS-WNS (White Neutron Sources at China Spallation Neutron Source), a BaF2 (barium fluoride) detector array is designed for neutron capture cross-section measurements with high accuracy and efficiency. When the proton beam hits the target, neutrons are produced and fly from the target to the BaF2 array. The time of flight corresponds to the energy of the neutron. To identify which particle, alpha or gamma, produced a signal in a BaF2 crystal, the pulse shape discrimination technique is usually applied, based on the ratio of the fast and slow components of the signal. Waveform digitization is a sound supporting technology for pulse shape discrimination. To obtain precisely the waveform and time information carried by the detector signal, and to cover as much of its dynamic range as possible, a high-speed ADC with a sampling rate of 1 GSps and 12-bit resolution is used in the readout system of the CSNS-WNS BaF2 detector array. The detector array consists of 92 BaF2 crystal elements with complete 4π solid-angle coverage, which results in a total of 92 analog channels for waveform digitization and time-of-flight measurement. High speed, high resolution and a large number of channels inevitably increase the data volume drastically. Traditionally, a lower speed or lower resolution is used to relieve the stress on data readout. In CSNS-WNS, besides waveform digitization, the customized field digitizer module (FDM) also measures the neutron time of flight precisely, based on the continuously sampled waveform data. Furthermore, to read out the massive measured data in real time, the FDM is integrated with a PXIe interface, a high-speed serial bus. There are 46 FDMs in total in 4 PXIe chassis, which makes the readout system a distributed architecture. Each FDM supports two channels for signal digitizing and two high-density DDR3 memories for ping-pong data readout.
To eliminate invalid data and to synchronize the system, an external trigger or clock signal can be fed into the FDM through the PXIe backplane star bus or through micro-miniature coaxial cables from the front panel. Furthermore, conceived as a universal waveform digitization platform, the FDM can also support a fully digital hardware trigger using the FPGA and the digitized waveform data, which makes it possible to replace traditional analog trigger cables with the dedicated bus on the PXIe chassis backplane. An FDM in digital trigger mode can simplify the structure of the data acquisition system drastically. To further reduce the pressure on data readout and storage, a real-time data compression algorithm is implemented in the FPGA.
        Speaker: Mr Qi Wang (State Key Laboratory of Particle Detection and Electronics, University of Science and Technology of China, Hefei, 230026, China ;Department of Modern Physics, University of Science and Technology of China, Hefei, 230026, China)
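The pulse shape discrimination described in the abstract above, based on the ratio of fast and slow components, can be sketched on a digitised waveform as a charge-comparison ratio. The integration windows, the baseline handling and the function name are assumptions chosen for illustration; the abstract does not specify them.

```python
def psd_ratio(waveform, baseline, fast_window, total_window):
    """Fast-to-total charge ratio of one digitised pulse.

    fast_window and total_window are (start, stop) sample indices,
    with stop exclusive; baseline is the pedestal in ADC counts.
    """
    samples = [s - baseline for s in waveform]
    fast = sum(samples[fast_window[0]:fast_window[1]])
    total = sum(samples[total_window[0]:total_window[1]])
    if total <= 0:
        raise ValueError("no charge above baseline in the total window")
    return fast / total
```

A threshold on this ratio can then separate the two pulse classes. Where the threshold sits depends on the detector and has to be taken from calibration data.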
      • 15:00
        FPGA Implementation of Toeplitz Hashing Extractor for Real Time Post-processing of Raw Random Numbers 1h 30m
        Random numbers are widely used in many fields such as statistical analysis, numerical simulation and cryptography. However, most existing random number generators cannot directly output ideal random bits without post-processing, which usually requires complicated mathematical operations and severely limits the speed. With the development of random number generation, the rate of raw random data generation has reached the Gbps range, and existing post-processing cannot keep up with the demand. To close the gap between experimental demonstration and practical application, we propose a concurrent, pipeline-like algorithm based on the Toeplitz hashing function and implement it in a resource-limited FPGA. By exploiting the concurrent computation of the FPGA instead of the serial computation of a common computer, the post-processing speed is improved by three to four orders of magnitude to above 3.36 Gbps, which is suitable for Gbps real-time post-processing of raw random numbers. In our scheme, a matrix-building seed unit stores the elements from which the Toeplitz matrix is constructed by signal fan-out. The entire Toeplitz matrix multiplication is evenly decomposed into several sub-matrix multiplications, which are calculated sequentially in a shared time-division multiplexing unit. The intermediate sub-results are then accumulated to obtain the final random bits. All the calculation units work in a concurrent pipeline mode. With this time-division multiplexing structure, FPGA resources are substantially saved and the extractor can be realized successfully. To implement the Toeplitz hashing function, a data acquisition and post-processing board was developed. On the board, the random signal is sampled and digitized as raw random data by an 8-bit analog-to-digital converter, and the raw data are then transferred to a high-performance FPGA for real-time post-processing.
A small form-factor pluggable (SFP) module outputs the final random bits at a real-time speed of 3.2 Gbps. USB 2.0 and Gigabit Ethernet are also provided for different scenarios.
        Speaker: Mr Xiaoguang Zhang (State Key Laboratory of Particle Detection and Electronics and Department of Modern Physics, University of Science and Technology of China, Hefei, Anhui 230026, P.R.China)
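The decomposition of the Toeplitz matrix multiplication into accumulated sub-matrix products, as described in the abstract above, can be sketched in software over GF(2). The seed indexing convention and the choice to split the matrix by column blocks are assumptions made for this illustration. The abstract does not say how the FPGA design partitions the matrix.

```python
def toeplitz_hash(seed, raw_bits, n_out):
    """Multiply the n_out x n_in Toeplitz matrix built from `seed` by raw_bits (mod 2)."""
    n_in = len(raw_bits)
    if len(seed) != n_out + n_in - 1:
        raise ValueError("seed must hold n_out + n_in - 1 bits")
    out = []
    for i in range(n_out):
        bit = 0
        for j in range(n_in):
            # T[i][j] = seed[i - j + n_in - 1] is constant along each diagonal
            bit ^= seed[i - j + n_in - 1] & raw_bits[j]
        out.append(bit)
    return out


def toeplitz_hash_blocked(seed, raw_bits, n_out, block):
    """Same product, computed block by block and accumulated with XOR."""
    n_in = len(raw_bits)
    if len(seed) != n_out + n_in - 1:
        raise ValueError("seed must hold n_out + n_in - 1 bits")
    acc = [0] * n_out
    for start in range(0, n_in, block):
        stop = min(start + block, n_in)
        for i in range(n_out):
            partial = 0
            for j in range(start, stop):
                partial ^= seed[i - j + n_in - 1] & raw_bits[j]
            acc[i] ^= partial  # accumulate the sub-product
    return acc
```

Because addition over GF(2) is XOR, the blocked version gives exactly the same output bits as the full product, while only one block of the matrix needs to be in use at a time.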
      • 15:00
        FPGA online tracking algorithm for the PANDA straw tube tracker 1h 30m
        An FPGA-based online tracking algorithm for helix track reconstruction in a solenoidal field, developed for the PANDA spectrometer, is described. Employing the Straw Tube Tracker detector with 4636 straw tubes, the algorithm includes a complex track finder and a track fitter adopting Xilinx IP cores. Implemented in VHDL, the algorithm is tested on a Virtex-4 FX60 FPGA chip with different types of events at different event rates. A processing time of 7 $\mu$s per event for an average of 6 charged tracks is obtained. The momentum resolution is about 3\% (4\%) for $p_t$ ($p_z$). Compared to the offline tracking algorithm running on a CPU, an improvement of 3 orders of magnitude in processing time is obtained, at the cost of a resolution that is 3 times worse. The algorithm can deal with the severe overlapping of events that is typical for interaction rates above 10 MHz.
        Speaker: Dr Yutie Liang (Giessen University)
      • 15:00
        FPGA-based Image Analyser for Calibration of Stereo Vision Rigs 1h 30m
        On a modern 3D movie set around 6 terabytes of data are captured during each hour of recording (assuming 4K resolution, 25 FPS, 10 bit Y’UV 4:4:4 data). The costs of a single day of shooting are tremendous. They include the wages of a large number of skilled technicians, the rental of various types of equipment, the supply of hundreds of kilowatts of electrical power, the actors' fees and many others. It is hence essential to reduce the time required to prepare the equipment for a particular scene and to ensure that the takes will not be defective due to technical problems. This can be ensured by a dedicated real-time video stream analyser. The cameras have to be perfectly aligned to ensure proper acquisition of a 3D video sequence that allows the viewer to experience the 3D scene fully and positively. The vertical alignment of the cameras is done with sub-pixel precision. Failure to do so could easily cause dizziness in the viewers. To create the 3D effect, the camera optical axes are slightly shifted apart horizontally and converged, so that the collected pictures differ in the same way as when the scene is observed by human eyes. Currently used video cameras usually record the video directly to an SSD drive, while at the same time providing a high-quality SDI stream for preview. The Image Analyser captures two such video streams of around 3 Gb/s each, from the left and right cameras, and compares them using a set of predefined algorithms. The Analyser is based on a recent Kintex-7 FPGA carrier card with a custom-designed double-width FMC video interface module (FMC-3DV). The device offers several methods of image comparison. It can generate a stereoscopic preview stream for 3D-capable monitors. It offers two types of anaglyph images for viewing through glasses with colour filters. It is also capable of simple arithmetic operations such as a pixel-by-pixel average or difference.
Finally, it can compare mean colour values between the two images in several areas. The analysis results are also provided in the form of a video stream, which is output in real time via an HDMI interface.
        Speaker: Dariusz Makowski (Technical University of Lodz, Department of Microelectronics and Computer Science)
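The figure of around 6 terabytes per hour quoted in the abstract above can be checked with a few lines of arithmetic. The sketch assumes a 4096 x 2160 frame, 30 bits per pixel (three 10-bit components at 4:4:4) and two cameras. The exact frame size is not given in the abstract, so these are assumptions.

```python
def stereo_data_per_hour(width, height, fps, bits_per_pixel, cameras=2):
    """Uncompressed data volume in bytes recorded in one hour."""
    bytes_per_frame = width * height * bits_per_pixel / 8
    return bytes_per_frame * fps * 3600 * cameras


# 4096 x 2160 pixels, 25 frames per second, 30 bits per pixel, two cameras
terabytes = stereo_data_per_hour(4096, 2160, 25, 30) / 1e12
```

Under these assumptions `terabytes` comes out just under 6, in line with the number quoted in the abstract.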
      • 15:00
        Framework Upgrade of The Detector Control System for JUNO 1h 30m
        The detector control system (DCS) of the Daya Bay Reactor Neutrino Experiment was developed to support the running neutrino-oscillation experiment. The experiment has been taking data for almost 3 years and making steady progress, and the first results have already been released. The Jiangmen Underground Neutrino Observatory (JUNO) is the second phase of the reactor neutrino experiment. Its detector is designed as a 20 kton liquid scintillator (LS) detector in an acrylic sphere with an inner diameter of 34.5 meters. Due to the gigantic size of the detector there are approximately 10k temperature and humidity monitoring points. There are also about 20k high-voltage channels for the PMT array, as well as electronics crates and power monitoring points. Most of the DCS software was developed on a Windows-based framework, which is limited by operating system upgrades and commercial software, so a framework migration and upgrade are needed for the JUNO DCS. The paper will introduce the new DCS framework based on EPICS (Experimental Physics and Industrial Control System) under Linux. The implementation of the IOCs for the high-voltage crates and modules, the stream device drivers and the embedded temperature firmware will be presented. The software and hardware realization and the remote control method will be presented, as well as the development of the remote monitoring and control interface with CSS (Control System Studio). The upgraded framework can be widely used for devices with the same hardware and software interfaces.
        Speaker: Mei YE (IHEP)
      • 15:00
        Fuzzy-PID based heating control system 1h 30m
        Many scientific devices are used in Antarctica, and some of their parts, such as the mechanical shutter of a scientific CCD camera, cannot run at temperatures as low as -60 ℃. In such conditions a heating system is needed to meet the temperature requirement. For a CCD camera, we designed a heater system for the shutter comprising a heat-retaining shutter housing, temperature sensors, a heater, a control board with fuzzy PID control for heater driving and temperature sensor sampling, and control software on the computer. PID is the abbreviation of Proportional, Integral and Differential. The PID control algorithm is widely used in industrial process control for its simple structure and strong robustness. However, over a wide temperature span it suffers from problems such as a large oscillation amplitude and a long settling time. Moreover, the performance of the system depends strongly on the initial parameters of the algorithm. The control algorithm we adopt is a fuzzy PID control algorithm, which combines simple fuzzy control and PID control. Within ±10 ℃ of the target temperature, the fuzzy control algorithm is used. More than 10 ℃ below the target temperature, the duty cycle value is regulated by the fuzzy control algorithm and a secure control algorithm. The secure control algorithm monitors the temperature of the MOSFET driver and its rate of change during heating and stops changing the duty cycle value while any parameter exceeds its limit. More than 10 ℃ above the target temperature, a larger differential coefficient is used to accelerate the system response. The pure PID control algorithm and the control algorithm we adopt have been compared. With fuzzy control, the control parameters of the PID algorithm, such as KP, KI and KD, can be set automatically.
Thus the scheme keeps the advantages of PID (simple principle, ease of use and strong robustness) while gaining flexibility and better control accuracy. We tested the fuzzy PID control and conventional PID control. Comparing the two results, we find that the fuzzy PID control algorithm reduces the temperature overshoot, shortens the settling time and increases the control accuracy. Unlike the conventional PID control algorithm, the fuzzy-PID control algorithm starts with a larger KP and a smaller KI to reduce the first overshoot during heating (the first overshoot is about 2 ℃ with conventional PID and about 0.7 ℃ with fuzzy-PID). During the subsequent automatic tuning, the algorithm continually increases KI and decreases KP, which gives a shorter settling time than conventional PID.
        Speaker: Dr Jian Wang (Univ. of Sci. & Tech. of China)
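The zone-dependent behaviour described in the abstract above can be sketched as a PID loop whose gains are chosen from the temperature error, plus a guard that freezes the duty cycle when the driver is over its limit. This is a simplified stand-in: a crisp three-zone lookup replaces the fuzzy rule base, and the scaling factors, the gains and the names `select_gains` and `ZonePid` are hypothetical.

```python
def select_gains(error, base_gains):
    """Pick (kp, ki, kd) from the error = target - measured, in deg C."""
    kp, ki, kd = base_gains
    if error > 10.0:    # far below target: larger KP, smaller KI
        return (2.0 * kp, 0.5 * ki, kd)
    if error < -10.0:   # far above target: larger differential coefficient
        return (kp, ki, 3.0 * kd)
    return (kp, ki, kd)  # within +/-10 deg C: nominal gains


class ZonePid:
    def __init__(self, base_gains, dt):
        self.base_gains = base_gains
        self.dt = dt
        self.integral = 0.0
        self.prev_error = None
        self.last_duty = 0.0

    def step(self, target, measured, driver_temp, driver_limit):
        """Return the heater duty cycle in [0, 1] for one control period."""
        if driver_temp > driver_limit:
            # secure control: stop changing the duty cycle while over the limit
            return self.last_duty
        error = target - measured
        kp, ki, kd = select_gains(error, self.base_gains)
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        duty = kp * error + ki * self.integral + kd * derivative
        self.last_duty = min(1.0, max(0.0, duty))
        return self.last_duty
```

A real fuzzy controller would blend the gain sets smoothly through membership functions rather than switching at fixed boundaries, and would also watch the rate of change of the driver temperature, which this sketch omits.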
      • 15:00
        Hardware-Based Light Weight TCP/IP for 10 Gigabit Ethernet 1h 30m
        Ethernet has been widely used and implemented in a variety of commercial products. Owing to its high performance-to-cost ratio, many backend systems have been designed as distributed structures using Ethernet. Transmission Control Protocol and Internet Protocol (TCP/IP) is also a standard and well-known protocol implemented in all mainstream operating systems. It provides reliable, in-order data delivery featuring flow and congestion control. Consequently, TCP/IP is becoming increasingly popular in front-end readout systems. One example is the JUNO experiment, where a 1 GSps 10-bit ADC samples the photomultiplier tubes (PMT) for supernova explosion observation and the sampling time may be as long as 1 to 2 seconds. Another is the large-area silicon pixel detector, where the data size grows quickly as the pixel size decreases. The instantaneous DAQ bandwidth for an area of 30 cm2 may reach 10 Gbps. This paper presents a simplified and unidirectional TCP/IP, called FeTCP (Fast electronic TCP), for 10 Gigabit Ethernet (10GbE), implemented in a field programmable gate array (FPGA). As the hardware can be built on one chip, the detectors can be built as modules with a common interface. We prototyped the design on a KC705 development board. The preliminary test shows that the throughput of FeTCP is about 808 MBps. A mechanism for slow control over User Datagram Protocol (UDP) is also provided.
        Speaker: Dr Jie Zhang (Institute of High Energy Physics, Chinese Academy of Sciences)
      • 15:00
        High Speed Ethernet Application for the Trigger Electronics of the New Small Wheel 1h 30m
        The ATLAS detector will be upgraded in 2018. The main focus of the Phase-I ATLAS upgrade is on the Level-1 trigger, replacing the present muon small wheel (SW) with the "new small wheel" (NSW), which consists of small-strip thin gap chambers (sTGC) and micromegas (MM). A versatile application-specific integrated circuit (ASIC), the VMM chip, has been developed to read out the signals of the sTGC and MM. The VMM has 64 channels. Testing the performance of the VMM requires a high data transfer rate as well as multi-board interconnection. A programmable platform suits these needs best. We therefore propose a high-speed Ethernet-based network for the testing system, chosen mainly for its high performance and full programmability. We design and implement a test platform named the Gigabit Ethernet Module (GEM). The core of the GEM is a two-layer protocol only, for minimal network latency and minimal packet loss, meaning that all data transfers are handled by Ethernet switches in a Field Programmable Gate Array (FPGA). The main task of the GEM is to communicate with a computer: it receives data from the computer and transfers data to the computer via Ethernet using the MAC (media access control) protocol, which is implemented in hardware on an FPGA device, enabling high-speed data transfers and low latency. We conducted performance tests on the GEM, and the results show that the transfer rate can reach 926 Mbps. The long-term stability of the GEM was also tested: no errors occurred during two and a half hours of continuous operation. Packet loss was also tested in real time by embedding a sequence number into each packet sent to the computer, and no errors were observed. Subsequently, the GEM was applied in the pad front end board (pFEB) and the thin gap chamber simulation signal generator (SG).
This paper introduces the implementation of the GEM, as well as its applications. The features of the systems are described in detail.
        Speaker: Kun Hu (University of Science and Technology of China)
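The packet loss test described in the abstract above, which embeds a sequence number in each packet, can be sketched as follows. The 4-byte big-endian header layout and the function names are assumptions made for this illustration. The GEM packet format is not given in the abstract.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical header: 32-bit big-endian sequence number


def make_packet(seq, payload):
    """Prefix the payload with its sequence number."""
    return HEADER.pack(seq & 0xFFFFFFFF) + payload


def count_missing(packets):
    """Count packets missing from a received stream, assuming in-order delivery."""
    missing = 0
    expected = None
    for pkt in packets:
        (seq,) = HEADER.unpack_from(pkt)
        if expected is not None:
            # the modular difference also handles counter wrap-around
            missing += (seq - expected) & 0xFFFFFFFF
        expected = (seq + 1) & 0xFFFFFFFF
    return missing
```

The receiver only needs to remember the next expected number, so the check costs almost nothing per packet and can run during normal data taking.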
      • 15:00
        High-speed continuous DAQ system for reading out the ALICE SAMPA ASIC 1h 30m
        The heavy-ion beam of CERN's LHC is expected to be colliding at 50 kHz (present rate ~ few kHz) during Run3 onwards of the ALICE experiment planned to start in 2020. Due to these new high collision rates, the Multi-Wire Proportional Chambers of the present ALICE TPC will be replaced by readout chambers featuring Gas Electron Multiplier (GEM) foils. A continuous readout system will replace the existing triggered readout. In continuous readout, the signals from the GEMs will be processed by Front-End Cards each consisting of five custom-made SAMPA ASICs and GigaBit Transceivers (GBTx). Every SAMPA ASIC has 32 signal processing channels, each containing a charge-sensitive pre-amplifier, a shaper, a 10 bit 10 MHz ADC and a digital filter, processing and compression chain. The output of the SAMPA is multiplexed and transmitted using GBTx via optical links to a Common Readout Unit (CRU). The CRU is an interface to the on-line computer farm, trigger and detector control system. The first prototype of the SAMPA ASIC with three channels was recently produced. To test its performance, a continuous data acquisition system was developed using an Altera System-on-Chip development board with Cyclone-V FPGA. A custom board was designed for the SAMPA to mount directly on the FPGA board. Data packets from the SAMPA are read out over a GigaBit Ethernet link provided by the FPGA board. The data samples are then stored in ROOT files as well as being analyzed in real-time using the CERN ROOT data analysis framework to monitor the data quality. To control, configure, and monitor the SAMPA and the FPGA board, a software-package with a graphical user interface was developed. The data acquisition system was successfully used for testing the three channel readout and is also easily scalable to 32 channels using data compression capabilities of the SAMPA chip. 
The presentation will give an overview of the readout-system design and its performance tests with SAMPA coupled to the GEM detector prototype.
        Speakers: Mr Arild velure (IFT, University of Bergen, Norway), Dr Ganesh J. Tambave (IFT, University of Bergen, Norway)
      • 15:00
        Implementation of ITER Fast Plant Interlock System using FPGAs with cRIO 1h 30m
        Interlocks are the instrumented functions of ITER that protect the machine against failures of the plant system components or incorrect machine operation. Regarding I&C, the Interlock Control System (ICS) ensures that no failure of the conventional ITER controls can lead to serious damage to the machine integrity or availability. The ICS is in charge of the supervision and control of all the ITER components involved in the instrumented protection of the Tokamak and its auxiliary systems. It consists of the Central Interlock System (CIS), the different Plant Interlock Systems (PIS) and their networks. The ICS does not include the sensors and actuators of the plant systems, but it is in charge of their control. The ITER interlock system shall be designed, built and operated according to the highest quality standards. The international standard IEC-61508 has been chosen as the reference. In both the CIS and PIS cases two main architectures are used: a slow architecture based on PLC technologies, for functions with response time requirements slower than 100 ms (300 ms for central interlock functions), and a fast architecture based on FPGA technologies, for functions with faster response time requirements. The proposed design for the fast PIS is based on RIO (Reconfigurable Input/Output) technology from National Instruments (compactRIO platform). In order to provide a high-integrity solution, an FMEDA (Failure Modes Effects and Diagnostics Analysis) has been conducted to analyse the behaviour of the components. According to the output of the FMEDA, a set of diagnostics has been defined and additional redundancy was added to the architecture to improve the integrity figures.
The defined configuration has been called the “double-decker solution”, with two chassis running in parallel and communicating with each other over a synchronous high-speed serial line. It uses redundant modules to implement the input and output measurements/excitations, and redundant analog and digital modules to implement the diagnostics of these input/output modules. The integrity figures for the “double-decker” solution are obtained from the classification of the failure rates, giving for the different configurations an SFF (safe failure fraction) of 85% and a PFH (Probability of dangerous Failure per Hour) of less than 1E-07. The FPGA design includes all the hardware to support the data acquisition from the input modules, the implementation of the diagnostics functionalities for analog and digital modules, the voting schema and the activation/deactivation of digital outputs. The platform includes an external test platform, also based on compactRIO technology, to validate the system and to record the performance of the different interlock functions implemented. The response time obtained for the TTL input to TTL output interlock function ranges from 5 µs to 20 µs; for the analog input to TTL output it is in the range of 41 µs to 90 µs; and for interlock functions using 24 V digital input to 24 V digital output it can rise up to 643 µs.
        Speaker: Mariano Ruiz (Universidad Politecnica de Madrid)
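The safe failure fraction quoted in the abstract above follows from the classification of failure rates into safe, dangerous detected and dangerous undetected, as defined in IEC 61508. The sketch below shows the calculation. The failure rates in the example are invented for illustration and are not the FMEDA results of the paper.

```python
def safe_failure_fraction(lambda_safe, lambda_dd, lambda_du):
    """SFF = (safe + dangerous detected) / (all failures); rates in 1/h."""
    total = lambda_safe + lambda_dd + lambda_du
    if total <= 0:
        raise ValueError("total failure rate must be positive")
    return (lambda_safe + lambda_dd) / total


# hypothetical rates: 4.0e-7 safe, 4.5e-7 dangerous detected, 1.5e-7 dangerous undetected
example_sff = safe_failure_fraction(4.0e-7, 4.5e-7, 1.5e-7)
```

With these invented rates the fraction is 0.85. The formula shows why adding diagnostics raises the SFF: it moves failures from the dangerous undetected class into the dangerous detected class without changing the total.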
      • 15:00
        Implementing a ReboT server on a Microblaze. 1h 30m
        Data acquisition over an IP network is convenient for diagnostics, monitoring and control applications. The ReboT protocol (Register Based Access Over TCP) extends the MTCA4U deviceaccess framework, letting it access supported hardware over TCP/IP. Using ReboT, the Python and Matlab bindings provided by the framework give application developers a convenient way to access hardware over the network. The poster discusses the server side implementation of ReboT on a Microblaze soft core. We present our experience implementing the code on the Microblaze using FreeRTOS and the Netconn API of the LWIP stack. We also compare network performance against an implementation realized using the Xilinx kernel and the socket API of the LWIP stack.
        Speaker: Geogin Varghese (DESY, Hamburg)
      • 15:00
        Integrating real time control applications into different control systems 1h 30m
        Porting complex device servers from one control system to another is often a major effort due to the strong coupling of the business logic to control system data structures. Together with its partners from the Helmholtz Association and from industry, DESY is developing a control system adapter as part of the MTCA4U tool kit. It allows applications to be written in a control-system-independent way, while still being able to update the process variables and react to control system triggers. Special attention has been paid to making the implementation thread safe and real-time capable, while still providing the required abstraction and avoiding performance losses. We report on the status of the project and the plans to implement new features.
        Speaker: Nadeem Shehzad (DESY Hamburg)
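The decoupling idea described in the abstract above can be illustrated with a small sketch in which the application logic only touches an abstract process variable and a backend forwards updates to a control system. All class and function names here are hypothetical and do not reflect the MTCA4U API, which is a C++ tool kit. Thread safety and real-time behaviour, which the abstract emphasises, are not addressed.

```python
class ProcessVariable:
    """Control-system-independent handle used by the application logic."""

    def __init__(self, name, value=0.0):
        self.name = name
        self._value = value
        self._listeners = []

    def get(self):
        return self._value

    def set(self, value):
        self._value = value
        for callback in self._listeners:
            callback(self.name, value)

    def on_update(self, callback):
        self._listeners.append(callback)


class DictBackend:
    """Stand-in for a control system binding; it only records published values."""

    def __init__(self):
        self.published = {}

    def attach(self, variable):
        variable.on_update(self.publish)

    def publish(self, name, value):
        self.published[name] = value


def regulate(setpoint, readback, output, gain=0.1):
    """Business logic: it knows ProcessVariable, not the control system."""
    output.set(output.get() + gain * (setpoint.get() - readback.get()))
```

Moving the application to another control system then means writing another backend with the same `attach` and `publish` roles, while `regulate` stays unchanged.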
      • 15:00
        MARTe real-time acquisition system of a Two-Color Interferometer for electron density measurements on FTU (Frascati tokamak upgrade). 1h 30m
        $\textbf{Introduction}$ In this work we present a new real-time acquisition system for a two-color interferometer (Figure 1) installed on FTU. It calculates, in real time, the density along 2 fixed SIRIO (off-line density elaboration system) chords, the central chord (CH3) at 0.935 m and the external chord (CH4) at 1.17 m, sampled at 200 kHz. The electron density provided by the CO_2 and CO lasers of the two-color interferometer discussed in ref. [1] can be computed on-line by eq. (1) during the pulse by the MARTe framework running under a Linux operating system. $\textbf{Procedure}$ For the acquisition of interferometric data we adopted an industrial controller with two high-speed acquisition boards and one Reflective Memory (RFM) module. The two boards are externally synchronized by the gate signal (which synchronizes all FTU devices). The first DAQ board is devoted to the acquisition of four channels (sin(q), cos(q) for the CO2 and CO lasers) to evaluate the central chord CH3, and similarly the other four channels for CH4 are acquired by the second one. Every half millisecond the system acquires 100 samples for each channel and then reads the plasma current, calculated by the real-time feedback control system, using the RFM. First, the software corrects the sine and cosine signals by removing the offset of the two probing laser beams, then computes the CO_2 and CO phases of the probing laser beams, and finally the electron density, averaged over 0.5 ms, is computed using eq. (1) and distributed using the RFM module. $\textbf{Results}$ The acquisition (200 kHz) and the density elaboration (2 kHz) have been successfully tested on more than 60 shots with a wide range of plasma parameters during the last experimental campaign. Figure 2 compares the line density evolution elaborated by the MARTe framework, by the SIRIO system and by the existing real-time density elaboration system re-sampled by the rtfeed1 system (solid blue line).
The mean value of the execution time of our RT system is $500 \mu s$ and its variance, shown in Figure 3 (b) and (d), is almost negligible ($\approx 10^{-35}$). Figure 2 also shows the density profile (blue line) of the existing real-time system: our density (black and cyan lines) does not exhibit the time delay and has the same profile as the SIRIO density elaboration. $\textbf{Conclusions}$ The density measurement using two LOS was successfully computed during the last experimental campaign, as shown above. The next step will be the real-time elaboration of the two-color medium infrared scanning interferometer described in ref. [3]. The use of such a scanning interferometer will improve the real-time estimation of the runaway beam radial position, enabling robust runaway beam suppression strategies (ref. [4] and [5]).
        Speaker: Mr Mateusz Gospodarczyk (Università degli Studi di Roma "Tor Vergata")
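The first two processing steps described in the abstract above, offset removal on the sine and cosine signals followed by phase computation and averaging over one acquisition block, can be sketched as below. The conversion from the two phases to density uses eq. (1) of the paper, which is not reproduced in the abstract and is therefore left out. The function name and the per-block unwrapping are assumptions for illustration.

```python
import math


def block_phase(sin_samples, cos_samples, sin_offset, cos_offset):
    """Mean unwrapped phase (rad) over one block of quadrature samples."""
    phases = []
    previous = None
    for s, c in zip(sin_samples, cos_samples):
        phi = math.atan2(s - sin_offset, c - cos_offset)
        if previous is not None:
            # unwrap: keep the step from the previous sample within +/- pi
            while phi - previous > math.pi:
                phi -= 2.0 * math.pi
            while phi - previous < -math.pi:
                phi += 2.0 * math.pi
        phases.append(phi)
        previous = phi
    return sum(phases) / len(phases)
```

In a continuous acquisition the unwrapping state would have to be carried from one block to the next. The sketch restarts it for every block to stay short.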
      • 15:00
        MicroTCA.4 based RF and Laser Cavities Requlation Including Piezo Controls 1h 30m
        In this paper we present a universal solution for RF and laser cavity regulation, including piezo controls, based on MTCA.4 electronics. The RF field control electronics consists of an RTM for cavity probe sensing and high-voltage power source driving, and an AMC for fast data processing and digital feedback operation. The piezo control system has been set up with a high-voltage RTM piezo driver and a low-cost AMC-based FMC carrier. The laser cavity electronics uses the same hardware setup. The laser RF signal is the product of analog down-conversion to an intermediate frequency of the nth harmonic of the laser repetition rate, which is one above the reference frequency to which the laser needs to be locked. The fine tuning of the laser is carried out using a cavity fiber stretcher. The coarse tuning of the supported optics is done using a piezo motor driver application. Both channels can be operated using digital feedback controllers. The communication between AMC modules is performed using a low-latency link over the AMC backplane with data throughput up to 3.125 Gbps. First results from CW operation of the RF field controller and the cavity active resonance control with the piezo tuners are demonstrated. The laser lock application performance using both fine and coarse channel feedback is shown and briefly discussed.
        Speaker: Mr Lukasz Butkowski (Deutsches Elektronen-Synchrotron)
      • 15:00
        Modular Software for MicroTCA.4 Based Control Applications 1h 30m
        The MicroTCA.4 crate standard provides a powerful electronic platform for digital and analogue signal processing. Besides excellent hardware modularity, it is the software reliability and flexibility as well as the easy integration into existing software infrastructures that will drive the widespread adoption of the standard. The DESY MicroTCA.4 User Tool Kit (MTCA4U) is a collection of C++ libraries which facilitate the development of control applications. The device access library allows convenient access to hardware with an extensible register-based interface. Starting from PCI Express, which is used inside a MicroTCA.4 crate, the introduction of new, network-based protocols extends its reach beyond a single crate and even MicroTCA itself. Features like register name mapping and automatic type conversion provide a level of abstraction which makes the software robust against firmware and even hardware changes. Bindings to widely used scripting tools like Matlab and Python as well as a graphical user interface complete the portfolio needed for fast prototyping and firmware development. We give an update on the project status and present new features which have recently been introduced or are currently being implemented.
        Speaker: Nadeem Shehzad (DESY Hamburg)
      • 15:00
        Mordicus-dhsm: a Distributed State-Machine Framework for DAQ 1h 30m
        Data Acquisition (DAQ) software is heavily distributed, whether for performance reasons or because of the intrinsic spatial distribution of embedded software nodes. Such applications have a strong requirement concerning the coherency of the system state: when a data acquisition run is started, all participating nodes are supposed to be acquiring data; when the system is configuring or idle, all processes should be in that same state. This is why the behavior of such a system is accurately modeled using a finite state-machine (FSM). However, implementing such a state-machine, enforcing synchronization of many distributed processes, is not a trivial task: it requires a robust communication protocol, a carefully crafted error handling strategy able to recover from the failure of any node subset, and a design that binds the application data state to the state-machine and minimizes harmful side effects between states that would compromise its integrity. From this assessment, we developed DHSM (Distributed Hierarchical State Machine): a C++ state-machine framework aimed at easily creating robust, state-machine enabled DAQ software. The framework design, features and implementation will be presented, and its application to the MINOS experiment DAQ software will be detailed. There, we will describe the challenges and design decisions that were necessary to successfully take advantage of the framework features to create a robust DAQ system.
        Speaker: Frederic Chateau (CEA/IRFU,Centre d'etude de Saclay Gif-sur-Yvette (FR))
      • 15:00
        Multiple Fast Controller Synchronization for ITER Control System Model 1h 30m
        ITER Control System Model (ICM) is a currently developed simulation platform for CODAC, which is a control system responsible for integrating and controlling all plant systems of ITER. ICM is a full-scale implementation of CODAC that follows all hardware and software standards, but does not include any I/O to physical components of ITER. This will serve as an excellent test environment for performance and scalability of upcoming plant system modules and new releases of CODAC software. ICM mimics CODAC infrastructure by combining both virtual and physical servers for different applications. Less demanding services are hosted as virtual instances on two dedicated hypervisors. More demanding real-time applications will be hosted on separate fast controllers connected over physical high performance networks that closely mimic the server infrastructure of ITER. Current real-time configuration of ICM consists of 4 operational fast controllers running Intel Xeon CPUs (2x @ 1.80 GHz, 1x @ 2.00 GHz and 1x @ 3.50 GHz). Configuration is scheduled to be expanded by 4 additional controllers (1x Xeon @ 3.50 GHz and 3x i7-4790S @ 3.20 GHz) and possibly more in the future. Each fast controller contains a timestamping-capable network port which is synchronized to a dedicated grandmaster clock using IEEE 1588-2008 protocol. In addition, each controller is paired with a 10GbE network expansion card connected over a dedicated high performance multicast-capable cut-through switch infrastructure that represents ITER Synchronous Databus Network (SDN). Similar SDN connections have already been tested on individual controllers with application-to-application latency below 50µs and relatively low jitter, which are expected to improve with upcoming releases of CODAC software. This test scenario investigates the performance and reliability of the distributed multiple fast controller synchronization and the associated deterministic communication infrastructure. 
This evaluation is necessary for developing future simulations that will incorporate demanding data processing and real-time control of the plasma.
        Speaker: Mr Martynas Prokopas (Fusion-DC / IPFN / ITER Organization)
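The ICM abstract above relies on IEEE 1588-2008 to synchronize each fast controller to a grandmaster clock. As a minimal sketch of the standard delay request-response arithmetic (assuming a symmetric network path; the function and variable names are illustrative and not taken from CODAC software), the clock offset and path delay follow from four timestamps:

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay.

    t1 : master send time of Sync           (master clock)
    t2 : slave receive time of Sync         (slave clock)
    t3 : slave send time of Delay_Req       (slave clock)
    t4 : master receive time of Delay_Req   (master clock)
    Assumes the forward and return path delays are equal.
    """
    forward = t2 - t1   # path delay + offset
    backward = t4 - t3  # path delay - offset
    offset = (forward - backward) / 2.0
    delay = (forward + backward) / 2.0
    return offset, delay
```

Any asymmetry between the two directions appears directly as an offset error, which is one reason hardware timestamping at the network port matters for this kind of test.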
      • 15:00
        NaNet: FPGA-based Network Interface Cards Implementing Real-time Data Transport for HEP Experiments 1h 30m
        NaNet is a modular design of a family of FPGA-based PCIe Network Interface Cards implementing low-latency, real-time data transport between its network channels and the host CPU and GPU accelerator memories. The design features a network stack protocol offloading module that, operating in conjunction with a high-performance PCIe Gen2/3 x8 core, yields a low and predictable communication latency, making NaNet suitable for real-time applications. A reconfigurable processing module is also available to implement application-specific processing on inbound/outbound data streams with highly reproducible latency. As of now, the NaNet design has been specialized in the NaNet-1 (single 1GbE port) and NaNet-10 (four 10GbE ports) configurations employed in the GPU-based real-time trigger of the CERN NA62 experiment, and in the NaNet3 (four 2.5 Gbit optical channels) configuration adopted in the data acquisition system of the KM3NeT-Italia underwater neutrino telescope. An assessment of the real-time characteristics and performance of the resulting systems will be provided and analyzed.
        Speaker: Michele Martinelli (INFN)
      • 15:00
        New LLRF control system at LNL 1h 30m
        The Low-level Radio Frequency (LLRF) control system for linear accelerator at Legnaro National Laboratories (LNL) of INFN is being upgraded by a new digital Radio Frequency (RF) controller. This controller is critical to keep phase, amplitude and frequency stability of the RF field in Quarter Wave Resonator (QWR) cavities of the linear accelerator. These cavities work in superconducting condition. The resonance frequency of low beta cavities is 80 MHz, while medium and high beta cavities resonate at 160 MHz. Each RF controller can control at the same time eight different cavities. The RF signals picked-up from the cavities are sampled by RF ADCs. The digitized signals are fed into a field programmable gate array (FPGA) which implements the control loop. The signals processed by the FPGA are in-phase/quadrature modulated and sent to power amplifiers and hence to the cavities. The main feature of the new control system is an all-digital control loop that originates from direct sampling of the antenna RF signal. In-phase and quadrature components are obtained by a suitable choice of the undersampling frequency, while control of the field and phase in the cavity is based on a digital Complex Phase Modulator (CPM). This paper presents the FPGA firmware, the acquisition techniques and the performances of the new RF controller.
        Speakers: Dr Davide Pedretti (INFN - LNL), Dr Stefano Pavinato (INFN - LNL)
      • 15:00
        NSTX-U RedHawk Linux Realtime Security Measures and Their Effect on Determinism 1h 30m
        The National Spherical Torus Experiment Upgrade (NSTX-U) at the Princeton Plasma Physics Laboratory (PPPL) successfully began its first year of operations. NSTX-U is a magnetic fusion device whose major mission is to develop the physics basis for an ST-based Fusion Nuclear Science Facility (FNSF). The ST-based FNSF has the promise of achieving the high neutron fluence needed for reactor component testing with relatively modest tritium consumption. At the same time, the unique operating regimes of NSTX-U can contribute to several important issues in the physics of burning plasmas to optimize the performance of ITER. NSTX-U uses multiple realtime RedHawk Linux systems based on RedHat Enterprise Linux 6 (RHEL) for both coil protection and plasma control. NSTX-U further uses standard RHEL6 systems for support services such as housing configuration data and non-realtime user interface applications. All of these systems perform critical roles in the success of the NSTX-U project, and it is becoming increasingly apparent that there is a growing risk with respect to protecting these assets from a security standpoint. Typically, realtime assets stay hidden behind external protective measures such as virtual LANs (VLANs) and internal firewalls. With the evolving requirements that organizations place on all computing assets, these previously sufficient external approaches are no longer enough to meet all of their goals. Unfortunately, local security policies tend to have an adverse effect on the deterministic nature of a realtime Linux system, and most policies involve coarse and inflexible settings. As part of an ongoing initiative to protect computing assets from both malicious and accidental threats, NSTX-U developed multiple approaches to blend tight controls with careful study of realtime effects. 
Included here will be coverage of how NSTX-U managed to balance the primary purpose of the Linux systems with additional security constraints, including using Security Enhanced Linux (SELinux), specific firewall settings, Linux “capabilities” (that is, specific superuser privileges that do not require superuser access), and numerous other security measures. In all cases where a security change negatively affected realtime performance, that change was either mitigated or reverted. What remains is a grouping of safe alternatives that show that both security and realtime determinism are both practical and useful.
        Speaker: Keith Erickson (Princeton University)
      • 15:00
        Online calibration of the TRB3 FPGA TDC with DABC software 1h 30m
        The *TRB3* - Trigger Readout Board - features 4 FPGA-based TDCs with a total of up to 264 channels and a time precision of 8 ps RMS **[1]**. It has been used in various beam tests and is going to serve as standard DAQ hardware for *FAIR* detectors, such as *HADES*, *PANDA*, and *CBM*. To achieve the best time precision, however, each TDC channel must be calibrated individually. First of all, fine counter calibration should be done by means of random test inputs, and it should be repeated if the calibration function changes (in most cases due to temperature change). Alternatively, the temperature dependency of each channel can be calculated in advance and compensated using the temperature information from the sensors around the FPGAs. Another compensation should be applied to the mean value deterioration caused by the temperature change. And finally, the stretcher latency (used for ToT measurements), which also depends on the temperature, should be measured in advance and compensated during the measurement. All these calibration tasks can be carried out already during data taking within the event building DAQ software *DABC*. Produced time values can either be stored with the original raw data or replace them. The calibration analysis code has been implemented with the C++ *stream* framework and can run as a plug-in for *DABC* as well as with *ROOT*-based analysis environments, like *HYDRA* or *Go4*. An HTTP server in the *DABC* process provides online monitoring and control of the TDC calibration from a standard web browser. **[1]** C. Ugur, S. Linev, J. Michel, T. Schweitzer, M. Traxler, *A novel approach for pulse width measurements with a high precision (8 ps RMS) TDC in an FPGA*, 2016 JINST 11 C01046
        Speaker: Joern Adamczewski-Musch (GSI)
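The fine counter calibration with random test inputs mentioned in the TRB3 abstract above is commonly done as a statistical code density test: hits uncorrelated with the clock populate each fine-counter bin in proportion to its width, so the cumulative histogram gives the time of each bin. The sketch below illustrates that general technique only; it is not the *stream* or *DABC* code, and the function name, the bin count argument and the 5000 ps coarse clock period are assumptions.

```python
import numpy as np

def build_fine_time_lut(fine_counts, n_bins, clock_period_ps=5000.0):
    """Build a lookup table: fine counter value -> time within the clock period.

    fine_counts     : fine counter values recorded for random (uncorrelated) hits
    n_bins          : number of possible fine counter values (assumed parameter)
    clock_period_ps : coarse clock period in ps (assumed value, not from the text)
    """
    counts = np.asarray(fine_counts, dtype=int)
    hist = np.bincount(counts, minlength=n_bins).astype(float)
    total = hist.sum()
    cumulative = np.cumsum(hist)
    # Bin width is proportional to its occupancy; use the bin centre as its time.
    return (cumulative - 0.5 * hist) / total * clock_period_ps
```

A new table would be accumulated whenever the calibration function drifts, for example after a temperature change, as the abstract describes.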
      • 15:00
        Particle identification on an FPGA accelerated compute platform for the LHCb Upgrade. 1h 30m
        The current LHCb readout system will be upgraded in 2018 to a 'triggerless' readout of the entire detector at the LHC collision rate of 40 MHz. The corresponding bandwidth from the detector down to the foreseen dedicated computing farm (event filter farm), which acts as the trigger, has to be increased by a factor of almost 100 from currently 500 GBit/s up to 40 TBit/s. The event filter farm will pre-analyse the data and will select the events on an event-by-event basis. This will reduce the bandwidth down to a manageable size to write the interesting physics data to tape. The design of such a system is a challenging task, which is why different technologies are being considered and investigated for the different parts of the system. For use in the event building or in the event filter farm (trigger), an experimental FPGA-accelerated computing platform is being considered and tested. FPGA compute accelerators are increasingly used in standard servers, for example for Microsoft Bing search or Baidu search. The platform we use hosts a general-purpose CPU and a high-performance FPGA, on which an accelerator is implemented, linked via a high-speed link. The system is a two-socket platform from Intel with a Xeon CPU and an FPGA. The CPU and the FPGA are connected via the point-to-point interconnect QPI, which is used to interconnect CPUs in industry-standard servers. The FPGA has cache-coherent access to the main memory of the server and can collaborate with the CPU. These cache-coherent architectures are better suited for real-time connections between FPGA and CPU than the usual PCIe FPGA accelerators. It is very likely that these platforms, which are built in general for high performance computing, are also very interesting for the High Energy Physics community. As a first step, we test porting the existing LHCb RICH particle identification to the experimental FPGA-accelerated platform. 
We will compare the performance of the LHCb RICH particle identification running on a normal CPU with the performance of the same algorithm running on the Xeon-FPGA compute accelerator platform. Furthermore, performance results of smaller initial test cases, such as sorting, are presented. This work is done in collaboration with Intel Corporation.
        Speaker: Christian Faerber (CERN)
      • 15:00
        Performance evaluation of mTCA.4 High speed ADC card for direct sampling of RF signals in linear accelerator systems 1h 30m
        Nowadays monitoring and control systems for linear accelerators require very complex high-precision RF detection and measurements systems that incorporate receivers with multichannel down-converters and low noise LO generation systems. Increasing requirements for speed, bandwidth and latency while maintaining precision reveal limitations of classical RF receivers. Modern advanced technology made it possible to design data acquisition modules allowing direct sampling of high frequency accelerator signals with sufficient resolution without the need for down-converters. This paper describes the measurements and applications of an eight-channel mTCA.4 card developed for direct sampling RF signals above 1.3 GHz for linear accelerators and High Energy Physics Experiments. The board is equipped with eight 800 MSPS, 12-bit ADC channels each with an input bandwidth up to 2.7 GHz. The boards were tested in a laboratory environment as well as at the FLASH accelerator at DESY, Hamburg and the ELBE accelerator at HZDR, Dresden and revealed very good results. The paper shows results of the measured sampling parameters, noise, latency as well as results of non-IQ sampling schemes for acquiring the amplitude and phase of the detected RF cavity-field signals determining the precision of the analysis for LLRF and monitoring systems. Achieved results satisfy precision requirements for machines like The European XFEL and ILC accelerators.
        Speaker: Krzysztof Czuba (Warsaw University of Technology)
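To illustrate the direct-sampling idea in the ADC card abstract above: a 1.3 GHz signal sampled at 800 MSPS is undersampled and appears at an alias frequency, from which amplitude and phase can be recovered with a non-IQ scheme (projection onto sine and cosine over several samples). The sketch below is an assumption-laden illustration, not the card's firmware or the authors' algorithm; function names are hypothetical.

```python
import numpy as np

def alias_frequency(f_signal, f_sample):
    """Frequency at which a tone appears after sampling (folded below f_sample/2)."""
    f = f_signal % f_sample
    return min(f, f_sample - f)

def demodulate(samples, f_alias, f_sample):
    """Amplitude and phase of the aliased tone from real-valued samples.

    Exact when the record spans an integer number of alias periods.
    In even Nyquist zones the alias is spectrally inverted, so the recovered
    phase has the opposite sign to the phase of the original RF signal.
    """
    x = np.asarray(samples, dtype=float)
    theta = 2.0 * np.pi * f_alias / f_sample * np.arange(x.size)
    i = 2.0 / x.size * np.sum(x * np.cos(theta))
    q = -2.0 / x.size * np.sum(x * np.sin(theta))
    return np.hypot(i, q), np.arctan2(q, i)
```

With these numbers the alias falls at 300 MHz, so 8 samples cover exactly 3 alias periods and any record length that is a multiple of 8 satisfies the integer-period condition.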
      • 15:00
        Production and Testing of the LO and CLK Generation Module Built in MicroTCA.4 Form Factor 1h 30m
        The local oscillator and clock generation module generates a low noise local oscillator out of the global reference that is distributed over the accelerator. The module is implemented such that it fits into the rear slots 15 and 14 of a standard MicroTCA.4 crate. In the contribution we present the manufacturing and testing process of 60 units that are being deployed in the European XFEL. Comparison between modules is performed based on the measured parameters.
        Speaker: Tony Rohlev (Sincrotrone Trieste)
      • 15:00
        Readout electronics and data acquisition for gaseous tracking detectors 1h 30m
        A complete solution for collecting and processing data from gaseous tracking detectors has been developed. The readout chain consists of front-end modules (FEE) equipped with PASTTREC ASIC chips and Trigger Readout Board v3 (TRBv3) as readout platform, together with control and monitoring mechanisms and data quality assessment software. PASTTREC chip is an 8-channel, fast amplifier and discriminator with Time-Over-Threshold (TOT). Highly configurable settings like gain (in a range between 1.8 and 10.5 mV/fC +- 25%), tail-cancellation, peaking time (10, 15, 20, 35 ns), individual baseline levels and common threshold allows for applications to various gaseous detector systems. Equivalent Noise Charge (ENC) remains in the range between 1000 and 1400 electrons even for the highest gain setting. The developed front-end modules have two PASTTREC chips installed and LVDS connection (slow-control and data channels) to the TRBv3 digitizing boards via dedicated adapters. Trigger Readout Board v3 is an advanced platform for universal, configurable and scalable readout systems. Module consists of 5 FPGA devices, from which, one is the controller and four can be configured with various firmware as Time-to-Digital Converters (TDC), data concentrators or any other data processing units via dedicated mezzanine extension modules. Multiple TRBv3 modules can be interconnected in master-slave mode assuring high scalability with the use of optical fibers and HUB extension modules. Communication between modules is realized by custom TrbNet protocol, developed for this platform. It is characterized by three logical channels: trigger, readout data and slow control messages exchange. The logical trigger channel has a deterministic latency in message distribution. Measurement data exits the system via Gigabit Ethernet links under a form of UDP packets, sent through standard networks to PCs, therefore the solution is adaptable for various DAQ systems. 
The PASTTREC as well as TRBv3 configuration is performed as register read/write messages exchange between PC and the master TRBv3 module, with a user-friendly, WEB based interface. Collected measurement data can be analyzed online by the Go4 framework or developed ROOT-based macros for in-depth data quality assessment, including track finder and visualization. The entire system has been evaluated in the laboratory as well as in-beam experiments. The results show drift-time measurement as well as TOT precision of 1 ns and a high counting rate performance, reaching up to 1MHz per channel. Measured PASTTREC operation characteristics as well as TRBv3 platform used for readout allow to adapt and integrate the system under discussion to the existing HADES spectrometer and PANDA detector, an experiment under construction, both located at FAIR facility in Darmstadt.
        Speaker: Dr Grzegorz Korcyl (Jagiellonian University)
      • 15:00
        Readout electronics for Belle II imaging Time of Propagation detector 1h 30m
        Belle II experiment at SuperKEKB collider opens a new era in beauty physics. To satisfy demands of Belle II improved particle identification, a novel 8192-channel imaging Time of Propagation (iTOP) detector is being built. In iTOP passage of hadrons through quartz panels generates Cerenkov light, which, after multiple reflections, gets collected by 16-channel microchannel plate photomultipliers (MCP-PMTs). Every photomultiplier anode wire is inserted in an individual socket of a so-called front board, which is parallel to the MCP-PMT back surface. The signals are routed to the pads mounted on the back plane of the front board. Two MCP-PMTs are served by one front board; thus 2x2 two MCP-PMT arrays with two connected front boards collect signals from 128 iTOP channels. At the heart of the iTOP readout system there is a custom designed Application Specific Integrated Circuit (ASIC) with a primary function to sample the amplified waveforms collected from the anodes. Sampling is done by switched-capacitor arrays that perform Wilkinson 12-bit analog-to-digital conversion, with one ADC bit corresponding to about 2 mV. Every ASIC digitizes 8 channels. Four ASICs are hosted by a so-called ASIC carrier board. The ASIC carrier board thus reads out 32 photomultiplier channels. Pogo pin assemblies are mounted at the edge of every ASIC carrier board. The pogo pins (one for each anode) are pressed against the pads of the front board; this way the photomultiplier waveforms get broadcasted to the ASIC carrier board. Then the input waveforms are amplified and later digitized. The digitization in four ASICs of every carrier board is controlled by Zynq XC7Z030 FPGA. Four ASIC carrier boards are interconnected, and one of the ASIC carrier boards is connected to a Standard Control Read-Out Data (SCROD) board which collects the data from four carrier boards. 
The main component of the SCROD board is Zynq XC7Z045 FPGA which controls the data collection and transfer, as well as triggering and clock distribution. A set of interconnected four ASIC carrier boards and one SCROD board, “a board stack”, represents 128-channel standalone front-end readout system. One iTOP module is read out by 16 MCP-PMTs, thus four board stacks are attached to one module and serve 512 iTOP channels. One high voltage divider is attached to every board stack and serves 8 MCP-PMTs. In total, 332 ASIC carrier boards and 84 SCROD boards were fabricated, tested and integrated in the board stacks (the quantities include spares). Performance of the individual ASIC carriers and of the board stacks was evaluated by a variety of measurements. Particularly, the time resolution of the ASIC channels from measuring 20 ns time difference between two 1.5 V analog 7 ns pulses was found to be about 30 ps. Consequently, the time resolution of the ASIC channels coupled with the MCP-PMTs at a laser test bench was found to be about 70 ps. Integration of the board stacks in the iTOP modules is underway.
        Speaker: Dr Dmitri Kotchetkov (University of Hawaii at Manoa)
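The channel hierarchy in the iTOP readout abstract above can be checked with a few lines of arithmetic. All per-board figures are taken from the abstract; the derived totals (number of modules, board stacks and installed boards) are computed here and are not stated in the text.

```python
# Figures quoted in the abstract.
CHANNELS_PER_ASIC = 8
ASICS_PER_CARRIER = 4
CARRIERS_PER_STACK = 4
STACKS_PER_MODULE = 4
TOTAL_CHANNELS = 8192
CARRIERS_FABRICATED = 332  # includes spares
SCRODS_FABRICATED = 84     # includes spares

# Derived quantities.
channels_per_carrier = CHANNELS_PER_ASIC * ASICS_PER_CARRIER
channels_per_stack = channels_per_carrier * CARRIERS_PER_STACK
channels_per_module = channels_per_stack * STACKS_PER_MODULE
modules = TOTAL_CHANNELS // channels_per_module
stacks = modules * STACKS_PER_MODULE           # one SCROD board per stack
carriers_installed = stacks * CARRIERS_PER_STACK
spare_carriers = CARRIERS_FABRICATED - carriers_installed
spare_scrods = SCRODS_FABRICATED - stacks

print(channels_per_carrier, channels_per_stack, channels_per_module)
print(modules, stacks, carriers_installed, spare_carriers, spare_scrods)
```

The per-carrier, per-stack and per-module channel counts reproduce the 32, 128 and 512 quoted in the abstract.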
      • 15:00
        Real Time Adaptive Treatment Planning for Proton Therapy Radiation Patients 1h 30m
        Background: In radiation treatments for cancer patients, there is a type of treatment called proton therapy. For these treatments, the first step is to simulate the patient's treatment by acquiring computed tomography (CT) scans of the tumor in the position of treatment and converting those images into material and density maps for creating the best plan of treatment. All anatomical structures, including the tumor, are delineated in the 3D image. Based on the tumor type and normal structures near the tumor, the treatment plan is created. This treatment plan consists of, on average, 3 beam portals, each having approximately 25,000 proton beam spots. Each spot can have an energy between 70 and 230 MeV, any position between 40x40 cm^2 area, and a weight of 2x10^6 to 5x10^9 protons per spot. With this many degrees of freedom and the complexity of the human anatomy, it takes a substantial amount of computation time and human input. On average, the process of creating this plan requires 5 days. This treatment plan is used to treat the patient every day for 30 days. During this time, changes in the patient can take place that make the treatment plan suboptimal. Purpose: We are developing a system to create, in near real time, a treatment plan of the day which can account for the variation in the patient and properly adapt the plan to give the optimal treatment for that day. Methods: We have developed a GPU accelerated Monte Carlo simulation for the radiation dose calculation. This is key, as fast and accurate dose calculation is of great importance. We have also developed a fast, GPU accelerated optimization system to develop the treatment plan. We are currently developing a fast deformable registration system to adapt to the anatomical structural changes without human intervention. Results: Currently we can optimize and calculate a plan in less than 20 minutes. 
Although fast, this calculation time can be improved, and the deformable registration needs to be included for the daily adaptive therapy to be realized for optimal treatment. Conclusion: We are close to having a prototype system for daily adaptive proton radiation therapy. This system has the potential to impact the outcomes of cancer treatments.
        Speaker: Chris Beltran (Mayo Clinic)
      • 15:00
        Real-time plasma electron density feedback control system based on FPGA on J-TEXT 1h 30m
        The newly deployed J-TEXT three-wave polarimeter-interferometer system provides better time and spatial resolution of the plasma electron density than the old HCN interferometer system. The plasma electron density feedback control system is implemented on the existing polarimeter-interferometer DAQ system, which is based on FlexRIO FPGA. This DAQ system is able to acquire 16 channels of intermediate frequency signal from the polarimeter-interferometer diagnostic at a rate of 120 MS/s. Another FlexRIO board with an output module is added to implement the feedback control algorithm and feed the output to the piezoelectric crystal valve. The density feedback control system is able to extract the phase shift information from the intermediate frequency signal using FFT, calculate the density of multiple channels and output the control signal to the piezoelectric crystal valve in real time. NI P2P technology is used to transfer processed data from one FlexRIO board to another in real time without using the CPU. This assures the required deterministic performance. The control system is fully implemented on FlexRIO FPGA and does not affect the original function of the polarimeter-interferometer DAQ system. This system is also able to calculate the density profile for a future plasma control system. Keywords: polarimeter-interferometer, density feedback control, phase shift detection, FlexRIO, LabVIEW FPGA, fusion, J-TEXT tokamak
        Speaker: Dr Wei Zheng (Huazhong University of Science and Technology)
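The FFT-based phase shift extraction mentioned in the J-TEXT density abstract above can be sketched in software as follows. This is an illustration of the general technique, not the LabVIEW FPGA implementation: the function names, the Hann window and the intermediate frequency used in any call are assumptions (the abstract gives only the 120 MS/s sampling rate).

```python
import numpy as np

def if_phase(signal, f_if, f_sample):
    """Phase (rad) of the spectral bin nearest to the intermediate frequency."""
    x = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(x * np.hanning(x.size))  # window limits leakage
    k = int(round(f_if * x.size / f_sample))
    return np.angle(spectrum[k])

def phase_shift(probe, reference, f_if, f_sample):
    """Phase of the probe channel relative to the reference, wrapped to [-pi, pi)."""
    d = if_phase(probe, f_if, f_sample) - if_phase(reference, f_if, f_sample)
    return (d + np.pi) % (2.0 * np.pi) - np.pi
```

Because both channels pass through the same window and the same bin, the window's own phase contribution cancels in the difference.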
      • 15:00
        Real-time resonant magnetic perturbations feedback control system for tearing mode suppression on J-TEXT 1h 30m
        Tearing Modes (TMs) degrade the performance of tokamak plasmas and can even lead to disruption. Using externally applied resonant magnetic perturbations (RMP) to suppress tearing modes is a promising and effective approach. In order to suppress the 2/1 tearing mode, a rotating 2/1 RMP is applied in a given phase region to stabilize the magnetic island and accelerate island rotation. The RMP feedback control system acquires 15 channels of poloidal Mirnov signals, processes the acquired data and calculates the phase in real time; it then outputs the RMP power supply control signal, obtained by comparison with the given phase, to drive the RMP coil. The feedback control system is based on NI C-RIO and is developed mainly in LabVIEW. The typical 2/1 mode magnetic island on J-TEXT rotates at a frequency from 2 kHz to 10 kHz. To ensure a control precision within 2 degrees, the control period must be within 500 ns. Because the acquired signals are noisy, the feedback control system uses a series of error correction methods in real time to obtain an accurate phase. The feedback control system also needs to control the output waveform duty cycle to protect the pulse power supply. The system has been set up on the J-TEXT tokamak and has given good results. Keywords: Tearing mode suppression, RMP, feedback control, C-RIO, LabVIEW FPGA, fusion, J-TEXT tokamak
        Speaker: Mr Feiran Hu (Huazhong University of Science and Technology)
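The 500 ns control period quoted in the RMP abstract above follows from the rotation frequency and the phase tolerance. A short check, using only the numbers in the abstract (the function name is illustrative):

```python
def max_control_period(rotation_hz, phase_tolerance_deg):
    """Longest update period (s) during which the island phase advances
    by no more than the given tolerance."""
    return phase_tolerance_deg / 360.0 / rotation_hz

fastest = max_control_period(10e3, 2.0)  # island rotating at 10 kHz
slowest = max_control_period(2e3, 2.0)   # island rotating at 2 kHz
print(f"{fastest * 1e9:.0f} ns at 10 kHz, {slowest * 1e6:.2f} us at 2 kHz")
```

At 10 kHz the island sweeps 2 degrees in roughly 556 ns, so a 500 ns period keeps the phase error within tolerance over the whole 2 kHz to 10 kHz range.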
      • 15:00
        Real-Time Tomographic Reconstructions in MARTe with GPU Computation 1h 30m
        Future fusion devices will depend on high data throughput, for which present CPU capabilities are reaching their limits, and GPUs (Graphical Processing Units) are appearing as very promising candidates. Integrating GPU capabilities with the current real-time software frameworks is a challenge that needs addressing. Tomography is a diagnostic which produces high data volumes, while also being one of the more reliable diagnostics for a poloidal profile of the plasma density of tokamaks. The ISTTOK tokamak is the only case worldwide where tomographic reconstruction is implemented in real time and integrated with the control system, which is built in the MARTe framework, and its current hardware configuration is being upgraded to obtain better and more reliable data. The former algorithm and geometry were only able to provide low spatial resolution images in order to meet the 100$\mu s$ cycle time constraint, but with the increase of resolution it became necessary to develop a more advanced solution. At the same time, this solution also had to be integrated with the MARTe software framework in a way that maintained its functionality and modularity. In order to provide high definition images, GPU code was introduced into the computation line. However, the code compatibility problem had to be solved, i.e. GPU directives needed to be functional inside the MARTe GAMs (Generic Application Modules). This was achieved by linking precompiled GPU code to the GAM. GPU reconstruction proved to be very efficient in terms of latency and in obtaining high resolution images. Within the time constraint of 100$\mu s$, reconstructed images up to a resolution of 1600x1600 can be obtained with the new code, versus the 15x15 reconstructions of the old code. 
This work serves as proof: (i) that GPU computation is viable in real-time applications in fusion science; (ii) that GPU computation can significantly improve the quality of image-based diagnostics; and (iii) that the MARTe framework can improve its functionality with the integration of GPU capabilities.
        Speaker: Mr Tautvydas Maceina (Consorzio RFX)
      • 15:00
        Realising real-time capabilities of an embedded control system for fast-neutron scintillation detectors 1h 30m
        Scintillation detectors offer a single-step detection method for fast neutrons and necessitate real-time acquisition, whereas this is redundant in two-stage thermal detection systems using helium-3 and lithium-6. The affordability of scintillation detectors and the associated fast digital acquisition systems have enabled entirely new measurement setups that can consist of sizeable detector arrays. These detectors in most cases rely on photo-multiplier tubes which have significant tolerances and result in variations in detector response functions. The detector tolerances and other environmental instabilities must be accounted for in measurements that depend on matched detector performance. This paper presents recent advances made to a high speed FPGA-based digitiser technology developed by Aston University (UK), Hybrid Instruments Ltd (UK) and Lancaster University (UK), with support from the European Joint Research Centre (Ispra) and the International Atomic Energy Agency (Vienna). The technology described offers a complete solution for fast-neutron scintillation detectors by integrating multichannel high-speed data acquisition technology with dedicated detector high-voltage supplies. This unique configuration has significant advantages for large detector arrays that require uniform detector responses. We report on bespoke control software and firmware techniques that exploit real-time functionality to reduce setup and acquisition time, increase repeatability and reduce statistical uncertainties.
        Speaker: Vytautas Astromskas (Lancaster University)
      • 15:00
        Signal Processing Scheme for a Low Cost LiF:ZnS(Ag) Neutron Detector with Silicon Photomultiplier 1h 30m
        The NIST Center for Neutron Research is finalizing the design of a novel scintillating neutron detector for its CANDOR neutron scattering instrument. The detectors in the chromatic analyzer must be extremely thin (~1.5 mm) and highly efficient (~90% sensitivity for 3.3 meV neutrons). To that end, the detectors consist of 6LiF:ZnS(Ag) plastic scintillator in which wavelength shifting (WLS) fibers have been embedded. Scintillation light collected in the WLS fibers is read out using a silicon photomultiplier (SiPM). The signal from the SiPM is digitized and processed by a field programmable gate array using a pulse shape discrimination algorithm. Discriminating neutron capture events from other phenomena presents a number of challenges for both raw sensitivity and count rate. We describe our efforts to cope with these issues. At the present time the detectors exhibit a neutron sensitivity of ~90% for 3.3 meV neutrons with a gamma rejection ratio of ~10E-7 at count rates exceeding 10000 counts per second.
        Speaker: Mr Kevin Pritchard (NCNR)
      • 15:00
        Software Integrity Analysis Applied to IRIO EPICS Device Support Based On FPGA Real-Time DAQ Systems 1h 30m
        Nuclear fusion environments require dependability and safety analysis to ensure a reliable design and deterministic behaviour. Failure mode identification, risk assessment and mitigation guarantee that quality control procedures at different architectural levels comply with all the well-defined prerequisites at all the commissioning stages. Therefore, exhaustive analysis based on Reliability, Availability, Maintainability and Safety (RAMS) must be an unavoidable activity in this kind of undertaking. The results of this analysis impact the hardware and software development: they invalidate inadequate software architectures and hardware components, and force a given development assurance level depending on criticality (and thus cost). This paper applies RAM analysis methodology to an advanced Data Acquisition System (DAQ) based on FPGA, using standards and techniques commonly used for critical system developments. The proposed DAQ system interfaces with signals coming from different sensors, acquiring data at high sampling rates (up to hundreds of MS/s), and in some cases performing real-time pre-processing. In turn, it must provide acquired data to the control system, where control loopback will be applied. This implies that the DAQ system shall guarantee integrity, continuity, availability and accuracy, providing the necessary integrity level. This paper presents the analysis for the IRIO software tools as part of an EPICS IOC running under a hardware architecture compliant with the ITER catalogue for fast controllers. 
This analysis focuses on: RAM analysis to ensure technical risk control and mitigation; criticality analysis and assessment of mixed-critical systems for failure propagation mitigation through segregation strategies, such as virtualization techniques using hypervisors; the IRIO Software Integrity Level (SWIL) analysis, according to nuclear critical system requirements; and a software verification methodology based on source code static analysis to reduce errors present in the final product. This analysis will provide reliable methodologies to be considered in future software implementations for minimizing costs and risks in this kind of nuclear environment.
        Speaker: Dr Diego Sanz (GMV)
      • 15:00
        Software tests and simulations for realtime applications based on virtual time 1h 30m
        Unit and integration tests are powerful tools to ensure software quality. Writing such tests for realtime applications that access hardware requires not only replacing the real hardware with a virtual implementation in software; time must also be controlled precisely. For a number of reasons the time scale in the simulated environment should not be identical to real time: computations needed for a complex plant model might just be too slow for a real time simulation, or some long-term software behaviour should be tested in a short-running test. Communications with devices often require a specific timing which should be the subject of a unit test. These examples demand using a virtual time scale in software tests. We present the VirtualLab framework as part of the MTCA4U tool kit. It has been designed to help implement such tests by introducing the concept of virtual time and combining it with an implementation basis for virtual devices and plant models. The framework is designed modularly so that virtual devices and model components can be reused to test different parts of the control system software.
        Speakers: Geogin Varghese (DESY, Hamburg), Nadeem Shehzad
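The idea of virtual time described in the abstract above can be illustrated with a minimal sketch. It is not the VirtualLab or MTCA4U API: the class names `VirtualClock` and `VirtualDevice`, their methods, and the 2 s device delay are all hypothetical and chosen only for illustration. The test advances time explicitly, so a "long" wait costs no wall-clock time.

```python
import heapq


class VirtualClock:
    """Hypothetical virtual clock: time only moves when the test says so."""

    def __init__(self):
        self.now = 0.0
        self._timers = []  # heap of (due_time, sequence_number, callback)
        self._seq = 0      # tie-breaker so callbacks are never compared

    def call_at(self, due_time, callback):
        heapq.heappush(self._timers, (due_time, self._seq, callback))
        self._seq += 1

    def advance_to(self, target):
        # Fire every timer that is due at or before the target time, in order.
        while self._timers and self._timers[0][0] <= target:
            due, _, callback = heapq.heappop(self._timers)
            self.now = due
            callback()
        self.now = target


class VirtualDevice:
    """Hypothetical device that becomes ready 2 s (virtual) after a trigger."""

    def __init__(self, clock):
        self.clock = clock
        self.ready = False

    def trigger(self):
        self.clock.call_at(self.clock.now + 2.0, self._finish)

    def _finish(self):
        self.ready = True


clock = VirtualClock()
device = VirtualDevice(clock)
device.trigger()
clock.advance_to(1.9)
print(device.ready)  # still waiting in virtual time
clock.advance_to(2.0)
print(device.ready)  # timer has fired
```

Because the clock is an explicit object, the same device model can be reused in tests that need very slow or very fast time scales.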
      • 15:00
        Superconducting cavities cryo-module control challenges and LLRF system adaptation in case of long pulse operation mode. 1h 30m
        Superconducting niobium resonators are nowadays used in various high energy physics facilities as particle accelerating structures. Projects whose systems regulate the field of a single cavity are operated in continuous wave (CW) or pulsed RF mode. For some free electron lasers based on TESLA technology cavities (such as the Free Electron Laser in Hamburg, FLASH, or the European X-ray Free Electron Laser, X-FEL), the control systems for the accelerating field parameters have been designed to regulate the common supply power for multiple cavities. Originally these experiments were designed to operate with short (around 1.5 ms) RF pulses. Further technology developments make it possible to extend the operation of these facilities to CW mode or long pulse mode. This will effectively lengthen the time slot available for beam acceleration and, as a result, increase overall machine availability for high energy physics experiments. The paper presents our approach to adapting the low level radio frequency (LLRF) control system from short pulse to long pulse (up to 500 ms) operation. These activities have been performed at DESY, where the Cryo-module test bench (CMTB) infrastructure has been used. Within the scope of this paper, the difference between superconducting cavity behavior in short pulse and semi-CW operation is discussed. Additional efforts in the development of high level automation or frequency control and LLRF are also presented.
        Speaker: Dr Cichalewski Wojciech (LUT-DMCS)
      • 15:00
        TaskRouter: A newly designed online data processing framework 1h 30m
        TaskRouter is a runtime software and framework for distributed computing. It can be used to facilitate the development of online processing systems for High Energy Physics experiments. The framework takes responsibility for data transmission. Users determine how data is processed and routed on each node by implementing a single callback interface. One or more backup slaves can be configured for critical nodes in a TaskRouter system, and single-point failures such as a sudden node power-off can be recovered from automatically without data loss. TaskRouter is flexible, easy to use and highly available. This paper presents the core design of the framework and some performance test results with a dummy online processing procedure.
        Speaker: Mr Minhao Gu (IHEP)
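The "single callback" model described in the abstract above could look roughly like the following sketch. This is not the TaskRouter API: `run_node`, `example_callback`, the destination names and the event fields are hypothetical. The assumption is that a callback both processes an item and names the node it should be routed to, while the framework side handles delivery.

```python
from collections import defaultdict


def run_node(callback, incoming):
    """Framework side (hypothetical): apply the user callback to each item
    and collect the results per destination node."""
    outboxes = defaultdict(list)
    for item in incoming:
        result = callback(item)
        if result is None:  # the callback chose to drop this item
            continue
        destination, payload = result
        outboxes[destination].append(payload)
    return dict(outboxes)


def example_callback(event):
    """User side (hypothetical): process one event and decide where it goes."""
    if event["n_hits"] == 0:
        return None
    destination = "storage" if event["n_hits"] >= 10 else "monitor"
    return destination, {**event, "processed": True}


events = [{"id": 1, "n_hits": 0}, {"id": 2, "n_hits": 12}, {"id": 3, "n_hits": 3}]
print(run_node(example_callback, events))
```

Keeping transport out of the callback is what lets the user code stay a single function while the framework deals with transmission and failover.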
      • 15:00
        TAWARA_RTM: A complete platform for a real time monitoring of contamination events of drinking water 1h 30m
        The security of drinking water is increasingly being recognized as a major challenge for municipalities and water utilities. In the event of a contamination, water spreads rapidly before the problem is detected. Consumption of contaminated water can induce major epidemics, disrupt economic life and create mass panic. Significant drinking water contamination events pose a serious threat to public and environmental health. Today’s laboratory-based contaminant testing systems, coupled with the current practice of using contingency plans, are impractical for daily monitoring. They operate too slowly for incident control and prevention, since the full extent of the event can rarely be determined in time for efficient mitigation measures. A complete platform to control the quality of tap water with respect to its radioactivity content will be presented. The platform is developed within the EU-funded project TAp WAter RAdioactivity Real Time Monitor (TAWARA_RTM). The TAWARA_RTM platform will provide a real time measurement of the activity in the water (measuring the gross alpha and beta activity) to verify whether the distributed water is far from the limits set by EU legislation or is reaching thresholds that require rapid action. The TAWARA_RTM platform will offer a system for real time on-site monitoring and it will be a three-device inspection system: • an early warning device to monitor a significant change in the radioactive content of the water; • a fast alarm device for crossing thresholds that require rapid action on the tap water distribution system; • a spectroscopic investigation to determine the type of contamination and decide the appropriate and effective countermeasures (the determination of the contaminants is needed to establish the effects on the population and produce a full information report for the Civil Security Authorities). The early warning is achieved by the Early Alarm Detector (EAD), which is built using a large volume NaI:Tl. 
If a substantial amount of gamma-ray emitters appears in the water, the EAD generates an alarm signal to shut off the water flux to further stages of the water treatment facility. The fast alarm device is the Real Time Monitor (RTM), a detection system for gross alpha and beta radioactivity, which continuously monitors the water quality. The water flows through the RTM device, where a potential alpha or beta emitter will induce scintillation light in the detector foils. If the RTM or the EAD detector count rate exceeds the background threshold level, an alarm flag is set and the spectroscopy investigation step, the SPEC, will start, with the aim of identifying the radioisotopes using gamma-ray spectroscopy. The SPEC detector comprises a high purity CeBr3 scintillator shielded by an active anti-Compton shield. In order to reduce the measurement time, a concentrator based on selective ion-exchange resins is placed close to the detector front face. Moreover, a dedicated ICT infrastructure has been foreseen to operate the system and manage the alarms that may occur during operation. The integrated system tests will be carried out in spring 2016, at Warsaw Waterworks.
        Speaker: Dr Sandra Moretto (Universita di Padova, Dipartimento di Fisica e Astronomia, Via Marzolo 8, 35131 Padova, Italy)
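The alarm chain described in the abstract above (EAD or RTM count rate above its background threshold sets an alarm flag and starts the SPEC step) can be written as a small decision function. This is only a sketch of the stated logic: the function name, the returned fields and the numerical rates and thresholds are hypothetical and do not come from the project.

```python
def evaluate_rates(ead_rate, rtm_rate, ead_threshold, rtm_threshold):
    """Return the alarm flags for one monitoring cycle.

    Rates and thresholds are count rates in the same (arbitrary) unit;
    the values used below are placeholders.
    """
    ead_alarm = ead_rate > ead_threshold
    rtm_alarm = rtm_rate > rtm_threshold
    return {
        "ead_alarm": ead_alarm,
        "rtm_alarm": rtm_alarm,
        # Either alarm starts the spectroscopic identification step (SPEC).
        "start_spec": ead_alarm or rtm_alarm,
    }


print(evaluate_rates(ead_rate=120.0, rtm_rate=3.0, ead_threshold=150.0, rtm_threshold=5.0))
print(evaluate_rates(ead_rate=180.0, rtm_rate=3.0, ead_threshold=150.0, rtm_threshold=5.0))
```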
      • 15:00
        The Associative Memory System Infrastructure of the ATLAS Fast Tracker 1h 30m
        The Associative Memory (AM) system of the Fast Tracker (FTK) processor has been designed to perform pattern matching using the hit information of the ATLAS experiment silicon tracker. The AM is the heart of FTK and is mainly based on the use of ASICs (AM chips) designed on purpose to execute pattern matching with a high degree of parallelism. It finds track candidates at low resolution that are seeds for a full resolution track fitting. The AM system implementation is based on a collection of boards, named “Serial Link Processor” (AMBSLP), since it is based on a network of 900 2 Gb/s serial links to sustain huge data traffic. The AMBSLP has high power consumption (~250 W) and the AM system needs custom power and cooling. This presentation reports on the integration of the AMBSLP inside FTK, the infrastructure needed to run and cool the system which foresees many AMBSLPs in the same crate, the performance of the produced prototypes tested in the global FTK integration, an important milestone to be satisfied before the FTK production.
        Speaker: Ioannis Maznas (Aristotle Univ. of Thessaloniki (GR))
      • 15:00
        The Coil Control Module of a Feedback System of KTX in China 1h 30m
        The objective of this paper is to introduce the coil control module of a feedback system designed for the RFP (reversed field pinch) device KTX (Keda Torus for eXperiment), which is under construction at the University of Science and Technology of China. The module receives 16 channels of data streams captured by a sampling module and uses a high-speed DAC to give feedback to the RFP system, which can change the voltage between the unique double-C structures to improve the toroidal field. The coil control module is composed of an RS-485 half-duplex transceiver, an FPGA (Field Programmable Gate Array), a network interface and the high-speed DAC part that sends the feedback data. The RS-485 and network interfaces provide two different ways to gather data from the upper module, which gives more options for different situations. The FPGA is the core of the module and controls the whole process, and the chosen DAC has high speed and sufficient accuracy. With the whole feedback system, the radial magnetic field around the vertical gap can be reduced to achieve the goal.
        Speaker: Mr Tianbo Xu (USTC)
      • 15:00
        The Gas Injection Control and Diagnostic System for the ESTHER shock Tube 1h 30m
        The European Shock-Tube for High Enthalpy Research (ESTHER) is a combustion-driven shock tube that is now being installed at the IST/CTN campus, where experimental research on plasma radiation of high-speed (>10 km/s) shocked flows will be carried out to simulate the high pressure and temperature conditions of spacecraft re-entry in different atmospheric conditions. The shock wave will be driven by the deflagration of a stoichiometric H2, O2 and He gas mixture with up to 100 bar filling pressure inside a 50 litre combustion chamber. The ignited mixture raises the pressure up to ~600 bar, which breaks a disposable diaphragm at the end of the combustion chamber, creating the resulting wave front. An industrial partner, Air Liquide, installed the gas filling hardware for the combustion chamber, including 15 pressure transducers, 22 controlled valves and 3 mass-flow controllers, but the respective control system was developed entirely by the IPFN team using the open software EPICS and CS-Studio SCADA environment, embedded Linux computers and standard industrial automation programmable logic controllers (S7-1200 family PLC), in a configuration similar to the ITER CODAC I&C architecture and software technologies for slow control. The control system is responsible for handling the gas purge and injection, the preparation for ignition, and the exhaust of burned or unburned mixtures, assuring safe, reliable and reproducible shock tube operation. The system includes an archiving and browsing system for the most important pressure, flow, filling volume and temperature parameters and also embraces the connections to the independent security gas (H2, O2) alarm system, laboratory door-locking and audible warnings. A number of CS-Studio/BOY graphical user interface (GUI) panels were created both for mimic panels and for the gas system operation. 
Finally a fast acquisition system (up to 125 MSPS) is able to acquire and synchronize the signals from a fast piezoelectric pressure sensor inserted in the camera and from an inductive sensor measuring the current flowing on the copper-nickel ignition wire. This system allowed a successful operation of ESTHER during the preparation phase, completing already more than 80 deflagration pulses using a reduced volume 3 litre combustion chamber (“bombe”) with filling pressures close to final ESTHER specifications. In addition we present a proposal for a fast triggering system for the ESTHER spectroscopy diagnostic using FPGA fast data processing.
        Speaker: Bernardo Carvalho (IPFN-IST)
      • 15:00
        The Network Monitoring System based on Cacti for EAST 1h 30m
        For the smooth running of EAST (Experimental Advanced Superconducting Tokamak), a network management system that guarantees a robust network is important. In the present complex network infrastructure, it is a daunting task to manage all the devices in a network manually and make sure they are not only up and running but also performing optimally. Therefore, a web-based software system has been developed to monitor the EAST network in real time. Written in PHP and based on Cacti, the system uses the RRDTool (round-robin database tool) engine to store data, stores the system configuration information in MySQL, and collects periodic data through Net-SNMP. It implements data acquisition, a network weathermap, fault alarms, user management and other modules. Compared with the previous management approach, our system can dynamically monitor the network link state, bandwidth usage, and the load of network devices in real time; it can also detect various faults and raise alarms by sending text messages and emails, so that we can take appropriate measures to resolve them in time. Compared with email alarms, SMS (Short Message Service) alarms based on GSM modem hardware are faster and more reliable. So far, the monitoring system has been successfully applied to the EAST network and has greatly improved the efficiency of network management.
        Speakers: Ms Chunchun Li (ASIPP), Dr Feng Wang (ASIPP), Dr Ping Wang (ASIPP), Dr Yong Wang (ASIPP), Prof. Zhenshan Ji (ASIPP), Dr Zuchao Zhang (ASIPP)
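The abstract above relies on round-robin storage for periodic monitoring data. The sketch below illustrates only the round-robin concept (a fixed number of slots, with the oldest sample overwritten once the archive is full); it does not use or reproduce the RRDTool interface, and the class name `RoundRobinArchive` and the sample values are hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive: storage never grows, the oldest samples are dropped."""

    def __init__(self, slots):
        self._samples = deque(maxlen=slots)

    def update(self, timestamp, value):
        self._samples.append((timestamp, value))

    def fetch(self):
        return list(self._samples)


# Placeholder data: one bandwidth sample (Mbit/s) every 300 s, three slots.
archive = RoundRobinArchive(slots=3)
for i, mbps in enumerate([10.0, 12.5, 11.0, 40.0, 9.5]):
    archive.update(timestamp=300 * i, value=mbps)
print(archive.fetch())
```

The constant storage footprint is what makes this layout attractive for long-running monitoring of many devices.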
      • 15:00
        The online event selection architecture of the CBM experiment 1h 30m
        The Compressed Baryonic Matter (CBM) experiment is currently under construction at the upcoming FAIR accelerator facility in Darmstadt, Germany. Searching for rare probes, the experiment requires complex event selection criteria at an event rate of up to 10 MHz. To achieve this, all event selection is performed in a large online processing farm of several hundred nodes. The "First-level Event Selector" (FLES) compute farm will consist primarily of standard PC components including GPGPUs and many-core architectures. The data rate at the input to this compute farm is expected to exceed 1 TByte/s of time-stamped signal messages from the detectors. The distributed input interface will be realized using custom FPGA-based PCIe add-on cards, which preprocess and index the incoming data streams. At the intended high event rates, data from several events overlaps. Thus, there is no a priori assignment of data messages to events. Instead, event recognition is performed in combination with track reconstruction. Employing a new container data format to decontextualize the information from specific time intervals, data segments can be distributed on the farm and processed independently. This allows to optimize the event reconstruction and analysis code without additional networking overhead and aids parallel computation in the online analysis task chain. Time slice building, the continuous process of collecting the data of a time interval simultaneously from all detectors, places a high load on the network and requires careful scheduling and management. Using InfiniBand FDR hardware, this process has been demonstrated at rates of up to 6 GByte/s per node in a prototype system. The system design is optimized for modern computer architectures by minimizing copy operations of data in memory, using DMA/RDMA wherever possible, reducing data interdependencies, and employing large memory buffers to limit the critical network transaction rate. 
A fault-tolerant control system will ensure high availability of the event selector. In this presentation, we will give an overview of the online event selection architecture of the upcoming CBM experiment and discuss the premises and benefits of the design. The presented material includes results from studies on several prototype systems, including high-performance time slice building over an InfiniBand FDR network.
        Speaker: Jan de Cuveland (Johann-Wolfgang-Goethe Univ. (DE))
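Time slice building, as described in the abstract above, collects the data of one time interval from all detectors. A much simplified, single-process sketch is given below; it assumes non-overlapping slices of fixed length and in-memory message tuples, and the function name, detector labels and timestamps are hypothetical. The real system performs this across many nodes over InfiniBand.

```python
from collections import defaultdict


def build_time_slices(messages, slice_length_ns):
    """Group time-stamped messages from all detectors by time interval.

    messages: iterable of (detector, timestamp_ns, payload) tuples.
    Returns {slice_index: [messages sorted by timestamp]}.
    """
    slices = defaultdict(list)
    for detector, timestamp_ns, payload in messages:
        slices[timestamp_ns // slice_length_ns].append((detector, timestamp_ns, payload))
    return {
        index: sorted(content, key=lambda message: message[1])
        for index, content in sorted(slices.items())
    }


messages = [
    ("det_a", 1500, "a"),
    ("det_b", 200, "b"),
    ("det_a", 900, "c"),
    ("det_b", 1000, "d"),
]
print(build_time_slices(messages, slice_length_ns=1000))
```

Each resulting slice is self-contained, which is the property that lets slices be handed to different farm nodes and processed independently.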
      • 15:00
        The time synchronization of CSNS neutron Instrument 1h 30m
        In the CSNS (China Spallation Neutron Source) neutron instruments, the time at which the protons hit the target is called T0; it is the starting point of the neutron time of flight (TOF). The T0 fanout system provides the exact T0 signal to the detector electronics. However, this system lacks the information needed to synchronize the metadata from the control system with the neutron data from the different detectors. At CSNS, a real-time synchronization system is deployed to index the neutron data from the different detector electronics and the metadata from the control system or other systems. This time synchronization system uses UTC time from GPS as the time base and synchronizes all nodes over a White Rabbit network. All detector electronics, measurement nodes and control servers, spread over 100 m2, are connected to this system in different ways. The metadata of the last sample set can be retrieved from the index server, and all history data can be obtained from the history database for physics analysis. Some devices and computers have been developed at the CSNS site, and a demo system has also been established. In this paper, the architecture of this synchronization system and the synchronization method are explained, and some performance results of the demo system are presented.
        Speaker: Mr Liang Yi (GDWave)
      • 15:00
        Timing and Readout Control in the LHCb Upgraded Readout System 1h 30m
        In 2019, the LHCb experiment at CERN will undergo a major upgrade where its detectors electronics and entire readout system will be changed to read-out events at the full LHC rate of 40 MHz. In this paper, the new timing, trigger and readout control system for such upgrade is reviewed. Particular attention is given to the distribution of the clock, timing and synchronization information across the entire readout system using generic FTTH technology like Passive Optical Networks. Moreover the system will be responsible to generically control the Front-End electronics by transmitting configuration data and receiving monitoring data, offloading the software control system from the heavy task of manipulating complex protocols of thousands of Front-End electronics devices. The way in which this was implemented is here reviewed with a description of results from first implementations of the system, including usages in test-benches, measurements of timing and latency control and future developments.
        Speaker: Cairo Caplan (CBPF - Brazilian Center for Physics Research (BR))
      • 15:00
        Upgrade Of The Central Logic Board For The Phase-2 Of The KM3NeT Neutrino Telescope. 1h 30m
        The KM3NeT collaboration aims at the construction of a multi-km3 high-energy neutrino telescope in the Mediterranean Sea consisting of thousands of glass spheres, each of them containing 31 photomultipliers of small photocathode area. The main elements of a neutrino telescope are, therefore, the sensitive optical detectors, which in the case of KM3NeT are the small photocathode area photomultipliers (PMTs) distributed around the glass sphere of the Digital Optical Module (DOM). Each DOM has 31 small PMTs which collect the Cherenkov light and convert it into electronic signals. In order to translate these signals into the arrival time of the photons, they are processed by Time to Digital Converters (TDCs). The firmware of the DOM is based on two LM32 microprocessors, an open source firmware microprocessor from Lattice. One of them is dedicated to the White Rabbit protocol, which directly manages the tunable oscillators and the optical link traffic in order to achieve sub-nanosecond time synchronization with the Grand Master clock of the on-shore station. The rest of the modules are managed by the second microcontroller, which has access to all the communication interfaces (SPI, UART, GPIO and I2C) needed for the instrumentation devices, the acoustic and optical readout systems and the multiboot module. 31 TDCs are responsible for recording the arrival time and the width (with 1 ns resolution) of the hits coming from the PMT signals. An acoustic readout is dedicated to the decoding of the incoming AES3 formatted stream. All the data, together with some other slow control monitoring information (such as temperature, humidity, tilt meter, compass, currents, etc.), are put in UDP packets and passed to an IP/UDP packet buffer stream selector (IPMUX). 
This IPMUX splits the data into separate streams based on UDP port number and sends them to the shore station via the endpoint, a normal Ethernet MAC with time stamping capabilities that allow sub-nanosecond timing precision, so that it facilitates the Precision Time Protocol (IEEE 1588). The multiboot allows the selection of a different image from which to boot the electronic system, and the presence of a Golden Image as a fallback solution will make it possible to remotely configure the firmware after the deployment of the DOM. All these features are implemented effectively on an Artix FPGA, reducing resources, costs and power consumption with respect to the phase-1 prototype, based on Kintex-7. The control of the DOM is achieved through complex and robust embedded software running on the LM32. No operating system is used, in order to reduce power consumption. The software allows multiple parties to work on it and extend it without compromising stability and clarity. The software has been layered into three main modules: Common, which contains the common functions, macros and standard libraries; Platform, which includes the start-up code and drivers; and App, for the application-specific code. Each module has its own use and has security restrictions for functions in different levels (with the exception of callbacks).
        Speaker: David Calvo (IFIC)
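The stream selection by UDP port number mentioned in the abstract above can be sketched in a few lines. The real IPMUX is implemented in FPGA firmware; the Python function below only illustrates the routing rule, and the function name, the port numbers and the stream names are hypothetical placeholders.

```python
def split_by_port(packets, port_to_stream):
    """Route (destination_port, payload) pairs into named streams.

    Packets whose port is not in port_to_stream are ignored in this sketch.
    """
    streams = {name: [] for name in port_to_stream.values()}
    for destination_port, payload in packets:
        name = port_to_stream.get(destination_port)
        if name is not None:
            streams[name].append(payload)
    return streams


# Placeholder port assignment: optical (TDC) data and acoustic (AES3) data.
port_to_stream = {5001: "optical", 5002: "acoustic"}
packets = [(5001, b"tdc0"), (5002, b"aes0"), (5001, b"tdc1"), (9999, b"other")]
print(split_by_port(packets, port_to_stream))
```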
      • 15:00
        White Rabbit based sub-nsec time synchronization, time stamping and triggering in distributed large scale astroparticle physics experiments 1h 30m
        Time synchronization to sub-nsec precision between detector subsystems in large scale astroparticle physics experiments can efficiently be provided by White Rabbit (WR), a new Ethernet-based technology for time and frequency transfer. We discuss the principles and advantages of WR for distributed detector arrays, where it allows clock synchronization and trigger time stamping at sub-nanosecond precision, as well as for complex and flexible topological trigger strategies based on Ethernet-routed timestamps. We describe a White Rabbit implementation at the gamma-ray facility HiSCORE (Siberia) for air shower reconstruction, and first experience with the next generation Zynq-based WR-ZEN platform.
        Speaker: Martin Brückner (Paul Scherrer Institut)
    • 16:30 17:50
      Emerging Technologies / Feedback Centro Congressi (Padova)

      Centro Congressi

      Padova

      Conveners: Paolo Durante (CERN), Patrick Le Du (DAPNIA)
      • 16:30
        High throughput data acquisition with InfiniBand on x86 low-power architectures for the LHCb upgrade. 20m
        The LHCb Collaboration is preparing a major upgrade of the detector and the Data Acquisition (DAQ) system, to be installed during the LHC-LS2. The new Event Builder computing farm for the DAQ requires about 500 nodes and has to be capable of transporting on the order of 32 Tbps. The requested performance can possibly be achieved using high-bandwidth data-centre switches and commodity hardware. Several studies are ongoing to evaluate and compare network and hardware technologies, with the aim of optimising the performance as well as the purchase and maintenance costs of the system. We are investigating whether x86 low-power architectures can achieve performance equivalent to that of traditional servers when used for multi-gigabit DAQ. In this talk we introduce an Event Builder implementation based on an InfiniBand network and show preliminary tests with this network technology on x86 low-power architectures, such as the Intel Atom C2750 and Intel Xeon D-1540, comparing measured bandwidth and power consumption.
        Speaker: Matteo Manzali (Universita di Ferrara & INFN (IT))
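The figures quoted in the abstract above imply a per-node bandwidth that is easy to estimate. The sketch below assumes the traffic is spread evenly over the nodes, which is our simplification; the abstract gives only the approximate totals (about 500 nodes, on the order of 32 Tbps), and the function name is hypothetical.

```python
def per_node_gbps(total_tbps: float, nodes: int) -> float:
    """Average bandwidth per node in Gbps, assuming an even share per node."""
    return total_tbps * 1000.0 / nodes


print(per_node_gbps(32.0, 500))
```

With these numbers each node has to sustain roughly 64 Gbps on average, which is the scale against which the low-power x86 platforms are being measured.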
      • 16:50
        A Lossless Network for Data Acquisition 20m
        The planned upgrades of the experiments at the Large Hadron Collider at CERN will require higher bandwidth networks for their data acquisition (DAQ) systems. The network congestion problem arising from the bursty many-to-one communication pattern, typical for these systems, will become more demanding. It is questionable whether commodity TCP/IP and Ethernet technologies in their current form will be still able to effectively adapt to the bursty traffic without losing packets due to the scarcity of buffers in the networking hardware. We continue our study of the idea of lossless switching in software running on commercial-off-the-shelf servers for data acquisition systems, using the ATLAS experiment as a case study. The flexibility of design in software, performance of modern computer platforms, and buffering capabilities constrained solely by the amount of DRAM memory are a strong basis for building a network dedicated to data acquisition with commodity hardware, which can provide reliable transport in congested conditions. In this presentation we extend the popular software switch, Open vSwitch (OVS), with a dedicated, throughput-oriented buffering mechanism for data acquisition. We compare the performance under heavy congestion of typical Ethernet switches to a commodity server acting as a switch, equipped with twelve 10 Gbps Ethernet interfaces providing a total bandwidth of 120 Gbps. Preliminary results indicate that software switches with large packet buffers perform significantly better, reaching maximum bandwidth, and completely avoiding throughput degradation typical for hardware switches that suffer from high packet drop counts. Furthermore, we evaluate the scalability of the system when building a larger topology of interconnected software switches, highlighting aspects such as management, port density, load balancing, and failover. 
In this context, we discuss the usability of software-defined networking (SDN) technologies, Open vSwitch Database (OVSDB) and OpenFlow, to centrally manage and optimize a data acquisition network. We build an IP-only leaf-spine network consisting of eight software switches running on separate physical servers as a demonstrator. We intend to show in this presentation that building a high bandwidth lossless network based on software switches dedicated for data acquisition is feasible and can be considered as a viable solution for future small and large-scale systems based on commodity TCP/IP and Ethernet.
        Speaker: Grzegorz Jereczek (CERN)
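The many-to-one congestion pattern described in the abstract above can be illustrated with a toy model. This is not the authors' Open vSwitch extension. It is a minimal discrete-time sketch, assuming that every source offers one packet per tick, that the single output port drains one packet per tick, and that dropped packets are never retransmitted. The function name and its parameters are hypothetical.

```python
def simulate_incast(n_sources, burst_packets, buffer_packets):
    """Return (delivered, dropped) for one synchronized burst into one output port.

    Assumptions of this sketch: all sources start at the same tick, each offers
    one packet per tick, the output drains one packet per tick, and a packet that
    finds the buffer full is lost for good.
    """
    queue = 0
    delivered = 0
    dropped = 0
    remaining = [burst_packets] * n_sources
    while any(remaining) or queue:
        # Arrival phase: every source that still has data offers one packet.
        for i in range(n_sources):
            if remaining[i]:
                remaining[i] -= 1
                if queue < buffer_packets:
                    queue += 1
                else:
                    dropped += 1
        # Departure phase: the output link sends one packet per tick.
        if queue:
            queue -= 1
            delivered += 1
    return delivered, dropped


if __name__ == "__main__":
    # Twelve senders, as in the 12-port server of the abstract; burst and buffer
    # sizes are arbitrary illustration values.
    print("shallow buffer:", simulate_incast(12, 100, 64))
    print("deep buffer:   ", simulate_incast(12, 100, 1200))
```

With a buffer at least as large as the whole burst, nothing is dropped and the output port simply stays busy until the queue is empty, which is the behaviour the abstract attributes to DRAM-backed software switches.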
      • 17:10
        Operational Experience with the Readout System of the MINOS Vertex Tracker 20m
        The MINOS vertex tracker is a compact instrument built for in-beam spectroscopy of exotic nuclei. Its main component is a ~30 cm long hollow cylinder shape time projection chamber (TPC) surrounding a liquid hydrogen target. The anode of the TPC is read out by a Micromegas detector segmented in 18 concentric rings of 2 mm x 2 mm pads totaling 3604 channels. Space constraints near this detector necessitated the development of an advanced cabling solution based on sub-millimeter pitch micro-coaxial cables to connect all pads to the preamplifiers placed one meter away. Using this technology, tests in experimental conditions show that channel noise remains low, typically ~1500 electrons rms. A new readout system was designed for MINOS. Its built-in versatility allows exploiting a legacy readout chip, the AFTER designed for the T2K neutrino oscillation experiment, and its evolution, the AGET, made for active target TPCs. Both of these multi-channel ASICs (72 channels in AFTER and 64 in AGET) rely on a 512 cell switched capacitor array (SCA) to support a high sampling rate, up to 100 MHz, during a short time capture window (5.12 µs at 100 MHz sampling rate), with a typical power consumption as low as ~20 mW/channel. Besides its four ASICs, the front-end card used in MINOS houses an inexpensive commercial FPGA module based on a Xilinx Spartan 6 FPGA. This “System-On-Module” approach led to an extremely fast development time while the full performance of the AFTER and AGET was preserved by a careful partition of tasks between those implemented in the FPGA fabric and those handled by the embedded MicroBlaze processor. One of the major improvements of the AGET compared to its predecessor, and other comparable devices, is that it includes a discriminator for each channel. 
The resulting information can be used to build a self-trigger signal, and the chip itself also exploits it during the readout phase to time-multiplex towards the external ADC only the cells corresponding to hit channels. This selective digitization brings a first level of data reduction and substantially cuts dead time when occupancy is low, because only a small fraction of the SCA matrix is read out. Two other techniques are used to further reduce dead time: 1) the mapping of detector channels to front-end electronics is arranged so that expected track hits are spread over different readout cards and chips; 2) a mechanism implemented in FPGA logic determines on the fly which chips have to be read out and which are skipped, depending on the hit occupancy of each AGET. Further data reduction and pre-processing are performed by the FPGA module that controls the front-end chips and handles communication over Gigabit Ethernet with the data acquisition PC. Over the last three years, five experimental campaigns with MINOS were conducted, all successfully. We describe the prominent aspects of the readout system of this instrument, present performance measurements, and report on lessons learned during operation.
        Speaker: Denis Calvet (CEA/IRFU, Centre de Saclay (FR))
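The numbers quoted in the MINOS abstract can be checked with a few lines of arithmetic. The sketch below reproduces the 5.12 µs capture window from the 512-cell SCA and the 100 MHz sampling rate, and shows why digitizing only hit channels shortens readout. The function names and the example of 4 hit channels are assumptions made for illustration.

```python
SCA_CELLS = 512  # cells per channel in the switched capacitor array (from the abstract)


def capture_window_us(sampling_mhz, cells=SCA_CELLS):
    """Length of the time window stored in the SCA, in microseconds."""
    return cells / sampling_mhz


def cells_to_digitize(hit_channels, channels_per_chip, selective=True, cells=SCA_CELLS):
    """Number of SCA cells one chip sends to the external ADC.

    With selective readout (AGET) only hit channels are multiplexed out;
    otherwise every channel of the chip is digitized.
    """
    channels = hit_channels if selective else channels_per_chip
    return channels * cells


if __name__ == "__main__":
    print(capture_window_us(100))                    # window at 100 MHz sampling
    full = cells_to_digitize(4, 64, selective=False)
    sparse = cells_to_digitize(4, 64, selective=True)
    print(sparse / full)                             # fraction of the SCA matrix read out
```

Since the readout time of a chip scales with the number of cells digitized, the fraction printed last is a rough proxy for the dead-time saving at low occupancy.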
      • 17:30
        Low Phase Noise Local Oscillator and Clock Generation for Cavity Field Detection 20m
        When designing a Low-Level RF (LLRF) system for the new generation of Free-Electron Laser (FEL) machines, there are many considerations. The first is superior performance of the front-end electronics, focused on ultra-low phase noise, which contributes to the quality of the electromagnetic field of the superconducting RF cavities in the accelerating modules, to the stability of the arrival time of the accelerated electron bunches, and finally to the output laser light of the FEL. Other important goals for an LLRF system are drift minimization, remote control and diagnostics, high reliability and serviceability. These can be achieved by introducing into all subsystems highly integrated PCB modules that accommodate all the system requirements. One of the crucial subsystems of the LLRF front-end is Local Oscillator (LO) and Clock (CLK) generation, which is the subject of this article. The cavity probe signal at 1300 MHz is mixed with an ultra-low phase noise LO signal at 1354 MHz and downconverted to an intermediate frequency (IF) of 54 MHz. This lower-frequency IF signal holds the original amplitude and phase information of the field inside the cavity. The IF signal is sampled by an analog-to-digital converter (ADC) with an 81 MHz low-jitter CLK and processed by the digital part of the LLRF. Cavity field detection critically influences the regulation of the acceleration field, which is why the phase noise performance of the LO and CLK signals is of interest here. This is even more critical when the vector sum of probe signals from multiple cavities is calculated and used for field regulation. In this paper different LO and CLK generation schemes are presented and discussed, with emphasis on performance and a small form factor for PCB integration. The phase-locked loop (PLL) based technique and the mixing technique were considered the most promising for LO generation. Technology limits were reached where the performance and the size of VCOs and LO filters trade off against each other. 
Optimization of the low-jitter CLK generation circuit is described. A two-DUT phase detector test methodology was used to measure the residual phase noise of the different LO and CLK generation circuits. The best-performing circuit was chosen for the final realization. Residual integrated RMS time jitter at the level of single femtoseconds has been achieved. Based on that, a family of LO and CLK generation modules has been developed.
        Speaker: Mr Mateusz Zukocinski (Warsaw University of Technology)
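The frequency plan in the abstract above (1300 MHz probe, 1354 MHz LO, 81 MHz CLK) fixes the relation between the IF and the sampling clock. The sketch below works out that relation and then recovers amplitude and phase from the samples. The demodulation shown is the generic textbook approach for an IF that is a rational fraction of the sampling clock. The abstract does not describe the digital processing, so it is an assumption here, and the function names are hypothetical.

```python
import math
from fractions import Fraction

F_RF_MHZ, F_LO_MHZ, F_CLK_MHZ = 1300, 1354, 81  # values quoted in the abstract


def intermediate_frequency_mhz():
    """IF produced by mixing the cavity probe signal with the LO."""
    return abs(F_LO_MHZ - F_RF_MHZ)


def phase_step_deg():
    """Phase advance of the IF signal between two consecutive ADC samples."""
    return Fraction(intermediate_frequency_mhz(), F_CLK_MHZ) * 360


def demodulate(samples):
    """Recover (amplitude, phase_rad) from a whole number of 3-sample groups.

    Assumes samples[k] = A * cos(step * k + phase) with the step given above.
    """
    step = math.radians(float(phase_step_deg()))
    n = len(samples)
    i = 2 / n * sum(y * math.cos(step * k) for k, y in enumerate(samples))
    q = -2 / n * sum(y * math.sin(step * k) for k, y in enumerate(samples))
    return math.hypot(i, q), math.atan2(q, i)


if __name__ == "__main__":
    print(intermediate_frequency_mhz(), "MHz IF,", phase_step_deg(), "degrees per sample")
    step = math.radians(240)
    test_samples = [0.8 * math.cos(step * k + 0.3) for k in range(6)]
    print(demodulate(test_samples))
```

Because 54/81 = 2/3, three consecutive samples cover exactly two IF periods, so any jitter on the 81 MHz clock maps directly into an error of the detected phase, which is why the clock is part of the phase noise budget.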
    • 17:50 18:10
      Bus Transfer to Padova City Center
    • 07:45 08:30
      Bus Transfer to Conference Venue

    • 08:30 10:00
      DAQ 2 / Medical Centro Congressi (Padova)

      Conveners: Martin Grossmann, Stefan Ritt (Paul Scherrer Institut (CH))
      • 08:30
        Digital SPAD Scintillation Detector Simulation Flow to Evaluate and Minimize Real-Time Requirements 30m
        Radiation detection used in positron emission tomography (PET) and calorimetric systems exploits timing information to remove background noise and to refine the position measurement through time-of-flight (TOF) information. In PET, very fine time resolution (on the order of 10 ps FWHM) would not only improve image contrast, but would also enable real-time image reconstruction without iterative or back-projection algorithms. The current performance limitations will be pushed back through the optimization of faster light emission mechanisms (prompt photons), after which the burden of timing resolution will fall on the readout optoelectronics. Digital SPAD arrays offer compelling possibilities to minimize timing jitter in these future detector systems, such as per-cell timestamp granularity and per-cell configuration parameters, providing a highly flexible signal processing environment. However, processing hundreds of timestamps per detection event takes a toll on real-time processing, which increases rapidly with embedded channel count. Furthermore, if the processing is sent to an external device such as an FPGA, the bandwidth and related power requirements also increase. The goal of the presented simulation flow is to determine how many timestamps are actually required to reach the 10 ps FWHM CTR range for PET. Using this information, designers can estimate the compromises between timing performance, bandwidth requirements, data transmission, power consumption and real-time dataflow processing in the DAQ at the chip and system level. In a typical PET configuration with a standard LYSO, the simulations indicate that a digital SPAD array needs as few as four photon timestamps to reach within 3% of the best possible timing. Similarly, a modified LYSO with 2.5% prompt emission rate would need 30 to 40 photon timestamps to reach within 3% of the optimal timing. The real-time burden is thus very different in the two situations. 
New SPAD devices should be designed to maximize performance for both current and future scintillation materials.
        Speaker: Marc-André Tétrault (Université de Sherbrooke)
      • 09:00
        Generic FPGA based platform for distributed IO in a Proton Therapy Patient Safety Interlock System 20m
        A new gantry for cancer treatment has been installed at the Center for Proton Therapy at the Paul Scherrer Institute (PSI), Switzerland, where two gantries developed in house have been operating since 1996 and 2013, respectively. The new gantry is a commercial device and had to be integrated into the existing PSI control system. For this purpose it was necessary to develop a customized adapter that provides an interface between the PSI systems and the vendor’s gantry safety system. The adapter is divided into two main components. One is a VME-bus based logic controller containing all patient safety related functionality, implemented on one Virtex 6 FPGA. The other is a generic IO platform, called the Signal Converter Board (SCB), which supports the interconnection of signals from and to all subsystems. The SCB is based on a XILINX ARTIX-7 FPGA and 10 modular IO ports, each with up to 34 user configurable IO signals. All IO ports share a standard connector, and application specific modules can be connected as mezzanine plugins. For this project dedicated hardware interfaces were developed, supporting different interface standards such as 24V digital IO, optical IO and additional proprietary wiring standards. In case of power failure all output signals are set to a safe state. To communicate with the main logic controller, the SCB supports up to 6 low latency optical high speed links. The communication link layer is based on the XILINX AURORA standard. On top of this, a new protocol called PaSS-IO (P-IO) link was developed. The P-IO link uses a deterministic streaming mechanism in which the logic controller periodically sends the status of the output signals to the SCB, and the SCB in turn transmits the status of the input signals to the logic controller. To detect communication errors, several supervision functions such as frame CRC and link alive checks are implemented. 
The P-IO protocol and its supervision functions are implemented in a VHDL module that offers a simple interface to the XILINX AURORA core on one side and a user friendly interface to the user application on the other. Only a few user specific settings in a package file are required to integrate the P-IO link VHDL module into any FPGA application with AURORA communication links. For our safety system we use a serial link rate of 2 GBit/s and a frame cycle time of 1 µs, with a link load of 15% between the systems. With these communication parameters we achieve a reaction latency of less than 4 µs from an input signal change to the appropriate reaction at an output signal. Separating the system into a central VME based logic controller and a distributed IO platform allows the cabling installation of the whole system to be optimized. The system has been successfully installed in PSI’s new treatment room; commissioning is ongoing, and beam at isocenter was achieved in January 2016. Patient treatments are scheduled to start at the end of 2016.
        Speaker: Michael Eichin (PSI - Paul Scherrer Institute)
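The link parameters and supervision functions named in the abstract above (2 GBit/s, 1 µs frame cycle, 15% load, frame CRC, link alive check) can be sketched in software. The real P-IO link is a VHDL module whose frame layout and CRC polynomial are not given in the abstract. The 16-bit sequence counter, the use of CRC-32 from `zlib`, and all function names below are assumptions made purely for illustration, and line-encoding overhead is ignored in the bit budget.

```python
import struct
import zlib

LINK_RATE_BPS = 2_000_000_000  # 2 GBit/s serial link (from the abstract)
FRAME_CYCLE_NS = 1_000         # 1 µs frame cycle (from the abstract)
LINK_LOAD_PERCENT = 15         # link load (from the abstract)


def bits_per_frame():
    """Bits occupied by one frame per cycle, ignoring line-encoding overhead."""
    bits_per_cycle = LINK_RATE_BPS * FRAME_CYCLE_NS // 1_000_000_000
    return bits_per_cycle * LINK_LOAD_PERCENT // 100


def build_frame(seq, io_state):
    """Hypothetical frame: 16-bit sequence counter, IO state bytes, CRC-32."""
    body = struct.pack(">H", seq & 0xFFFF) + io_state
    return body + struct.pack(">I", zlib.crc32(body))


def parse_frame(frame, expected_seq):
    """Return the IO state, or raise ValueError if a supervision check fails."""
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("frame CRC mismatch")
    (seq,) = struct.unpack(">H", body[:2])
    if seq != expected_seq & 0xFFFF:
        raise ValueError("unexpected sequence number: link alive check failed")
    return body[2:]


if __name__ == "__main__":
    print(bits_per_frame(), "bits per frame")
    frame = build_frame(7, bytes([0b10100101, 0x00, 0xFF]))
    print(parse_frame(frame, expected_seq=7))
```

In a safety application, a receiver that sees either check fail would typically drive its outputs to the safe state, in the same way the abstract describes for a power failure.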
      • 09:20
        3D photon impact determination in monolithic scintillation crystals using FPGA processing 20m
        We present the implementation of novel methods to accurately determine the gamma ray impact position within monolithic scintillation crystals in PET systems. These methods are implemented in a Kintex7 FPGA installed in each ADC board of the data acquisition system (DAQ) of the brain PET insert named MINDView. Improving the system processing capabilities is of particular interest when high data transfer rates are expected, as is the case for this scanner. Algorithms such as Center of Gravity (CoG) determine the gamma photon impact coordinates from the sum of the input signals weighted by their own coordinates and normalized to the energy. For monolithic black-painted crystals, where the light distribution is well preserved, this method has known weaknesses because the light distribution is truncated at the crystal edges, which displaces the estimate from the true impact coordinates. Other approaches have been studied to minimize the CoG limitations, among them the so-called Rise to the Power (RTP) algorithm and methods that fit the light distribution. These methods help to better estimate the center of the light distribution and thus to better determine the photon impact position, including in regions near the detector edges. PET systems typically suffer from parallax error, especially near the borders of the FOV. This effect can be mitigated by determining the photon Depth of Interaction (DOI) inside the crystal, which can be estimated when monolithic crystals are used. Correcting the parallax error in the reconstruction stage (recovering the true Lines of Response) improves the final image spatial resolution at the FOV borders. Both the RTP and the fitting methods have already been tested off-line, that is, by analyzing raw data on a PC workstation once the acquisition is completed, with promising results. The results showed quantitative improvements in the position determination and, therefore, in the final image quality. 
Both methods also made it possible to estimate the photon DOI. The aim of this work is to show the results obtained when these methods are implemented in an FPGA and therefore executed on-line, which reduces the total amount of data transferred to the computer and eases the bandwidth requirements. The bottleneck is the execution time of this process, which must keep up with the expected event rate (1M event/s); we therefore set a limit of 750 ns per event, somewhat below the 1 µs between events at that rate. The current implementation with CoG takes around 700 ns (DSP-based), very close to the allowed limit. Since the new methods are more complex in terms of computation requirements, other approaches are needed to meet the timing. The RTP method and the DOI estimation have been simulated in the XC7K160T. Their implementation is based on LUTs and dividers, a change from the current approach. The stages of multiplications, additions and divisions have been grouped, giving times below 500 ns. The different methods will be compared in terms of performance (byte length, truncation factor, execution time, logic resources…), and the best one in terms of time and resources will be implemented in the ADC board of the MINDView DAQ system.
        Speaker: Dr Albert Aguilar (I3M)
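The difference between Center of Gravity and Rise to the Power described in the abstract above can be shown on a one-dimensional toy example. The sketch below assumes a Gaussian light distribution sampled by eight photosensor columns and truncated at the crystal edge. The width, the exponent of 2 and the function name are illustrative assumptions, not values from the MINDView firmware.

```python
import math


def weighted_centroid(positions, signals, power=1.0):
    """Centroid of the signals raised to `power`.

    power = 1 gives the Center of Gravity (normalized to the summed signal);
    power > 1 gives a Rise-to-the-Power estimate that favours the peak.
    """
    weights = [s ** power for s in signals]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, positions)) / total


# Eight sensor columns at x = 0..7; the crystal edge cuts off everything below 0.
positions = list(range(8))
true_x = 0.5   # impact close to the edge
sigma = 1.5    # assumed width of the light distribution
signals = [math.exp(-((x - true_x) ** 2) / (2 * sigma ** 2)) for x in positions]

cog = weighted_centroid(positions, signals)
rtp = weighted_centroid(positions, signals, power=2.0)

if __name__ == "__main__":
    print(f"true {true_x:.2f}  CoG {cog:.2f}  RTP {rtp:.2f}")
```

Because the part of the light distribution beyond the edge is missing, the plain centroid is pulled towards the crystal centre. Raising the signals to a power suppresses the long tail on the inner side and moves the estimate back towards the true position, at the cost of extra multiplications per event.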
      • 09:40
        The Data Acquisition System of the KOTO Experiment and RPT Upgrade 20m
        The KOTO experiment is a particle physics experiment located at J-PARC, Japan, aiming to measure the branching ratio of the $K_L\rightarrow\pi^0\nu\bar{\nu}$ decay and to explore new physics. This decay has not yet been observed, and its branching ratio is predicted by the Standard Model to be $(2.43\pm0.39)\times10^{-11}$. The current upper limit of $2.6\times10^{-8}$ was measured directly by E391a at KEK. The KOTO experiment achieved a sensitivity similar to the E391a result with a 24 kW proton beam in 2013. The present DAQ system, commissioned in the 2015 runs with 42 kW beam power, is designed to reach the limit on the branching fraction coming from existing theoretical models, $1.5\times10^{-9}$. The KOTO DAQ system consists of ADC frontend modules, two hardware triggers (L1, L2), and one software trigger (L3). Two types of ADC modules, with sampling rates of 125 MHz and 500 MHz, are used to digitize detector waveform signals. The L1 trigger sums the calorimetric energy over 3000 channels and checks 1000 detector veto channels every 8 ns. The current L1 trigger rate during physics data taking is 18.5 kHz. The L2 trigger receives up to 0.5 MB of data per trigger and makes a decision based on the center of energy of the calorimeter. The average L2 trigger accept rate is 33% and the maximum data output rate is 17 Gbps. The L3 computing farm receives events from the L2 trigger via UDP and manages event building, data compression, and data storage. The 2013 DAQ system used a commercial Ethernet switch between the L2 and L3 triggers to build events. In 2015, we implemented online lossless data compression inside the ADC modules and enhanced the L3 computing farm with a 10 Gbps Infiniband network and 288 TB of local storage for more efficient event building. The improved system is able to transfer data to permanent storage (3 Gbps) concurrently with DAQ data taking. 
To accommodate future increases in proton beam intensity, we are developing an upgrade of the L2 hardware trigger using the RCE (Reconfigurable Clustering Element) Platform Technology (RPT) developed by SLAC, with a replicated-mesh ATCA shelf. The RPT supports Rx/Tx links with a higher input/output rate, up to 120 Gbps per RCE. With full backplane connectivity and the computing power of the new Zynq-7000 series FPGA, the event building and cluster finding processes can be done by the L2 trigger. We expect the fully optimized upgraded L2 trigger to support a trigger rate 7 times greater than the current value. The RPT is used by experiments such as ATLAS CSC, LBNE, LCLS, and LSST. We will modify the RPT to support the KOTO experiment and aim to commission the upgraded L2 trigger system in the 2017 run.
        Speaker: Ms Stephanie Su (University of Michigan)
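The L2 decision described in the KOTO abstract is based on the center of energy of the calorimeter. The sketch below shows one way such a quantity can be computed, together with the rate arithmetic implied by the quoted figures. The abstract states neither the threshold nor the direction of the cut, so the `min_radius_mm` parameter, the "accept when the radius is large" rule and the function names are assumptions for illustration only.

```python
import math

L1_RATE_KHZ = 18.5         # L1 trigger rate (from the abstract)
L2_ACCEPT_FRACTION = 0.33  # average L2 accept rate (from the abstract)


def l2_output_rate_khz():
    """Average event rate leaving L2, from the two figures quoted above."""
    return L1_RATE_KHZ * L2_ACCEPT_FRACTION


def center_of_energy_radius(hits):
    """Radius of the energy-weighted mean position.

    `hits` is a list of (x_mm, y_mm, energy) tuples; returns None when no
    energy was deposited.
    """
    e_sum = sum(e for _, _, e in hits)
    if e_sum <= 0:
        return None
    x = sum(x * e for x, _, e in hits) / e_sum
    y = sum(y * e for _, y, e in hits) / e_sum
    return math.hypot(x, y)


def l2_accept(hits, min_radius_mm):
    """Hypothetical cut: keep events whose center of energy is off-axis."""
    radius = center_of_energy_radius(hits)
    return radius is not None and radius >= min_radius_mm


if __name__ == "__main__":
    print(f"L2 output rate: {l2_output_rate_khz():.1f} kHz")
    balanced = [(-200.0, 0.0, 1.0), (200.0, 0.0, 1.0)]
    one_sided = [(300.0, 100.0, 1.2), (250.0, -50.0, 0.4)]
    print(l2_accept(balanced, 150.0), l2_accept(one_sided, 150.0))
```

The 8 ns period of the L1 calculation corresponds to the 125 MHz sampling clock of the ADC modules, so the energy sum is updated once per sample.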
    • 10:00 10:20
      Break: Coffee Centro Congressi (Padova)

    • 10:20 11:20
      CMTS Centro Congressi (Padova)

      Conveners: Alex Gong (Tsinghua University (CN)), Prof. Zhen-An Liu (IHEP,Chinese Academy of Sciences (CN))
      • 10:20
        A monitoring system for the beam-based feedbacks in the LHC 20m
        The system for the beam-based feedbacks in the LHC is one of the most complex in CERN’s accelerator complex. It is an essential system for the operation of the LHC and is routinely used to simultaneously control the beam orbit, machine tune, and radial-loop adjusting the beam energy. The system handles the input of over 2'000 measurements, and controls the current in over 1'000 superconducting dipole correction magnets, over 200 quadrupole correction magnets as well as the RF frequency used to generate the electric field accelerating the particle beams. Recently, a new team was charged with maintaining, documenting and upgrading the software in order to meet the requirements for LHC’s 2nd run. The team identified several requirements: 1) gather statistics on the relative offsets of the arrival of measurement data, 2) inspect RT I/O in a user-friendly way, 3) display summarized status information on the synchronism and content of the input data, 4) have the means to rapidly diagnose problems with the feedbacks during commissioning and operation in a non-intrusive way (i.e. without compromising the feedback system’s real-time behavior). This paper documents the design, integration and use of the resulting monitoring suite for the LHC beam-based feedback systems. The set-up comprises a FESA (framework for real-time systems developed at CERN)-based real-time server and a JavaFX-based graphical interface integrated into CERN’s operational software infrastructure. Concrete examples are given on how this system has contributed to a better understanding of the overall feedback behavior and aided in diagnosing operational problems. The paper will also summarize envisaged requirements for future releases.
        Speaker: Diogo Miguel Louro Alves (CERN)
      • 10:40
        uSOP: a microprocessor-based Service-Oriented Platform for Control and Monitoring 20m
        We present a Service-Oriented Platform (uSOP) designed for deep embedded applications in controls and monitoring of detectors, sensors as well as complex research instruments. uSOP is a single board computer based on the AM3358 1 GHz ARM Cortex A8 processor and it is equipped with standard uSD, USB and Ethernet interfaces. On board RAM and solid state storage allows hosting a full LINUX distribution including GNU compilers, tools, libraries, a window system, documentation and software frameworks for specific user tasks. The board supports SPI, I2C, JTAG and UART interfaces, all of them galvanically isolated and each equipped with a separate supply to power remote sensors and acquisition resources like ADCs, DACs, digital I/O expander, optocouplers. Non isolated digital I/Os allow the user to benefit from the Programmable Real time Units (PRU) available in the processor and from the sophisticated event capture and timer peripherals. Aiming at embedded applications, uSOP has been designed to offer resilient, low maintenance performance in harsh, limited access environment. The most critical system-level operations can be performed remotely by means of a specific LAN connection operated independently by the main processor. Such an approach allows the user to reset and power cycle the board, to flash the operating system on the storage unit and/or boot from the network, to redirect the system console on the LAN in order to troubleshoot hardware and software issues. The on board power grid has been segmented in order to provide the cleanest supply to the acquisition peripherals. The noisy digital domains are powered with high-efficiency Point-Of-Load switching regulators while linear regulators decouple the more demanding high-speed I/Os like USB and Ethernet. Thermal shutdown, over-current and short-circuit protection are guaranteed by design for safe operation in hazardous area. 
The PCB design has been tailored to achieve EMI immunity, including ground planes and rings to shape the current return paths. In this contribution we present and discuss the main aspects of the hardware and software design of uSOP, including details of the EPICS framework porting and application design. We show the tests done with state-of-the-art 24-bit Delta-Sigma ADC acquisition modules designed for this platform, assessing the noise level, ease of software development, and CPU and network load. uSOP is the backbone of the control system of the endcap electromagnetic calorimeter of the Belle2 experiment, presently under construction at the KEK Laboratory (Tsukuba, Japan). We present our experience in the design and deployment of this demanding control infrastructure.
        Speaker: Alberto Aloisio (Universita e INFN, Napoli (IT))
      • 11:00
        A multiple 10 Gbit Ethernet data transfer system for EIGER 20m
        Eiger is a single-photon counting x-ray pixel detector developed at the Paul Scherrer Institute for energies up to 25 keV with a pixel size of 75$\times$75 µm$^2$. The Eiger detector is designed for synchrotron applications and consists of several modules, each having a total of 500 kpixels. 1.5 Mpixel and 2 Mpixel detectors (3 and 4 modules) are being integrated in several beamlines, and a 9 Mpixel detector (18 modules) is currently under construction. An Eiger module is subdivided into two half modules, each having its own independent but overall synchronized readout system consisting of a front-end board (FEB) and a back-end board (BEB). The maximum frame rate is 22 kHz, independent of the detector size. The data input stream is sorted in two FPGAs on the FEB. The data rate here goes up to 22 Gb/s. A rate correction is applied to compensate for the counting loss at high count rates. Then the data stream is sent over eight 3.125 Gb/s high-speed transceivers to the BEB, which receives and further processes the data. On the BEB the stream is buffered in a DDR2 memory. This allows image summation to extend the dynamic range from 12 bit to 32 bit and also mitigates the limited external data rate of the 10 Gbit/s Ethernet interface per half module. On a 9 Mpixel detector, with its 36 half modules, the maximum data rate reaches 45 GByte/s. To reduce the network load on the server side, a round robin procedure is implemented that sends the stream to several servers. Here the challenge of keeping the images in one piece has been taken into account. A second approach is online compression, currently implemented to reduce the network load and to widen the Ethernet bottleneck. The firmware layout and the implemented functions will be presented in detail.
        Speaker: Martin Brückner (Paul Scherrer Institut)
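The aggregate rate quoted in the EIGER abstract is consistent with 36 half modules each driving one 10 Gbit/s Ethernet link, and the round robin scheme has to keep all pieces of one image on the same server. The sketch below illustrates both points. Choosing the destination from the frame number alone is an assumption about how "keeping the images in one piece" could be achieved, not a description of the actual firmware, and the function names are hypothetical.

```python
def aggregate_gbyte_per_s(half_modules, link_gbit_per_s=10):
    """Total Ethernet output of the detector in GByte/s."""
    return half_modules * link_gbit_per_s / 8


def destination_server(frame_number, n_servers):
    """Round robin over servers, keyed on the frame number only.

    Every half module applies the same rule, so all parts of a given
    frame arrive at the same server and the image stays in one piece.
    """
    return frame_number % n_servers


if __name__ == "__main__":
    print(aggregate_gbyte_per_s(36), "GByte/s for the 9 Mpixel detector")
    n_servers = 4  # illustration value, not from the abstract
    for frame in range(6):
        targets = {destination_server(frame, n_servers) for _half_module in range(36)}
        print(f"frame {frame} -> server(s) {sorted(targets)}")
```

Each server then sees only one frame in every `n_servers`, which divides the per-server network load by the number of servers without splitting any image.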
    • 11:20 12:30
      Mini Oral 2 Centro Congressi (Padova)

      Conveners: Denis Calvet (CEA/IRFU,Centre d'etude de Saclay Gif-sur-Yvette (FR)), Martin Grossmann
      • 11:20
        025 Engineering Array Tests of High-Resolution Front End Electronics for Water Cherenkov Air Shower Detectors equipped with Cyclone V 2m
        Speaker: Zbigniew Szadkowski (University of Lodz)
      • 11:22
        043 Detection of Neutrino-Induced Air Showers by the Artificial Neural Network FPGA Trigger 2m
        Speaker: Zbigniew Szadkowski (University of Lodz)
      • 11:24
        082 The Least Mean Squares Adaptive FIR Filter for RFI Suppression in Radio Detectors of Cosmic Rays 2m
        Speaker: Zbigniew Szadkowski (University of Lodz)
      • 11:26
        263 Experience with a Slow Control system based on industrial process control hardware and software for the Xenon1T Dark Matter Search 2m
        Speaker: Joao Cardoso (University of Coimbra)
      • 11:28
        266 Development of the EPICS-based Monitoring and Control System for EAST Fast Control Power System 2m
        Speaker: SHI LI
      • 11:30
        113 Data Processing for EAST Remote Participation 2m
        Speaker: Dr Xiaoyang Sun (Institute of Plasma Physics Chinese Academy of Sciences)
      • 11:32
        247 Mechanism of gamma-induced absorption bands formation at 665 nm in KS-4V and KI type quartz glasses 2m
        Speaker: Dr B.S. Fayzullaev (Institute of Nuclear Physics Academy of Sciences of Uzbekistan, Tashkent 100214, Uzbekistan)
      • 11:34
        244 Universal high-performance LO and CLK generation module for LLRF system receivers 2m
        Speaker: Mr Mateusz Zukocinski (Warsaw University of Technology)
      • 11:36
        112 Trigger System for BaF2 Detector Array Readout Electronics at CSNS-WNS 2m
        Speaker: Jiang Di (G)
      • 11:38
        194 Development of the electronics and data acquisition system for the triple-GEM detectors for the upgrade of the CMS forward muon spectrometer 2m
        Speaker: Gilles De Lentdecker (Universite Libre de Bruxelles (BE))
      • 11:44
        264 The Data Acquisition Architecture for the "Dark matter Experiment using Argon Pulse-shaped discrimination" - DEAP-3600 2m
        Speaker: Mr Pierre-Andre Amaudruz (TRIUMF (CA))
      • 11:46
        083 The Trigger-Time-Event-System for Wendelstein 7-X: Overview and first Operational Experiences 2m
        Speaker: Jörg Schacht (Max-Planck-Institut für Plasmaphysik)
      • 11:48
        211 A new preprocessing and control board for the phase 2 electronics of AGATA experiment. 2m
        Speaker: Javier Collado Ruiz (Departamento de Ingeniería Electrónica - Universitat de València - Escola Tècnica Superior d'Enginyeria. Avinguda de la Universitat s/n 46100 Burjassot (Valencia))
      • 11:50
        140 The study of strip Readout Prototype for ATLAS Phase-I muon Trigger upgrade 2m
        Speaker: Feng Li (Univ. of Science & Tech. of China (CN))
      • 11:52
        259 EMBEDDED IMPLEMENTATION OF A REAL-TIME SWITCHING CONTROLLER ON A ROBOTIC ARM 2m
        Speaker: Giuseppe Ferrò (Università di Tor Vergata)
      • 11:54
        006 Data Chain Reconstructing Technology for the front-end electronics of the BESIII muon identification system 2m
        Speaker: Mr Xiaoguang Zhang (University Of Science And Technology Of China)
    • 12:30 13:00
      Break: Lunch Centro Congressi (Padova)

    • 13:00 20:00
      Excursion Padova, Venice by bus

    • 07:45 08:30
      Bus Transfer to Conference Venue

    • 08:30 10:20
      DAQ 3 / Fusion Centro Congressi (Padova)

      Conveners: Adriano Francesco Luchetta (Consorzio RFX), Gabriele Manduchi
      • 08:30
        Distributed Real-time control software at ITER 30m
        This paper will provide an overview of the various real-time software processes which are distributed across ITER. This begins with the software processing done at the diagnostic level to process the initially acquired data and produce a meaningful signal for plasma control, which is typically a physics measurement (e.g. the plasma current). These signals are used, among many others, in central plasma control, where they will be processed and control algorithms will be applied. The final aspect is the processing in the actuator systems to actually produce the desired control behavior. ITER CODAC has a varying degree of responsibility in all of these areas. Generally speaking, the responsibility for I&C functions lies within the procurement packages, and ITER CODAC supports the activities by setting standards and providing tools and advice. In the case of diagnostics, CODAC and the ITER diagnostic division collaborate closely to provide a common solution for real-time processing (both CPU and FPGA/GPU-based). This solution will be used by the diagnostic plant systems to implement real-time functions in their scope, as well as by CODAC to implement the central control tasks (e.g. the Plasma Control System). Work is ongoing to provide the appropriate software infrastructure to meet the requirements. This paper will present an overview of the collected functional and performance requirements for the global real-time infrastructure and illustrate, based on a few selected use cases, the measures taken to implement these tasks. The focus will be on the diagnostic/central control interface, as this is more complex than initially foreseen due to the highly non-diagonal coupling between physics measurements and the contributions of the various diagnostics. An update on the ongoing design work for the CPU-based real-time infrastructure will also be presented.
        Speaker: Dr Axel Winter (ITER Organization)
      • 09:00
        Plasma current and shape control for ITER using fast online MPC 20m
        In a magnetically confined tokamak reactor, the Plasma Current and Shape Controller (PCSC) is the component of Plasma Magnetic Control (PMC) that commands the voltages applied to the poloidal field coils, to control the coil currents and the plasma parameters, such as the plasma shape, current, and position. The PCSC acts on the system pre-stabilised by the Vertical Stabilisation controller, which is another PMC component. The challenge of PMC is to maintain the prescribed plasma shape and distance from the plasma facing components, in presence of disturbances, such as H-L transitions or ELMs, and to changes of local dynamics in different operating points. Model Predictive Control (MPC) is an established advanced process control technique in the process industry. It has gained wide industrial acceptance by facilitating a systematic approach to control of large-scale multivariable systems, with efficient handling of constraints on process variables and enabling plant optimisation. These advantages are considered beneficial for PCSC, and potentially also for other control systems of a tokamak. The main obstacle to using MPC for control of such processes is the restriction of the most relevant MPC methods to processes with relatively slow dynamics due to the long achievable sampling times, because time-consuming on-line optimization problems are being repeatedly solved at each sample time of the CSC control loop for determining control actions. In this work we explore the practical feasibility of using MPC for PCSC in the ITER tokamak, employing complexity reduction techniques and recently developed fast on-line quadratic programming (QP) optimization methods. A survey of the available QP methods suitable for the on-line solution of MPC optimization problems is given, with emphasis on first-order methods, which have been recently considered as prime candidates for fast online MPC control. 
MPC is applied to a simulation model, where PMC makes use of a combination of ohmic in-vessel coils and superconducting poloidal field coils. The prototype MPC controller [1] is based on the control scheme of [2]. Using a modification of the QP solver QPgen [3], a five-fold speed-up compared to the state-of-the-art commercial solver CPLEX was achieved, with peak computation times around 10 ms on a laptop computer with a four-core Intel processor. This is already considered sufficiently fast for the 100 ms sample time estimated to be suitable for the ITER CSC control loop. [1] S. Gerkšič, G. De Tommasi, "Model predictive control of plasma current and shape for ITER", 28th Symposium on Fusion Technology (SOFT 2014), San Sebastián, Spain [2] G. Ambrosino et al., IEEE Trans. Plasma Science, 37(7), 2009, 1324-1331 [3] Giselsson P., Improving Fast Dual Ascent for MPC - Part II: The Embedded Case, arXiv (2014)
        Speaker: Dr Samo Gerksic (Jozef Stefan Institute)
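As an illustration of the first-order QP methods mentioned in the abstract, the following is a minimal sketch of a projected fast gradient iteration for a box-constrained QP. It is not the QPgen algorithm used by the authors (which works on a dual formulation), and the function names, the pure box constraints and the fixed iteration count are assumptions made for the example.

```python
def mat_vec(H, u):
    """Dense matrix-vector product for small problems."""
    return [sum(h * x for h, x in zip(row, u)) for row in H]


def clip(v, lo, hi):
    """Project v onto the box lo <= v <= hi, element by element."""
    return [min(max(x, l), h) for x, l, h in zip(v, lo, hi)]


def fast_gradient_qp(H, f, lo, hi, lipschitz, iters=200):
    """Approximately minimise 0.5*u'Hu + f'u subject to lo <= u <= hi.

    H must be symmetric positive semi-definite and `lipschitz` an upper
    bound on its largest eigenvalue (hypothetical inputs for this sketch).
    """
    n = len(f)
    u = clip([0.0] * n, lo, hi)
    y = u[:]
    t = 1.0
    for _ in range(iters):
        # Gradient step from the extrapolated point, then projection.
        grad = [g + fi for g, fi in zip(mat_vec(H, y), f)]
        u_next = clip([yi - gi / lipschitz for yi, gi in zip(y, grad)], lo, hi)
        # Nesterov momentum update.
        t_next = 0.5 * (1.0 + (1.0 + 4.0 * t * t) ** 0.5)
        beta = (t - 1.0) / t_next
        y = [un + beta * (un - uo) for un, uo in zip(u_next, u)]
        u, t = u_next, t_next
    return u


# Example: two actuator voltages limited to [-1, 1].
moves = fast_gradient_qp([[2.0, 0.0], [0.0, 2.0]], [-2.0, -8.0],
                         [-1.0, -1.0], [1.0, 1.0], lipschitz=2.0)
```

In a receding-horizon controller only the first move of the optimised sequence would be applied before the problem is solved again at the next sample. Each iteration costs one matrix-vector product and one projection, which is what makes this family of methods attractive for short sample times.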
      • 09:20
        Image acquisition and GPU processing application using IRIO technology and FlexRIO devices 20m
        The large amount of data generated by image diagnostics used in big physics experiments requires an efficient use of hardware technologies in real-time data acquisition and processing applications. In order to get the best performance from the hardware, it is necessary to provide hardware and software tools that enable fast and easy deployment of this kind of solution. IRIO technology allows easy development of advanced data acquisition applications and their integration in EPICS using National Instruments Reconfigurable Input/Output (RIO) FPGA-based cards. Using IRIO software tools, it is possible to minimize the development time needed to build specific applications for different hardware configurations. IRIO uses the open source version of the NI-RIO Linux device driver, supporting direct DMA access from FlexRIO devices to NVIDIA GPUs. For the development of image processing applications, the selected hardware platform has been implemented using a FlexRIO device with a Camera Link adapter module and an NVIDIA Kepler architecture GPU. With the help of IRIO tools, the user only has to focus on implementing the FPGA application for the FlexRIO device using LabVIEW/FPGA and the GPU algorithm using NVIDIA CUDA tools. Additionally, IRIO provides the EPICS integration for these applications using the software model developed by ITER and Cosylab, which simplifies the development of EPICS device support by means of the Nominal Device Support approach. This is a set of libraries with C++ classes simplifying the development of these device supports. To demonstrate the full development cycle, an algorithm for image compression based on the JPEG2000 standard has been evaluated and tested using a hardware configuration with the same elements defined in the ITER fast controllers hardware catalog. This image standard allows high compression ratios, with or without losses, and can include additional metadata information related to the image. 
In addition, it allows regions of interest (ROI) to be defined, in which it is possible to work with the maximum detail and execute a specific processing algorithm. All these arguments make this standard a very interesting option for image-based diagnostics in physics experiments. These software tools have been tested in ITER CCS (CODAC Core System).
        Speakers: Julián Nieto Valhondo (Universidad Politécnica de Madrid), Mariano Ruiz (Universidad Politécnica de Madrid)
      • 09:40
        The Readout and Data Acquisition Design of the sPHENIX Detector at RHIC 20m
        The recently established sPHENIX Collaboration at RHIC is upgrading the PHENIX detector in a way that will enable a comprehensive measurement of jets in relativistic heavy ion collisions. The upgrade will give the experiment full azimuthal coverage within a pseudorapidity range of $-1.1 < \eta < 1.1$. In addition to measuring heavy-ion collisions, the new apparatus will provide enhanced physics capabilities for studying nucleon-nucleus and polarized proton collisions, and eventually allow a detailed study of electron-nucleus collisions at an envisioned Electron Ion Collider at Brookhaven. The upgraded detector will be based on the former BaBar magnet and will include tracking detectors, a new electromagnetic calorimeter, and, for the first time at a RHIC experiment, a hadronic calorimeter. A new technology using a sampling tungsten-scintillating fiber design for the electromagnetic calorimeter is what enables the full azimuthal coverage, as it achieves a radiation length of just about 7 mm, which allows for a very compact design of the device. The calorimeter signals are sampled with silicon photomultipliers and waveform digitizing electronics. The digitized waveforms are read out with custom PCIe boards that allow multiple streams with bandwidths of up to 5 Gbit/s. The goal is to have a sustained event rate to disk of about 15 kHz. Focusing on the calorimeters, we will describe the goals and design of the sPHENIX experiment, the design of the digitizers and other parts of the data acquisition system, and the results obtained with current prototypes. By the time of the conference, we will have data from a test beam at Fermilab that will test the readout under beam conditions. We will detail the design of the FPGA-based readout cards, and how we implement the so-called "multi-event buffering" in the front-end, which has traditionally enabled PHENIX to take data at rates rivaling, or exceeding, those of the LHC experiments.
        Speaker: Martin L. Purschke (Brookhaven National Laboratory (US))
      • 10:00
        The NA62 GigaTracker Detector 20m
        The GigaTracker (GTK) system is a magnetic spectrometer made of 3 detector stations and 4 achromat magnets for the NA62 experiment. NA62 aims to measure the branching ratio of the ultra-rare decay ${K}^{+}\rightarrow {\pi}^{+}\nu \bar{\nu}$ at the CERN SPS. The detector measures the momentum, direction and crossing time of all the charged particles in the secondary beam. The detector has to cope with a non-uniform beam rate as high as 750 MHz, with an expected peak rate of 1.3 MHz/mm$^2$ around the centre, and provide a time resolution better than 200 ps. Each detector station is built using hybrid silicon pixel detectors (60.8 mm x 27 mm active area each) installed in vacuum and is cooled through an innovative micro-channel system etched inside a silicon plate a few hundred microns thick. Each station is made of a 200 $\mu$m thick silicon sensor read out by 2x5 custom 100 $\mu$m thick ASICs (TDCPix), for a total thickness of less than 0.5% of X/X$_0$ and 18000 channels. The TDCPix was specifically developed using a 130 nm CMOS process and incorporates all the electronics needed to perform hit arrival time measurements. Each TDCPix chip contains 40 x 45 asynchronous pixels, each one 300 $\mu$m x 300 $\mu$m, and has TDCs with 100 ps bins. To cope with the high rate, each TDCPix is read out via four 3.2 Gb/s serialisers that continuously send data to custom FPGA-based off-detector read-out boards (GTK-RO) placed outside the experimental area. Each board receives the data of one chip and buffers them while waiting for the L0 trigger decision, which arrives with a maximum latency of 1 ms. Upon reception of a trigger decision, the boards select the data that fall in a 75 ns time window around the selected timestamp. Then they send the data to the sub-detector PCs: each board uses UDP packets over two 1 Gb links. The maximum trigger frequency that the GTK-RO board must sustain is 1 MHz. The boards also distribute timing and control signals to the TDCPix. 
The purpose of the sub-detector PCs is to merge the data fragments coming from the 10 GTK-RO boards serving one GTK station and send complete events to the online farm. Since the foreseen rate for each GTK-RO is of the order of 1 Gbps, we use an Ethernet switch with 24 gigabit ports and two 10 Gb ports as a multiplexer: each PC uses one 10 Gb link to receive the data coming from 10 GTK-RO boards, and a second 10 Gb link to send the data to the online farm. Each PC runs Linux, and to achieve the full 10 Gbps throughput we use the "zero copy" module of the PF_RING libraries to avoid unnecessary memory-to-memory copies.
        Speaker: Alberto Gianoli (Universita di Ferrara & INFN (IT))
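The trigger-matching step described in the abstract (selecting buffered hits in a 75 ns window around the trigger timestamp) can be sketched as follows. This is only an illustration: the symmetric placement of the window, the time-sorted buffer and the function name are assumptions, and the real selection runs in FPGA firmware rather than software.

```python
from bisect import bisect_left, bisect_right

WINDOW_NS = 75.0  # window width quoted in the abstract


def select_hits(hit_times_ns, trigger_time_ns, window_ns=WINDOW_NS):
    """Return the hits inside a window centred on the trigger time.

    hit_times_ns must be sorted in ascending order (assumed here so that
    the selection is two binary searches instead of a full scan).
    """
    half = window_ns / 2.0
    first = bisect_left(hit_times_ns, trigger_time_ns - half)
    last = bisect_right(hit_times_ns, trigger_time_ns + half)
    return hit_times_ns[first:last]


# Hypothetical buffered hit times in ns and one L0 trigger at t = 100 ns.
buffered = [0.0, 10.0, 62.5, 100.0, 137.5, 150.0]
matched = select_hits(buffered, 100.0)
```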
    • 10:20 10:40
      Break: Coffee Centro Congressi (Padova)

      Centro Congressi

      Padova

    • 10:40 12:00
      RTA 1 Centro Congressi (Padova)

      Centro Congressi

      Padova

      Conveners: Jinyuan Wu (Fermi National Accelerator Lab. (US)), Kay Rehlich (DESY)
      • 10:40
        Control System of the European XFEL Accelerator 20m
        The European XFEL is an X-Ray Free Electron Laser in the final construction and commissioning phase. While the injector is already operating, the stepwise installation and commissioning of the RF stations and beam lines is still ongoing. The first delivery of electrons at 17.5 GeV through the 3.4 km long facility is planned for Q4 2016. Since more than 200 MicroTCA crates are spread out along the beam lines, the synchronization of all subsystems and the distributed data acquisition is provided by a precision timing system. Triggers and clocks are distributed to the hardware with a precision of 10 ps RMS. The FPGAs and front-end software receive various data blocks that allow synchronization of the data flow and carry information to qualify all electron bunches. The machine protection systems, beam distribution devices and feedback software use this information. This paper describes the architecture of the MicroTCA-based front-end and server-based middle layer hardware, and the software with a synchronized data flow up to the central fast data acquisition system. With 27000 bunches per second, the facility generates a continuous data flow of more than 1 GB/s that is used in operator displays and fast feedbacks and is finally stored to allow offline analysis of special events such as interlocks.
        Speaker: Kay Rehlich (DESY)
      • 11:00
        A JESD204B-compliant Architecture for Remote and Deterministic-Latency Operation 20m
        High-speed analog-to-digital converters (ADCs) are key components in a huge variety of systems, such as wireless infrastructure transceivers, software defined radios, radar, secure communications, medical imaging systems and trigger and data acquisition (TDAQ) systems of Nuclear and Sub-nuclear Physics experiments. In fact, the usage of high-speed ADCs for digitizing analog pulses produced by the front-end electronics opens the way to a fully digital processing, which can be implemented by means of application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). Over the last decades, the sample rate and dynamic range of high-speed ADCs underwent a continuous growth and it required the development of suitable interface protocols. In order to overcome bandwidth limitations of previous standards and to simplify the printed circuit board routing, the Joint Electron Device Engineering Council has proposed the JESD204B serial interface protocol. JESD204B supports data rates of up to 12.5Gbps per serial lane and foresees dedicated features to guarantee a deterministic timing of the conversion and to support the synchronization of multiple converters in the same system. The timing predictability of the protocol is of great interest for TDAQ systems, where it is often required to operate the whole apparatus synchronously in order to preserve critical trigger information and timing-related data. It is important to note that JESD204B is designed for local operation, i.e. the data producer and consumer chips are meant to be on the same board or anyway at distances of the order of few centimeters, while TDAQ systems may require the converter to be remote (e.g. on-detector) with respect to the logic receiving the data (e.g. off-detector). In this work, we present an original JESD204B-compliant architecture we designed, which is able to operate an analog-to-digital converter in a remote fashion. 
Our design includes a deterministic-latency high-speed serial link, which is the only connection between the local and remote logic of the architecture and which preserves the deterministic timing features of the protocol. By means of our solution it is possible to read data out of several converters, even remote from each other, and keep them operating synchronously. Our link also supports forward error correction (FEC), in view of operation in radiation areas (e.g. on-detector in TDAQ systems). We discuss an implementation of our concept in a latest generation FPGA (Xilinx Kintex-7 325T), its logic footprint, frequency performance and power consumption. We present measurements of the timing jitter and latency stability of JESD204B timing-critical signals forwarded over the link. We discuss the radiation-effect mitigation strategies we adopted for protecting the firmware in the on-detector FPGA, such as triple modular redundancy and configuration scrubbing. We also describe a demo application of our architecture with a high-speed ADC running a 16-bit dual channel conversion at 370 Msps (corresponding to a 7.4 Gbps line rate).
        Speaker: Dr Raffaele Giordano (Universita' degli Studi di Napoli "Federico II" and INFN)
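The 7.4 Gbps line rate quoted for the demo application is consistent with the usual JESD204B lane-rate arithmetic, in which the payload is 8b/10b encoded. The sketch below reproduces that number under the assumption, not stated in the abstract, that each of the two 16-bit channels uses its own serial lane.

```python
def jesd204b_lane_rate_gbps(sample_rate_msps, bits_per_sample, converters, lanes):
    """Lane rate in Gbps: payload bits per second, times 10/8 for 8b/10b
    encoding, shared equally between the serial lanes."""
    payload_bps = sample_rate_msps * 1e6 * bits_per_sample * converters
    return payload_bps * 10 / 8 / lanes / 1e9


# 16-bit samples, two channels at 370 Msps, one lane per channel (assumed).
rate = jesd204b_lane_rate_gbps(370, 16, converters=2, lanes=2)
print(f"{rate:.1f} Gbps per lane")
```

With these inputs the result stays below the 12.5 Gbps per-lane limit mentioned in the abstract.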
      • 11:20
        IRIO Technology: developing applications for Advanced DAQ systems using FPGAs 20m
        IRIO Technology is a set of software tools that, together with National Instruments Reconfigurable Input/Output (RIO) hardware, simplifies the development cycle and the integration of advanced data acquisition applications in EPICS. RIO devices are implemented using XILINX FPGAs. These reconfigurable devices have to be programmed using the LabVIEW for FPGA tool, which works directly with the XILINX compiler. EPICS (Experimental Physics and Industrial Control System) is a very well-known middleware used as a distributed control system in scientific facilities running complex experiments. Among the most important facilities, EPICS is used at APS (Advanced Photon Source), ALS (Advanced Light Source), the International Thermonuclear Experimental Reactor (ITER), and the European Spallation Source (ESS). The main objective of this contribution is to provide a method to develop EPICS IOCs based on FPGA devices in which the user has defined a specific functionality. A traditional DAQ system is based on specific hardware with vendor-defined functionalities. Therefore, the EPICS device support is implemented according to these fixed specifications. If another hardware model with different functionalities is selected, the user needs to modify the EPICS device support. In contrast, IRIO provides an EPICS device support capable of connecting with any implementation in the FPGA, provided the user has followed a set of rules, named profiles. Using these rules, the user can develop different implementations with multiple resources: DMA channels to move acquired data, registers for input/output operations, etc. The use of IRIO shortens the development cycle because the user only needs to design the specific application for the FPGA using LabVIEW for FPGA. With the help of templates, users develop their specific applications and compile the design with the XILINX tools; this step is transparent to the user because LabVIEW for FPGA interacts with the XILINX tools. 
Once the bitfile has been obtained, the user only needs to create the IOC and instantiate the PV templates provided, according to the resources implemented in the FPGA. Therefore, the integration with EPICS is reduced to a configuration process, eliminating the need to develop code for device supports. IRIO tools also contain additional software packages to implement EPICS device support using the ITER Nominal Device Support (NDS) approach, or standalone applications that use the RIO devices directly without the intervention of EPICS. IRIO software tools are currently integrated in the ITER CODAC Core System as the main component to develop applications for the cRIO and FlexRIO devices in the ITER fast controller catalog. IRIO is distributed under a GPL V2 license to other research facilities using EPICS and to other users interested in the development of advanced data acquisition applications.
        Speaker: Mariano Ruiz (Universidad Politécnica de Madrid)
      • 11:40
        An FPGA-Based Track Finder for the L1 Trigger of the CMS Experiment at the High Luminosity LHC 20m

        A new tracking system is under development for operation in the CMS experiment at the High Luminosity LHC. It includes an outer tracker which will construct stubs, built by correlating clusters in two closely spaced sensor layers for the rejection of hits from low transverse momentum tracks, and transmit them off-detector at 40 MHz. If tracker data is to contribute to keeping the Level-1 trigger rate at around 750 kHz under increased luminosity, a crucial component of the upgrade will be the ability to identify tracks with transverse momentum above 3 GeV/c by building tracks out of stubs. A concept for an FPGA-based track finder using a fully time-multiplexed architecture is presented, where track candidates are identified using a projective binning algorithm based on the Hough Transform. A hardware system based on the MP7 MicroTCA processing card has been assembled, demonstrating a realistic slice of the track finder in order to help gauge the performance and requirements for a full system. This paper outlines the system architecture and algorithms employed, highlights some of the first results from the hardware demonstrator, and discusses the prospects and performance of the completed track finder.

        Speaker: Thomas Schuh (KIT - Karlsruhe Institute of Technology (DE))
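The idea of projective binning with a Hough transform can be sketched in software as follows. This is only an illustration under simplifying assumptions: each stub is a hypothetical (r, phi) pair, high-momentum tracks are approximated by the straight line phi0 = phi + m*r in the transverse plane (with the slope m proportional to charge over transverse momentum, in arbitrary units), and votes are counted per stub rather than per detector layer. The actual firmware algorithm, binning and thresholds are not described in the abstract.

```python
import math


def hough_track_candidates(stubs, m_bins, phi0_bins, phi0_range, threshold):
    """Vote in a (slope, phi0) array and return the populated cells.

    stubs      : list of (r, phi) pairs
    m_bins     : list of slope values, one per row of the array
    phi0_bins  : number of equal-width phi0 columns
    phi0_range : (min, max) of the phi0 axis
    threshold  : minimum number of votes for a track candidate
    Returns a list of (m, phi0_bin_centre, votes).
    """
    phi_min, phi_max = phi0_range
    width = (phi_max - phi_min) / phi0_bins
    votes = [[0] * phi0_bins for _ in m_bins]
    for r, phi in stubs:
        # Each stub is a straight line in (m, phi0) space: one vote per row.
        for i, m in enumerate(m_bins):
            j = math.floor((phi + m * r - phi_min) / width)
            if 0 <= j < phi0_bins:
                votes[i][j] += 1
    return [(m_bins[i], phi_min + (j + 0.5) * width, votes[i][j])
            for i in range(len(m_bins))
            for j in range(phi0_bins)
            if votes[i][j] >= threshold]


# Six hypothetical stubs lying on one track with m = 0.002, phi0 = 0.525.
stubs = [(r, 0.525 - 0.002 * r) for r in (20, 35, 50, 70, 90, 110)]
candidates = hough_track_candidates(
    stubs, [-0.004, -0.002, 0.0, 0.002, 0.004], 20, (0.0, 1.0), threshold=5)
```

Because every stub updates each row independently, the voting loop maps naturally onto parallel logic, which is one reason this family of algorithms suits FPGAs.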
    • 12:00 13:15
      Break: Lunch Centro Congressi (Padova)

      Centro Congressi

      Padova

    • 13:15 14:15
      RTA 2 Centro Congressi (Padova)

      Centro Congressi

      Padova

      Conveners: Fukun Tang (University of Chicago (US)), Pierre-Andre Amaudruz (TRIUMF (CA))
      • 13:15
        Phase stabilization over a 3 km optical link with sub-picosecond precision for the AWAKE experiment 20m
        The Advanced Wakefield Experiment (AWAKE) project aims at studying the proton-driven plasma wakefield acceleration technique for the first time. For that purpose, a testing facility is currently being built at CERN at the former CNGS (CERN Neutrinos to Gran Sasso) facility. The proton beam at an energy of 400 GeV from the Super Proton Synchrotron (SPS) is used to accelerate an electron beam to the GeV scale over ten meters of plasma. Previous experiments using electron and positron beam drivers showed acceleration gradients up to 50 GV/m: three orders of magnitude higher than RF cavities currently used. The wakefield acceleration principle requires very precise synchronization between the driver (proton) beam, a laser pulse seeding an instability in a rubidium plasma and the witness (electron) beam. By design, the reference frequency (2997.9 MHz) of the electron beam and the laser repetition rate (88.2 MHz), used for generating the plasma, are derived locally at the AWAKE site and, therefore, the corresponding achievable timing errors are very small. However, the low-level RF (LLRF) system controlling the proton beam of the SPS accelerator needs to synchronize its reference frequency (400.8 MHz) to the plasma (laser) and electron references. Even though the SPS LLRF system is located about 3 km away from the laser and electron beam electronics, the maximum phase drift between the three references has been specified to be in the sub-picosecond range, in order to achieve the maximum energy transfer from the driver to the witness beam. Since 2014, phase drift measurements over long optical fiber links have been performed to validate the optical medium for reference signal transmission with low phase drifts. In addition, a phase drift measurement logging system has been set up, storing continuously outside temperature and phase difference between a local reference signal and a copy of the same signal sent over a 6 km optical fiber loopback. 
In order to cope with the requirements of the AWAKE experiment, we have designed a custom VME board that includes optical transceivers for the synchronization signals to be sent/received, a phase-error discrimination stage and a digital control part embedded in a Xilinx Spartan-6 FPGA. High quality delay lines are used to compensate changes in the phase drift of the optical medium. The delay lines are set using coarse and fine tuning. A real-time control algorithm only performs coarse steps when the SPS proton beam is not present in between acceleration cycles. In addition, an evolved algorithm to compensate for power supply noise and variations of board temperature conditions has been developed.
        Speaker: Dr Diego Barrientos (CERN)
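The coarse/fine compensation rule described in the abstract, where coarse delay steps are only allowed while no proton beam is present, can be sketched as below. All numbers (fine range, coarse step size) and the class and function names are hypothetical; the actual control algorithm in the FPGA is not detailed in the abstract.

```python
def phase_deg_to_ps(phase_deg, freq_mhz):
    """Convert a phase difference at a given frequency into picoseconds."""
    return phase_deg / 360.0 / (freq_mhz * 1e6) * 1e12


class DelayCompensator:
    """Cancel a measured link drift with a coarse and a fine delay line."""

    def __init__(self, fine_range_ps=10.0, coarse_step_ps=10.0):
        self.fine_range_ps = fine_range_ps    # hypothetical +/- fine range
        self.coarse_step_ps = coarse_step_ps  # hypothetical coarse step
        self.coarse_ps = 0.0
        self.fine_ps = 0.0

    def update(self, drift_ps, beam_present):
        """Apply one control step and return the residual error in ps."""
        wanted = -drift_ps
        if not beam_present:
            # Coarse steps are only taken between acceleration cycles.
            steps = round(wanted / self.coarse_step_ps)
            self.coarse_ps = steps * self.coarse_step_ps
        # The fine line absorbs what is left, within its limited range.
        fine = wanted - self.coarse_ps
        self.fine_ps = max(-self.fine_range_ps, min(self.fine_range_ps, fine))
        return drift_ps + self.coarse_ps + self.fine_ps


compensator = DelayCompensator()
residual_with_beam = compensator.update(14.0, beam_present=True)
residual_without_beam = compensator.update(14.0, beam_present=False)
```

In this sketch a drift larger than the fine range leaves a residual error while the beam is present, and the error is removed at the next opportunity for a coarse step.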
      • 13:35
        Commissioning and performance of the common readout system for the Belle II experiment 20m
        The Belle II experiment, which aims to search for physics beyond the standard model through precision measurements, is scheduled to start in 2017. The target luminosity of SuperKEKB, an asymmetric electron–positron collider, is 8x10^35 cm^-2 s^-1 and the Level-1 trigger rate is estimated to reach as much as 30 kHz. The readout system of the Belle II experiment must receive a large amount of data from the different front-end electronics of several sub-detectors, then process the data and send them downstream before the event-rate reduction is performed by the high level trigger system. We have developed a common readout system for all sub-detectors except the innermost pixel detector; it consists of COmmon Pipelined Platform for Electronics Readout (COPPER) boards and readout PC servers. A COPPER board is equipped with receiver daughter cards, called High Speed Link Boards (HSLB), which provide a common interface with the front-end electronics using a home-brew protocol called “Belle2link”, based on Rocket I/O and optical fibers. The COPPER board also has a PCI mezzanine processor card, equipped with a 1.6 GHz Atom CPU, to format data and send them downstream via Gigabit Ethernet. The data are further processed by readout PCs, which check for data corruption, reduce the data by merging redundant information, and perform partial event building before sending the data to the full event builder and high-level trigger PC farm. To check the performance of the readout system, we carried out a stress test using dummy data produced by the FPGAs on the HSLB boards. We estimated the event size of each sub-detector with the help of Monte Carlo simulation and produced a data flow whose throughput is expected to be the same as when the accelerator reaches its target luminosity. This test showed that the readout system can handle the target event rate of 30 kHz from each detector. 
In addition, we commissioned the readout system using the Electromagnetic CaLorimeter (ECL) and Central Drift Chamber (CDC) of the Belle II experiment in their cosmic ray tests. Although the event rate was lower than that of the actual Belle II experiment, the basic functionality of the readout system worked well with real data from the detectors.
        Speaker: Satoru Yamada (KEK)
      • 13:55
        Real-time implementation in JET of the SPAD disruption predictor using MARTe 20m
        One of the major problems in present tokamaks is the presence of disruptions. The disruptive event can produce serious damage to the device due to the emission of large quantities of energy to small areas of the plasma facing components and the strong electro-magnetic forces produced. Mitigation techniques must be applied to prevent or reduce this damage. Obviously, a pre-requisite for mitigation techniques is the existence of accurate and reliable disruption predictors. In this paper, the real-time implementation in JET of a new type of disruption predictor is presented. The new predictor, Single signal Predictor based on Anomaly Detection (SPAD), does not require past discharges for training purposes. The implementation is based on the Multi-threaded Application Real-Time executor (MARTe) framework. MARTe implementations consist of several software blocks called Generic Application Modules (GAMs) that can be chained in series or parallel. The implementation consists of 6 configurable MARTe GAMs that are executed sequentially. These GAMs process the Locked Mode signal sampled at 1 kHz every 2ms using the latest 32 samples. The process includes the calculation of the Haar Wavelet Transform approximation coefficients, the Mahalanobis distance of a feature vector with respect to a cluster formed by all the previous feature vectors, and an outlier factor depending on the mean and standard deviation of all previously calculated Mahalanobis distances. Due to the real-time requirements, some optimizations over the original algorithm were implemented, including: calculation of Haar approximation coefficients in one step, independently of the Haar transform level applied; and updating the mean, standard deviation and covariances used in the Mahalanobis distance and the calculation of the outlier factor. All this processing is performed in less than 1ms. 
Analysis over all of JET’s ITER-like Wall campaigns shows that SPAD has better prediction results than the Advanced Predictor Of DISruptions (APODIS) and the Locked Mode Predictor based on Threshold criterion (LMPT), with 8.98% of false alarms, 10.60% of missed alarms, 3.18% of tardy detections, 83.57% of valid alarms, 2.65% of premature alarms and an average anticipation time of 389 ms. The optimizations, performance results, and possible improvements will be presented.
        Speaker: Sergio Esquembri Martínez (Universidad Politecnica de Madrid)
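Two of the optimizations named in the abstract, the one-step calculation of Haar approximation coefficients and the incremental update of the mean and standard deviation, can be illustrated with the sketch below. It is not the MARTe implementation: the function and class names are hypothetical, the Mahalanobis distance step is omitted, and the z-score used as outlier factor is an assumption, since the abstract does not give the exact formula.

```python
import math


def haar_approx(samples, level):
    """Haar approximation coefficients at a given level in one step.

    Applying the Haar low-pass step `level` times is equivalent to summing
    blocks of 2**level samples and scaling by 2**(-level/2).
    len(samples) is assumed to be a multiple of 2**level.
    """
    block = 2 ** level
    scale = 2.0 ** (-level / 2.0)
    return [scale * sum(samples[i:i + block])
            for i in range(0, len(samples), block)]


class RunningStats:
    """Mean and standard deviation updated per value (Welford's method),
    so that previously seen distances never need to be stored."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def outlier_factor(x, stats):
    """Distance of x from the running mean in units of the running std
    (assumed definition for this sketch)."""
    return (x - stats.mean) / stats.std if stats.std > 0 else 0.0


# A window of 32 samples, as in the abstract, reduced to 4 coefficients.
window = [float(i % 4) for i in range(32)]
features = haar_approx(window, level=3)
```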
    • 14:15 15:20
      Mini Oral 3 Centro Congressi (Padova)

      Centro Congressi

      Padova

      Conveners: Jinyuan Wu (Fermi National Accelerator Lab. (US)), Mr Pierre-Andre Amaudruz (TRIUMF (CA))
    • 15:20 15:40
      Break: Coffee Centro Congressi (Padova)

      Centro Congressi

      Padova

    • 15:40 16:20
      Invited talk Centro Congressi (Padova)

      Centro Congressi

      Padova

      Conveners: Patrick Le Du (DAPNIA), Rejean Fontaine (Université de Sherbrooke)
      • 15:40
        Evolution of Data Acquisition and Processing in Medical Imaging with Radiation 40m

        With the discovery of X-rays and radioactivity more than a century ago, the need for data acquisition and processing soon became one of the foundations of medical imaging. In contrast to the early devices, where simple integration of the radiation signal over time was the way to build contrast in images, modern imaging systems rely on the detection and real-time characterization of every quantum of radiation to extract relevant information for forming images reflecting underlying physical parameters. The most common parameters of interest in imaging modalities using ionizing radiation, such as positron emission tomography (PET) and X-ray computed tomography (CT), are position of interaction, energy and time. Traditionally, such information was extracted through pulse shaping and processing with analog components prior to digitization and data acquisition. Nowadays, most systems rely on early digitization of the signals by sampling with free-running analog-to-digital converters and by replacing analog processing with real-time digital algorithms implemented in high-density FPGAs and DSPs. While such a fully digital data acquisition architecture improves flexibility, scalability and upgradability, future requirements make this approach impractical due to performance limitations, power management constraints, and prohibitive cost. For instance, the number of channels had to be raised from the few hundred of early devices to several tens of thousands to reach sub-mm spatial resolution in PET. Similarly, the coincidence time resolution had to be improved from tens of ns to the few hundred ps regime to achieve time-of-flight measurement. The introduction of single-photon counting spectral CT requires individual events not only to be counted, but also to have their energy recorded simultaneously. New technological advances must be sought to cope with these stringent requirements. 
The challenges to reach these next frontiers in medical imaging will be reviewed with special emphasis on the significance for real-time data acquisition and processing.

        Speaker: Roger LECOMTE
    • 16:20 17:50
      Trigger 1 Centro Congressi (Padova)

      Centro Congressi

      Padova

      Conveners: Christian Bohm (Stockholm University (SE)), David Abbott (Jefferson Lab)
      • 16:20
        Errors detection, handling and recovery at the High Level Trigger of the ATLAS experiment at the LHC 30m
        The complexity of the ATLAS High Level Trigger (HLT) requires a robust system for error detection and handling during online data-taking. It also requires an offline system for the recovery of events where no trigger decision could be made online. The error detection and handling system ensures smooth operation of the trigger system and provides debugging information necessary for offline analysis and diagnosis. In this presentation, we give an overview of the error detection, handling and recovery of problematic events at the ATLAS HLT.
        Speaker: Mark Stockton (McGill University (CA))
      • 16:50
        Two-dimensional wavelet trigger in radio detection of cosmic rays 20m
        The radio technique allows a detailed study of the electromagnetic part of an air shower in the atmosphere and provides information complementary to that obtained by water Cherenkov surface detectors, which are predominantly sensitive to the muonic content of an air shower at the ground. A large-scale radio detector array needs a sophisticated self-trigger, due to the limited communication data rate. One of the promising approaches to observing ultra high-energy cosmic rays (UHECRs) through their coherent radio emission is a wavelet trigger based on an FPGA. The main motivation for developing much more sophisticated algorithms is that the efficiency of the radio self-trigger is often very low: most registered events contain only noise. A significant improvement of the trigger efficiency is therefore crucial. Off-line data analysis requires a non-negligible amount of manpower, so a much wiser approach is to develop a much more efficient trigger. The wavelet trigger under development is an alternative to the currently operating algorithms. The paper presents first results from the two-dimensional wavelet trigger, implemented on the new Front-End Board based on the Cyclone V FPGA 5CEFA9F31I7. The board contains 3 inputs for the 3 PMTs of a water Cherenkov surface detector plus 2 inputs for a radio detector with two polarizations. The wavelet trigger investigates the distribution of partial power contributions for two Fourier indices, simultaneously in the time and frequency domains. Radio signals were measured by an LPDA antenna in a relatively contaminated environment in Lodz (Poland). We tested the wavelet trigger on radio signals both directly and after cleaning by various filters.
        Speaker: Zbigniew Szadkowski (University of Lodz)
      • 17:10
        The Level 1 Trigger System for Belle II CDC 20m
        The Belle II experiment at the SuperKEKB collider at KEK aims at high precision measurements in B physics. To select the physics events of interest at the peak luminosity of 8x10^35/cm^2/s, a multi-layer trigger system has been developed for the central drift chamber (CDC). The CDC is a multi-wire drift chamber for charged particle tracking. It comprises 14 thousand sense wires in 9 super-layers, 5 in the axial direction and the other 4 with stereo angles. The CDC trigger system first collects the wire hit information from all super-layers and finds the track segments in each super-layer. The identified track segments are passed to various tracking stages. The 2-dimensional tracking applies a Hough transformation to the axial super-layers to perform track pattern recognition. To further remove tracks from the beam background, a sophisticated 3-dimensional tracking has been developed, which uses all available hit information to achieve a z-vertex trigger with 1 cm precision. In addition, a complementary neural-network tracking runs in parallel to ensure the total efficiency. The results from all tracking stages are fed to a global decision logic module (GDL) to make the final trigger decision. Two types of FPGA-based electronics boards, the merger and the universal trigger board (UT3), are designed for this trigger system. The trigger data flow is pipelined through gigabit optical serial links at a 32 MHz data rate. Three types of FPGA serial transceiver ports, GTP, GTX, and GTH, with bandwidths from 3 Gbps to 11 Gbps, are used in the trigger chain. To reach a trigger decision within 5 microseconds, a user-defined protocol has been developed to reduce the latency of the optical transmission. A set of transmission rules is defined for data flow control and synchronization among all stages. The design details, current status, and performance studies will be presented. Index Terms: Belle II, CDC, real time trigger, FPGA, optical transceiver
        Speaker: Dr Jing-Ge Shiu (National Taiwan University)
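The 2-dimensional Hough tracking mentioned in the abstract above can be illustrated with a minimal sketch. It is not the Belle II firmware: it assumes tracks are circles through the origin in the transverse plane, so a hit at polar coordinates (r, phi) satisfies kappa = sin(phi - phi0) / r, where kappa = 1/(2R) is a signed curvature and phi0 the track direction at the origin. The function names, bin counts, curvature range and hit radii are hypothetical choices for illustration only.

```python
import math

def hough_2d(hits, n_phi=64, n_kappa=32, kappa_max=0.02):
    """Fill a (phi0, kappa) accumulator from hits given as (r, phi).

    Assumption: each track is a circle through the origin, so a hit votes
    for every (phi0, kappa) pair with kappa = sin(phi - phi0) / r.
    """
    acc = [[0] * n_kappa for _ in range(n_phi)]
    for r, phi in hits:
        for i in range(n_phi):
            phi0 = 2 * math.pi * (i + 0.5) / n_phi  # bin centre
            # Keep only the outgoing branch; this removes the mirror
            # solution (phi0 + pi, -kappa) of the same circle.
            if math.cos(phi - phi0) <= 0:
                continue
            kappa = math.sin(phi - phi0) / r
            j = int((kappa + kappa_max) / (2 * kappa_max) * n_kappa)
            if 0 <= j < n_kappa:
                acc[i][j] += 1
    return acc

def peak(acc):
    """Return (phi0_bin, kappa_bin, votes) of the most populated cell."""
    return max(
        ((i, j, v) for i, row in enumerate(acc) for j, v in enumerate(row)),
        key=lambda cell: cell[2],
    )

# Five hypothetical hits on one circle, at radii of 20 to 100 cm.
kappa_true = 0.005625
phi0_true = 2 * math.pi * 10.5 / 64
hits = [(r, phi0_true + math.asin(r * kappa_true)) for r in (20, 40, 60, 80, 100)]
print(peak(hough_2d(hits)))
```

Hits from one track pile up in a single accumulator cell, while background hits spread their votes, which is why a vote threshold on the accumulator can serve as a pattern-recognition step.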
      • 17:30
        A VXS [VITA41] Trigger Processor for the 12GEV Experimental Programs at Jefferson Lab 20m
        Now that the 12GeV experimental program at Jefferson Lab is successfully underway in Hall D (GlueX experiment), a new version of the trigger processor has been designed and will be commissioned in the CLAS12 experiments starting in the fall of 2016. The new trigger processor is a second generation development and combines the functions of two current modules, the Crate_Trigger_Processor[CTP] and Global_Trigger_Processor[GTP]. The new board, called the VXS_Trigger_Processor[VTP], will be used in front-end and global trigger VXS crates. In front-end crates it can receive and process trigger information from a variety of existing Jefferson Lab modules over the VXS backplane (64 full duplex lanes at up to 8.5Gbps each): up to 256 channels of flash ADC, up to 1536 channels of discriminated drift chamber hits, and up to 16384 channels of silicon vertex tracker hits. Cluster and track reconstruction is performed on the Virtex 7 FPGA and the results are sent to the Sub_System_Processor[SSP] in the global trigger crate using up to 4 of the QSFP transceivers (up to 34Gbps each). In the global trigger crate the VTP will perform final high level trigger decisions (detector coincidence/geometry matching) based on the trigger information it receives from up to 16 Sub_System_Processor[SSP] modules, each of which receives data from up to 8 VXS crates, for a total capacity of 128 front-end VXS crates. Additionally, the VTP uses a 1GHz Zynq SoC processor to provide a configuration, control, and diagnostic Ethernet interface. The Zynq contains a 40Gbps Ethernet interface that will be combined with a hardware accelerated TCP/IP stack residing in the programmable logic section of the chip. In the future the 40Gbps interface will be used for front-end module readout, eliminating the use of the VME64x/2eSST interface which operates at a much lower rate of 200MB/s. This readout upgrade is intended to provide flexibility for future data hungry experiments or level 3 CPU/GPU farm triggers. 
This paper details the hardware performance and implementation, and some of the trigger algorithm requirements and implementations for the CLAS12 experiment.
        Speakers: Mr Benjamin Raydo (Jefferson Lab), Chris Cuevas (Jefferson Lab)
    • 17:50 18:20
      Bus Transfer to Padova City Center
    • 19:30 20:15
      Concert: San Gaetano church San Gaetano Church, Via Altinate (Padova)


      • 19:30
        Concert Program 45m

        anonymous (XVI century):
        Ave Maria

        Bernardo Pasquini (1637 - 1710 AD):
        Toccata per organo

        Tomás Luis de Victoria (1548-1611 AD):
        Ave Maria a 4 voci (attributed)
        Ave Maria a 8 voci

        Bernardo Pasquini:
        Sonata I a due bassi per organo

        Carlo Gesualdo (1566-1613 AD):
        Ave dulcissima Maria
        mottetto a 5 voci

        Andrea Gabrieli (1533 - 1585 AD):
        Intonazione per organo

        Claudio Monteverdi (1567-1643 AD):
        Domine, ne in furore tuo
        mottetto a 6 voci

        Antonio Vivaldi (1678-1741 AD):
        Magnificat RV 610 per soli, coro ed organo
        Magnificat (Coro)
        Et exultavit (Soli)
        Et misericordia (Coro)
        Fecit potentiam (Coro)
        Deposuit potentes (Coro)
        Esurientes (Soli)
        Suscepit Israel (Coro)
        Sicut locutus est (Soli)
        Gloria - Amen (Coro)

    • 20:15 22:45
      Conference Dinner Centro Culturale Altinate / S. Gaetano, Via Altinate, 71, 35121 Padova (Padova)

    • 07:45 08:30
      Bus Transfer to Conference Venue


    • 08:30 10:10
      Trigger 2 Centro Congressi (Padova)


      Conveners: Patrick Le Du (DAPNIA), Pierre-Andre Amaudruz (TRIUMF (CA))
      • 08:30
        GPU for triggering in HEP experiments 20m
        The aim of the GAP project is the deployment of Graphic Processing Units (GPU) in real-time applications, ranging from high-energy physics online event selection (trigger) to medical imaging reconstruction. The final goal of the project is to demonstrate that GPUs can have a positive impact in sectors that differ in rate, bandwidth, and computational intensity. General-purpose computing on GPUs is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies, and the increase in link and memory throughputs, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming feasible. We will discuss the use of online parallel computing on GPUs for a synchronous low-level trigger, focusing on tests performed on the CERN NA62 experiment trigger system. GPUs typically show deterministic behaviour in terms of processing latency, but assessing the real-time features of a standard GPGPU system requires a careful characterization of all subsystems. The networking subsystem turns out to be the most critical one in terms of latency fluctuations. Our envisioned solution to this issue is NaNet, an FPGA-based PCIe Network Interface Card (NIC) that enables a GPUDirect connection. Results obtained parasitically with respect to the NA62 trigger system, during standard data taking, will be shown. The use of GPUs in higher trigger levels is also considered. In particular we discuss how specific trigger algorithms can be parallelized and thus benefit from the implementation on the GPU architecture, in terms of increased execution speed and a more favourable dependency on the complexity of the analyzed events. 
Such improvements are particularly relevant for the foreseen LHC luminosity upgrade, where highly selective algorithms will be crucial to maintain sustainable trigger rates with very high pileup. We will give details on how these devices can be integrated in a typical LHC trigger system and benchmark their performance. As a case study, we will consider the ATLAS experimental environment and propose a GPU implementation for a typical muon selection in a high-level trigger system.
        Speaker: Michele Martinelli (INFN)
      • 08:50
        Fast online reconstruction and online calibration in the ALICE High Level Trigger 20m
        ALICE (A Large Ion Collider Experiment) is one of four major experiments at the Large Hadron Collider (LHC) at CERN. The ALICE High Level Trigger (HLT) is a cluster of 200 nodes, which reconstructs collisions as recorded by the ALICE detector in real-time. It employs a custom online data-transport framework to distribute data and workload among the compute nodes. ALICE employs subdetectors sensitive to environmental conditions such as pressure and temperature, e.g. the Time Projection Chamber (TPC). A precise reconstruction of particle trajectories requires the calibration of these detectors. Performing the calibration in real time in the HLT improves the online reconstructions and renders certain offline calibration steps obsolete, speeding up offline physics analysis. For LHC Run 3, starting in 2020, when data reduction will rely on reconstructed data, online calibration becomes a necessity. Reconstructed particle trajectories form the basis for the calibration, making fast online tracking mandatory. The main detectors used for this purpose are the TPC and ITS. Reconstructing the trajectories in the TPC is the most compute-intensive step. We present several components of the ALICE High Level Trigger used for fast event reconstruction and then focus on newly developed components for online calibration. The TPC tracker employs GPUs to speed up the processing and is based on a Cellular Automaton and the Kalman filter. It has been used successfully in proton-proton, lead-lead, and proton-lead runs between 2011 and 2015. We have implemented a wrapper to run ALICE offline analysis and calibration software inside the HLT. Normally, the HLT works in an event-synchronous mode. We have added asynchronous processing capabilities to support long-running calibration tasks. In order to improve the resiliency, an isolated process performs the asynchronous operations such that even a fatal error does not disturb data taking. 
We have complemented the original loop-free HLT chain with ZeroMQ data-transfer components. The ZeroMQ components facilitate a feedback loop, that after a short delay inserts the calibration result created at the end of the chain back into tracking components at the beginning of the chain. On top of that, these components are used to ship QA histograms to the Data Quality Monitoring (DQM) and to obtain information of pressure and temperature sensors needed for calibration. All these new features are implemented in a general way, such that they have use-cases aside from online calibration. In order to gather sufficient statistics for the calibration, the asynchronous calibration component must process enough events per time interval. Because the calibration is only valid for a certain time period the delay until the feedback loop provides updated calibration data must not be too long. A first full-scale test of the online calibration functionality was performed during the 2015 heavy-ion run under real conditions. We present a timing analysis of this first online-calibration test, which indicates that the HLT is capable of online TPC drift time calibration fast enough to calibrate the tracking via the feedback loop.
        Speaker: Dr David Rohr (Johann-Wolfgang-Goethe Univ. (DE))
      • 09:10
        High-speed, low-latency readout system with real-time trigger based on GPUs 20m
        Significant new challenges are continuously confronting the High Energy Physics (HEP) experiments at the Large Hadron Collider (LHC) at CERN. The quest for rare new physics phenomena leads to the evaluation of a Graphics Processing Unit (GPU) enhancement for the existing high-level trigger (HLT), made possible by the current flexibility of the trigger system, which not only provides faster and more efficient event selection, but also includes the possibility of new complex triggers that were not previously feasible. At HLT, when the efficient many-core parallelization of event reconstruction algorithms is possible, the benefit of significantly reducing the number of the farm computing nodes is evident. At lower levels, where typically severe real-time constraints are present, we envisioned the possibility to meet the real-time constraints and to reduce data transfer latency and its fluctuations by injecting readout data directly from the FPGA into the GPU memories without any intermediate buffering, thereby offloading the CPU and avoiding OS jitter effects. In order to satisfy such constraints at lower levels, we have developed a custom FPGA-based readout card and implemented a new concept of Direct Memory Access (DMA) capable of moving the data from the FPGA to system memory and/or GPU memory. The readout card is equipped with a Xilinx Virtex-7 FPGA and it is connected to a GPU farm by a generation 3 PCIe x16 data link, capable of a net throughput of up to 13 GB/s. We have integrated the DMA engine with AMD's “Direct GMA” technology to enable data transfers to GPU memory with a measured data throughput of up to 6.4 GB/s in x8 lanes operation mode. On the GPU side, a tracking algorithm for a transverse momentum (pT) trigger is evaluated on an NVIDIA Tesla K40 GPU using Hough-transform methods. A prominent result shows that 500 stubs are processed in only 13 μs with only one GPU core. 
These results show that low GPU processing times combined with low latency and high throughput electronics open a new perspective for a GPU-based low-level trigger system for the CMS experiment. Benchmarks for latency and bandwidth for the proposed readout system are presented, followed by a performance analysis on case studies of the GPU-based low level trigger for the CMS experiment. In addition, the use of DMA in the form of NVIDIA's “GPU Direct” and InfiniBand for low-level trigger will be discussed. Finally, we give an outline of future project activities.
        Speaker: Dr Michele Caselle (Karlsruhe Institute of Technology)
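The throughput figures quoted in the abstract above can be put in context with a short calculation. This is a sketch using the standard PCIe generation 3 parameters (8 GT/s per lane, 128b/130b line encoding); the function name is hypothetical, and attributing the remaining gap to packet and protocol overhead is an assumption, not a statement from the abstract.

```python
def pcie_gen3_ceiling_gbytes(lanes):
    """Line-rate ceiling in GB/s for a PCIe Gen3 link with the given lane count.

    8 GT/s per lane, 128b/130b encoding, 8 bits per byte.
    """
    return 8e9 * lanes * (128 / 130) / 8 / 1e9

# Quoted figures from the abstract: 13 GB/s net (x16), 6.4 GB/s measured (x8).
for lanes, quoted in ((16, 13.0), (8, 6.4)):
    ceiling = pcie_gen3_ceiling_gbytes(lanes)
    print(f"x{lanes}: ceiling {ceiling:.2f} GB/s, quoted {quoted} GB/s "
          f"({quoted / ceiling:.0%} of ceiling)")
```

Both quoted values sit at a bit over 80% of the encoded line rate, which is a plausible range once transaction-layer overhead is taken into account.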
      • 09:30
        ATLAS Level-1 Topological Trigger Performance 20m
        The LHC will collide protons in the ATLAS detector with increasing luminosity through 2016, placing stringent operational and physical requirements on the ATLAS trigger system in order to reduce the 40 MHz collision rate to a manageable event storage rate of 1 kHz, while not rejecting interesting physics events. The Level-1 trigger is the first rate-reducing step in the ATLAS trigger system with an output rate of 100 kHz and decision latency smaller than 2.5 μs. It consists of a calorimeter trigger, muon trigger and a central trigger processor. During the LHC shutdown after Run 1 finished in 2013, the Level-1 trigger system was upgraded including hardware, firmware and software updates. In particular, new electronics modules were introduced in the real-time data processing path: the Topological Processor System (L1Topo). It consists of a single AdvancedTCA shelf equipped with two Level-1 topological processor blades. They receive real-time information from the Level-1 calorimeter and muon triggers, which is processed by four individual state-of-the-art FPGAs. It needs to deal with a large input bandwidth of up to 6 Tb/s, optical connectivity and low processing latency on the real-time data path. The L1Topo firmware includes measurements of angles between jets and/or leptons and of many other kinematic variables based on lists of selected or sorted trigger objects that need to be done within 200 ns. Over one hundred VHDL algorithms are producing trigger outputs and are incorporated into the logic of the central trigger processor, responsible for generating the Level-1 acceptance signal. The addition of the new selections in Level-1 will improve the ATLAS physics reach in a harsher collision environment. The system has been installed and commissioning started during 2015 and will be continued during 2016. As part of the firmware commissioning, the physics output from individual algorithms needs to be simulated and compared with the hardware response. 
An overview of the commissioning process and the early impact on physics results with the new L1Topo system will be illustrated.
        Speaker: Marek Palka (Jagiellonian University (PL))
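The rates and latencies quoted in the abstract above imply a few derived numbers that are easy to check. The sketch below only does that arithmetic; the variable names are hypothetical, and expressing latencies in bunch crossings assumes one crossing every 25 ns (the 40 MHz figure from the abstract).

```python
BUNCH_CROSSING_HZ = 40e6   # collision rate quoted in the abstract
L1_OUTPUT_HZ = 100e3       # Level-1 output rate
STORAGE_HZ = 1e3           # event storage rate
L1_LATENCY_S = 2.5e-6      # upper bound on the Level-1 decision latency
L1TOPO_BUDGET_S = 200e-9   # time budget for the L1Topo algorithms

# Rejection factor needed at Level-1 and in the later trigger stages.
l1_rejection = BUNCH_CROSSING_HZ / L1_OUTPUT_HZ
later_rejection = L1_OUTPUT_HZ / STORAGE_HZ

# Crossings that occur while a decision is pending; front-end data must be
# buffered for at least this many crossings.
pipeline_depth = round(L1_LATENCY_S * BUNCH_CROSSING_HZ)
topo_crossings = round(L1TOPO_BUDGET_S * BUNCH_CROSSING_HZ)

print(l1_rejection, later_rejection, pipeline_depth, topo_crossings)
```

The 200 ns budget for the topological algorithms corresponds to only a handful of bunch crossings out of the total Level-1 latency, which is why the algorithms are implemented in FPGA logic.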
      • 09:50
        The LHCb Trigger in Run-II 20m
        The LHCb trigger system consists of a hardware level, which reduces the event rate of 30 MHz of inelastic collisions to 1 MHz, at which the detector is read out. In the subsequent High Level Trigger, based on a farm of PCs, the event rate is reduced to a level that can be stored and processed offline. For Run-II, the system has been upgraded such that the output of the first stage of the HLT is buffered to the disks on the farm nodes, with a total capacity of 5 PB, while detector alignment and calibration tasks are performed in real time. This, together with improvements to the reconstruction algorithms and an increase in the performance of the PC farm, allows LHCb to be the first high energy collider experiment to perform full offline quality event reconstruction within the trigger. A result of this is that LHCb can now perform real time physics analysis based on the trigger level information which is written to storage in a more compact event format and processed through the "Turbo" stream. The output rate of the trigger can thus be increased from 5 kHz in Run-I, up to 12.5 kHz in Run-II. We discuss the performance of the system with reference to the first physics measurements with the 2015 data, which are based on the Turbo stream. We also discuss the impact of this real-time analysis scheme on the physics programme of the LHCb upgrade, relying entirely on the HLT that will perform an offline-like reconstruction on the full 40 MHz LHC bunch crossing rate in real-time.
        Speaker: Rosen Matev (CERN)
    • 10:10 10:30
      Break: Coffee Centro Congressi (Padova)


    • 10:30 12:30
      Poster Session 2 Centro Congressi (Padova)


      • 10:30
        $CLOUD^{CLOUD}$ : general-purpose instrument monitoring and data managing software 1h 35m
        An effective experiment is dependent on the ability to reliably store and deliver data and information to all participant parties regardless of their degree of involvement in the specific parts that make the experiment a whole. Having fast, efficient and ubiquitous access to data will increase its visibility and discussion, resulting in strengthened conclusions regarding it. The $CLOUD^{CLOUD}$ project aims at providing users with a general purpose data acquisition, management and instrument monitoring platform that is easy to use, lightweight and accessible to all participants of an experiment. This work is now implemented in the CLOUD experiment at CERN and will be fully integrated with the experiment as of 2016. Despite being used in an experiment of the scale of CLOUD, this software can also be used in any size of experiment or monitoring station, from single computers to large networks of instruments to monitor any sort of instrument output without influencing the individual instrument's DAQ. Instrument data and meta data is stored and accessed via a specially designed database architecture and any type of instrument output is accepted using our continuously growing parsing application. Multiple databases can be used to separate different data taking periods or a single database can be used if for instance an experiment is continuous. A simple web-based application gives the user total control over the monitored instruments and their data, allowing data visualization and download, upload of processed data and the ability to edit existing instruments or add new instruments to the experiment. When in a network, new computers (nodes) are immediately recognized and added to the system and are able to monitor instruments connected to them. This is achieved by a local and lightweight python-based parsing agent that communicates with a main server application. 
These agents, along with the server application guarantee that all instruments assigned to that computer are monitored. Data parsing is also guaranteed with user defined intervals as low as milliseconds and error information is intuitively displayed so that users can take actions for themselves. This software (server+agents+interface+database) comes in easy and ready-to-use packages that can be installed in any operating system, including Android and iOS systems. It is ideal for use in modular experiments or monitoring stations with large variability in instruments and measuring methods or in large collaborations, where data requires homogenization in order to be effectively transmitted to all involved.
        Speaker: Antonio Dias
      • 10:30
        10-Gbps True Random Number Generator Accomplished in ASIC 1h 35m
        Random number generators are widely used in many applications in a diverse set of areas ranging from statistics to cryptography. For most applications, pseudo random number generators (PRNGs) are quite satisfactory. However, for cryptographic and security applications, true random number generators (TRNGs) are required for their unpredictability. Random numbers generated from quantum entropy are considered the best random numbers. Even so, a quantum TRNG is usually a large system which takes up volume, and some methods may not generate numbers at a fixed frequency. High density and high data output rate are as important as the quality of the TRNG in today's devices and instruments that require true random numbers. We present the design and the preliminary test results of our 10-Gbps TRNG, which is named TRNG2015, in the paper. The entropy source of the TRNG2015 is the jitter of oscillator rings. The TRNG2015 is fabricated in a 130 nm CMOS process and assembled in a 6 mm x 6 mm QFN48 package. It has one LVDS clock input and ten LVDS random data outputs. The output data rate depends on the input clock, which is up to 1 GHz, and the output data rate is up to 1 Gbps per channel and up to 10 Gbps in total. In the TRNG2015 design, an SPI bus is used to configure the entropy source, to enable the channels and to select the post processing structure. With the clock-dependent data rate and the configuration ability, we can balance the power dissipation and the generator function. In the preliminary test, the ASIC chip is fully functional. All ten output channels have 1 Gbps output with a 1 GHz clock input. The output random numbers pass the NIST statistical tests. With ten channels working at 1 Gbps, the power dissipation is only about 700 mW in total. The TRNG2015, with a very small size and a low power dissipation, can generate true random numbers at an ultra-high data rate. 
It can satisfy most of the random number demands from cryptographic and security applications in real-time. With the ultra-high data rate, the applications can use random numbers as needed. The TRNG2015 could even raise the performance of applications that were previously limited by the random number rate.
        Speaker: Xinzhe Wang (University of Science and Technology of China)
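The abstract above states that the generator output passes the NIST statistical tests. As an illustration of what such a test looks like, here is a minimal sketch of the frequency (monobit) test from the NIST SP 800-22 suite. It is only the simplest test of the suite, the function name is hypothetical, and nothing here reflects how the authors actually ran their tests.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the balance of ones and zeros.

    bits: sequence of 0/1 values. Map 0 -> -1 and 1 -> +1, sum, normalise by
    sqrt(n), and convert to a p-value with the complementary error function.
    """
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

# Worked example from the NIST document: the 10-bit sequence 1011010101.
sample = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(round(monobit_p_value(sample), 6))
```

A sequence is conventionally accepted by this test when the p-value is at least 0.01; a stream with a strong bias toward ones or zeros drives the p-value toward zero.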
      • 10:30
        20-Channel 14MeV Neutron Detector Electronics Readout System 1h 35m
        To measure neutron flux accurately, dozens of neutron detectors are often arranged at different locations of the experimental device. These detectors require steady high voltage power supplies during operation, and output a high count rate signal mixed with a large number of X-ray and γ-ray signals. We designed a 20-channel neutron detector readout electronics system to read multi-channel neutron detector signals, and to provide stable high voltage power supplies whose swing is less than 0.5%. The system requires four functions, discrimination and selection of neutron signals, anti-saturation of X-ray and γ-ray signals, high voltage driving of the multi-channel detector and multi-channel consistency correction, to meet the special requirements of 1Mc/s counting rate, 0.1% stability high voltage, multi-channel and standard case. The system, packaged in a 19 inch 3U standard case with low voltage power supply hardware, consists of a signal readout board and a high voltage power supply board. The signal readout board contains two parts: the front-end circuit, containing a protection diode to realize the function of anti-saturation; and the digital system based on FPGA, which accomplishes the functions of pile-up rejection, counting, measuring time control, high voltage control and monitoring, etc. The high voltage power supply board adopts the design of modular power supplies, high precision DACs and the high voltage feedback circuit, and has the features of security and stability, high integration, etc. The software, developed with virtual instrument technology, can control and monitor the instrument, analyze measurement results, manage the experimental data, and perform multi-channel consistency correction. The result of the electrical test shows that the typical value of the neutron count rate of the system is 10Mc/s. 
High voltage stability is less than ±2.5V, the error of threshold is 0.145%, and the measured timing error is less than 1ns. So the system achieves the design goal of the 10Mc/s average count rate, anti-saturation, and stable high voltage output of 20 channels.
        Speaker: Mr Shengquan Liu (Univ. of Science&Tech. of China(CN))
      • 10:30
        A cosmic ray readout system for qualifications of small-strip Thin Gap Chambers of the ATLAS Muon Spectrometer Phase-I upgrade 1h 35m
        The ATLAS experiment at the CERN Large Hadron Collider (LHC) will be upgrading its Muon Spectrometer during the LHC phase-I upgrade in around 2019 to benefit from high luminosity and high energy runs at the LHC. The upgrade will replace the innermost station (namely Small Wheel) of the Muon Spectrometer in the forward region with the so-called New Small Wheel (NSW), in order to improve its Level-1 trigger in the high background rate environment. The NSW employs two types of high rate capable gaseous detectors, namely MicroMesh Gaseous Structure (Micromegas) and small-strip Thin Gap Chamber (sTGC), for on-line reconstruction of muon segments with pointing accuracies of 1 mrad. sTGCs, primary trigger detectors similar to those Thin Gap Chambers instrumented in the present ATLAS Muon Spectrometer but with fine-pitch readout strips, will utilize about 400k readout channels to discriminate bunch crossing in 25 ns and determine hit positions with a precision of about 100 μm per detector layer. Stringent requirements on the timing and spatial measurement precision and the large number of readout channels impose significant challenges on the design of the readout electronics system. The readout front-end boards under development for the sTGC detector will carry four to eight 64-channel sophisticated amplifier and digitization ASICs, four trigger data processing ASICs as well as readout and slow control chips, while their physical size is limited to 14 cm x 6.5 cm in order to be installed on the chamber. These boards are expected to carry hundreds of channels of sensitive analog signals as well as high speed serial lines with speeds up to 4.8 Gbps to shift trigger data off the detectors. The large amount of data to be processed on detector and moved out in both trigger and precision readout paths with a low latency requirement is of big concern. 
We will present the development of the first prototype of the front-end board for the sTGC detector, and the readout scheme and firmware for the mini data acquisition system that has been used to characterize the amplifier and digitization ASIC as well as for integration tests with a prototype detector. Results from the front-end board and the prototype detector integration with cosmic rays, as well as plans to develop a full data rate acquisition system to verify the front-end electronics design, will be discussed.
        Speaker: Mr Xu Wang (Univ. of Sci.&Tech. of CHN(USTC))
      • 10:30
        A DAQ Prototype for the ATLAS small-strip Thin Gap Chamber Phase-I Trigger Upgrade 1h 35m
        ATLAS is one of four experiments located at the Large Hadron Collider (LHC) at CERN, Geneva, Switzerland. The ATLAS detector accepts events from proton–proton collisions at a rate of 40 MHz. Events of interest are selected via a series of trigger decisions. However, low energy protons, generated in the magnet materials between the small wheel (SW) and the end-cap muon detector (EM), hit the end-cap trigger chambers, thus producing fake triggers. As the existing Muon Trigger is not capable of determining the direction of the muon before the magnetic field, muons not emerging from the Interaction Point (IP) can be misidentified as primary trigger candidates. In order to cope with these problems, the ATLAS detector will be upgraded in 2018. We will present a DAQ prototype designed for the ATLAS small-strip Thin Gap Chamber (sTGC) Phase-I trigger upgrade. The prototype includes two VMM chips developed to read out the signals of the sTGC, a Xilinx Kintex-7 FPGA used for the VMM2 configuration and the event storage, and a Gigabit Ethernet Transceiver (GET) working at the physical layer. The VMM2 chip is composed of 64 linear front-end channels. Each channel integrates a Charge Sensitive Amplifier (CSA), a shaper, a stable band-gap referenced baseline, several ADCs and other functions. For large data transmission, a large data transfer rate is needed. The test result shows that the transfer rate of the GET can reach up to 900Mb/s without missing code. In order to test the performance of the sTGC detector under development in the future, an event identifier, implemented via a counter, is appended to each event. The GUI panel mainly provides several functions: the global reset and parameter setting, the VMM configuration, VMM data acquisition and some test modes.
        Speaker: Mr Xu Wang (Univ. of Science & Technology of China(USTC))
      • 10:30
        A FPGA-based Pulse Pile-up Rejection Technique for the Spectrum Measurement in PGNAA 1h 35m
        Prompt gamma neutron activation analysis (PGNAA) is a non-destructive nuclear analytical technique for the determination of elements. By detecting the prompt gamma rays emitted from thermal neutron capture or neutron inelastic scattering reactions, the type of elements and their amount can be obtained through the prompt gamma spectrum. As an on-line and in situ inspection method, the PGNAA technique needs to acquire accurate spectral information in a relatively short measurement time. This requires that the signal processing circuit should possess a high measurement counting rate. Pulse pile-up can destroy the energy spectrum in a high counting rate condition. By making the pulse duration of the various detectors as narrow as possible using pulse-shaping techniques, the probability of pulse pile-up can be decreased. The traditional Gauss pulse-shaping technique can obtain a better signal-to-noise ratio (SNR). However, the quasi-Gaussian waveform has a long trailing edge. If the trailing edge of the shaped pulse can quickly return to the baseline after the peak is collected, then the pulse width will be reduced while the energy information is preserved. Therefore, a pulse pile-up rejection technique is developed. The signal generated by the detector passes the CR-(RC)m shaping circuit and becomes a quasi-Gaussian pulse. The pulse is digitized by the analog-to-digital converter (ADC). The digital signal is discriminated and recorded in a field-programmable gate array (FPGA). Once the peak is found, a feedback signal is sent back to the shaping circuit and controls an analog switch to discharge. Then the trailing edge of the pulse is cut off quickly, and the baseline is recovered. The test resu