

## **ITS DCS Status**

Paolo Martinengo CERN

12<sup>th</sup> ALICE ITS Upgrade, MFT and O2 Asian Workshop, Inha University, Incheon, Republic of Korea

## Outline



- Detector control chain
- ALPIDE monitor
- Detector calibration
- FLP/CRU/RU/Staves topology
- Outlook

## Reminder & acknowledgments



- DCS meeting embedded in WP10 meeting
   (Tuesday, 3:00 PM Geneva time), quite useful and productive discussions
- Presentation by Peter Chochula: https://indico.cern.ch/event/766029/
- Presentation by Markus Keil: https://indico.cern.ch/event/767536/
- Presentation by Johan Alme: https://indico.cern.ch/event/772143/
- Presentation by Simon Voigt Nesbø: https://indico.cern.ch/event/772143/
- Presentation by Sylvain Chapeland: https://indico.cern.ch/event/751993/

## **Detector Control chain**





- ALF/FRED is a universal, flexible, scalable framework for integrating the new detector front-ends into the DCS

Peter



## ITS test bench

Full chain from WINCC to PB







Figure 1: Actual State of ALFRED server

# Powerboard WINCC OA panel



- Control chain deployed also in 167
- Performance under test
- First version of PB control panel
- So far working with prototype version, production one already available (including manual)





# **ALPIDE** monitor implementation

#### Johan & Simon



#### ALPIDE Registers of Interest

- 77 registers in the ALPIDE in total
  - 22 ADC value registers
  - The rest are mostly control registers, plus some status registers
- The most interesting data to monitor are probably:
  - Temperature
  - AVDD
  - DVDD
  - DVSS
  - Single Event Upset (SEU) counter
- But other registers may be interesting as well
- We need a flexible solution that can cater to future needs



#### Alpide Monitor Preliminary Specifications

- Readout of values on Alpide control bus, without colliding with trigger signals
  - Abort gap trigger input in triggered mode
  - Deterministic timing for Alpide control bus access
- Monitored values available via wishbone bus
  - Possible to adapt design for «push scheme» of Alpide monitor data to DCS, on CAN/SWT
- 1 Hz refresh rate of ADC values for all chips
- Configurable monitoring
  - Monitoring of all ADC values
  - Monitoring of user specified registers



#### Alpide Monitor Modes of Operation

- «Idle mode»
  - When there is no beam
  - Monitoring readout at will, or periodically
- Triggered mode
  - Perform monitoring readout on abort gap trigger
  - This is the most challenging mode to implement
- Continuous mode
  - RU generates periodic triggers for the ALPIDEs
  - Perform monitoring readout after trigger



#### **Abort Gap**

- The abort gap is 119 bunch slots long (this is the number mentioned by Paolo)
- $\rightarrow$  119 x 25 ns = 2975 ns
- This is sufficient to read out one register on the ALPIDE control bus (350 ns margin)
- The LHC orbit frequency is 11.2455 kHz
- With 2 abort gaps per orbit, we can read ALPIDE registers at a rate of 22.5 kHz
- There are some other gaps, but none over 2625 ns
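The arithmetic above can be reproduced with a minimal sketch. All values are the ones quoted on the slide; the constant names are illustrative only.

```python
# Values quoted on the slide; names are illustrative.
BUNCH_SPACING_NS = 25        # spacing between bunch slots
ABORT_GAP_BUNCHES = 119      # empty bunch slots in the abort gap
ORBIT_FREQ_KHZ = 11.2455     # LHC orbit frequency
GAPS_PER_ORBIT = 2           # number of usable gaps per orbit, as stated on the slide

# Length of one abort gap and the resulting register read rate,
# assuming one register access per gap.
abort_gap_ns = ABORT_GAP_BUNCHES * BUNCH_SPACING_NS
read_rate_khz = GAPS_PER_ORBIT * ORBIT_FREQ_KHZ

print(f"abort gap length: {abort_gap_ns} ns")
print(f"register read rate: {read_rate_khz:.1f} kHz")
```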



Figure 1: Schematic of the Bunch Disposition around an LHC Ring for the 25ns Filling Scheme

LHC filling schemes (maybe old, 2003?) https://cds.cern.ch/record/691782/files/project-note-323.pdf



#### Alpide Monitoring Overview

#### **Work in progress**





#### Alpide Registers Refresh Rate

#### **Outer barrel:**

- With just one Alpide Control module:
  - 22 x 3 writes + (32 reads x 7 x 7 x 4) = 6338 transactions
  - Assumes that broadcast write can be performed on all links, but links must be read individually
  - Assumes that ctrl link is used on every abort gap. If we wait for each conversion to complete, this number will be lower.
- Theoretical register refresh rate:
  - 22.5 kHz / 6338 = ~3.5 Hz
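As a quick check of the estimate above, the following sketch recomputes the transaction count and the refresh rate. The factors are taken exactly as quoted; the variable names are assumptions made for readability, and one control-bus transaction per abort gap is assumed, as on the slide.

```python
# Factors as quoted on the slide; names are illustrative.
BROADCAST_WRITES = 22 * 3      # writes, broadcast to all links
READS_PER_CHIP = 32            # reads that must be done chip by chip
CHIPS = 7 * 7 * 4              # 196 chips behind one Alpide Control module
ACCESS_RATE_HZ = 22_500        # one control-bus transaction per abort gap

transactions = BROADCAST_WRITES + READS_PER_CHIP * CHIPS
refresh_rate_hz = ACCESS_RATE_HZ / transactions

print(f"transactions per full refresh: {transactions}")
print(f"theoretical refresh rate: {refresh_rate_hz:.2f} Hz")
```

Waiting for each ADC conversion to complete would leave some abort gaps unused, so this is an upper bound under the stated assumptions.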



#### Alpide Monitoring Overview

- Monitoring of Alpide Registers will be split into two submodules:
  - Sequencer
    - Has an instruction memory accessible from the wishbone bus
    - FIFO interface that connects to Alpide Control
    - Sequencer puts instructions on FIFO, Alpide Control reads and executes the instructions when it can (e.g. abort gap trigger)
  - Monitor/sniffer
    - Has a BRAM buffer with dedicated memory locations for each Alpide register, for up to 196 chips
    - BRAM buffer accessible from wishbone bus
    - Accepts data from any read transaction on Alpide CTRL bus
    - Puts the data into the BRAM according to register address and chip id
    - Completely decoupled from sequencer
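The split between sequencer and monitor can be illustrated with a small software model. This is only a behavioural sketch, not the firmware: the class and function names, the register address `0x10` and the fake bus read are all hypothetical.

```python
from collections import deque


class Sequencer:
    """Holds an instruction list and pushes it onto the FIFO."""

    def __init__(self, instructions):
        self.instructions = list(instructions)

    def push_all(self, fifo):
        fifo.extend(self.instructions)


class Monitor:
    """Sniffer: stores every read result by (chip id, register address)."""

    def __init__(self):
        self.buffer = {}

    def sniff(self, chip_id, address, value):
        self.buffer[(chip_id, address)] = value


def execute_next(fifo, bus_read, monitor):
    """Model of Alpide Control: run one queued read when the bus is free."""
    if not fifo:
        return False
    chip_id, address = fifo.popleft()
    value = bus_read(chip_id, address)
    # The monitor sees the read on the bus; it knows nothing about the sequencer.
    monitor.sniff(chip_id, address, value)
    return True


def fake_bus_read(chip_id, address):
    """Stand-in for a real control-bus read (hypothetical values)."""
    return 1000 * chip_id + address


fifo = deque()
monitor = Monitor()
Sequencer([(chip, 0x10) for chip in range(3)]).push_all(fifo)
while execute_next(fifo, fake_bus_read, monitor):  # e.g. one call per abort gap
    pass
print(monitor.buffer)
```

Because the monitor only reacts to read transactions, reads issued by any other source would land in the same buffer, which is the point of keeping it decoupled from the sequencer.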

#### Challenging but powerful and flexible



## **Detector calibration**

#### Markus

#### **Detector Calibration**



#### Minimum set of calibration measurements:

- Digital Scan
- Threshold Tuning
- Threshold Scan
- Noise occupancy

#### **Threshold Scan**



#### **Current Procedure:**

#### Main loop:

- nChips configuration commands (set charge)
- 50 Pulse commands (Triggers generated on-chip)
- Readout

Repeated x 50 charge steps x 512 rows

#### Full scan:

- 5M configuration commands / stave
- 1.28M triggers (pulse)
- Total raw data volume O(100 GB / stave), typically 300-400 GB

Requires continuous swapping/synchronization between DCS (change of configuration) and O2 (data readout & processing).

Can we avoid this?

#### Possible Simplifications



#### 1) Broadcast Write

- Using broadcast write for the configuration commands: nChips -> 4 per stave, i.e. 5M -> 100k commands
- Can be tested already now with new-alpide-software
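The command counts quoted for the threshold scan can be cross-checked with a short sketch. The figure of 196 chips per stave is an assumption (the outer-barrel case quoted elsewhere in this talk); the other numbers come from the scan description.

```python
# Scan parameters from the threshold scan slide.
CHARGE_STEPS = 50
ROWS = 512
PULSES_PER_STEP = 50

# Assumption: an outer-barrel stave with 196 chips.
CHIPS_PER_STAVE = 196
BROADCAST_WRITES_PER_STEP = 4   # nChips -> 4 per stave with broadcast write

loop_iterations = CHARGE_STEPS * ROWS
config_individual = CHIPS_PER_STAVE * loop_iterations
config_broadcast = BROADCAST_WRITES_PER_STEP * loop_iterations
triggers = PULSES_PER_STEP * loop_iterations

print(f"configuration commands, chip by chip: {config_individual:,}")
print(f"configuration commands, broadcast:    {config_broadcast:,}")
print(f"pulse triggers:                       {triggers:,}")
```

Under these assumptions the sketch gives about 5M and 100k configuration commands per stave and 1.28M triggers, in line with the numbers on the slides.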

#### 2) Sequencer

- Ideal use case for local sequencer
- E.g. innermost loop: one sequence does the scan of an entire row

```
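# Sequencer pseudocode: scan one row over all charge (DAC) steps
for value = start_dac to stop_dac:
    WRITE_BC(dac_address, value)    # broadcast write: set the injected charge
    for i = 0 to npulse:
        SEND_OPCODE(PULSE)          # npulse on-chip pulses per charge step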
```

All scans could have same basic sequence with some configuration changes

Is it the bottleneck?

- Would reduce full threshold scan to 512 executions of the scan sequence
- Each sequence would result in 2500 events, yielding the thresholds of the entire row
- Only configuration overhead: initial configuration of chips and sequencer, stepping rows
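To make the event count concrete, here is a minimal software model of one row sequence. The function and callback names are hypothetical, and the DAC range 0 to 49 is just an assumed way of getting 50 charge steps.

```python
def run_row_sequence(start_dac, stop_dac, npulse, write_bc, send_pulse):
    """Model of one sequencer run: scan all charge steps for a single row."""
    for value in range(start_dac, stop_dac + 1):
        write_bc(value)            # one broadcast write per charge step
        for _ in range(npulse):
            send_pulse()           # each pulse yields one event


counts = {"writes": 0, "pulses": 0}


def count_write(value):
    counts["writes"] += 1


def count_pulse():
    counts["pulses"] += 1


# Assumed parameters: 50 charge steps (DAC 0..49), 50 pulses per step.
run_row_sequence(0, 49, 50, count_write, count_pulse)

events_per_row = counts["pulses"]
total_events = events_per_row * 512   # one sequence execution per row

print(f"events per row sequence: {events_per_row}")
print(f"events for the full scan: {total_events:,}")
```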



## Could ALF be an amphibian?

- ALF eco-system is the FLP, traditionally O2 domain
- Developed in the DCS environment and considered part of it but can flourish in the O2 one as well



Sharing the control of ALF between FRED and O2 would avoid continuously swapping between DCS and O2



# FLP/CRU/RU/Staves topology

#### Some numbers



The ALICE CRU has 24 optical inputs and 24 optical outputs.
The RU has 3 optical outputs (data) and 1 optical input (control).

We assume 3 uplink fibers are connected per RU, i.e. 8 RU/CRU: 192 RUs (48, 54, 90) need 24 CRUs (6, 6.75, 11.25).

1 optical link (up/down) has 3.2 Gb/s bandwidth.

|       | RU  | CRU   |
|-------|-----|-------|
| IB    | 48  | 6     |
| MB    | 54  | 6.75  |
| OB    | 90  | 11.25 |
| Total | 192 | 24    |



192 RUs, 24 CRUs, 12 FLPs: minimum HW to operate the ITS (TDR scenario)
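The counts in the table follow directly from the link numbers. A minimal sketch, using only values quoted above (variable names are illustrative):

```python
CRU_INPUTS = 24                 # optical inputs per CRU
UPLINKS_PER_RU = 3              # data fibers per RU
RUS = {"IB": 48, "MB": 54, "OB": 90}
FLPS = 12                       # TDR scenario

rus_per_cru = CRU_INPUTS // UPLINKS_PER_RU
crus = {barrel: n / rus_per_cru for barrel, n in RUS.items()}
total_rus = sum(RUS.values())
total_crus = sum(crus.values())

for barrel, n in crus.items():
    print(f"{barrel}: {RUS[barrel]} RUs -> {n} CRUs")
print(f"total: {total_rus} RUs -> {total_crus} CRUs, "
      f"{total_crus / FLPS} CRUs per FLP")
```

The fractional CRU counts for MB and OB show that, with 8 RUs per CRU, some CRUs are either partially filled or shared between layers.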

Do we have enough computing power to clusterize the data?

#### More numbers



#### Adam's simulation, 20 µs frame duration

Data rate per RU in Gbit/s, Pb-Pb collisions:

| Interaction rate | Layer 0 | Layer 1 | Layer 2 | Layer 3 | Layer 4 | Layer 5 | Layer 6 |
|------------------|---------|---------|---------|---------|---------|---------|---------|
| 50 kHz           | 1.86    | 1.11    | 0.76    | 0.51    | 0.40    | 0.52    | 0.48    |
| <u>100 kHz</u>   | 3.47    | 2.06    | 1.41    | 0.81    | 0.59    | 0.69    | 0.61    |
| 200 kHz          | 6.59    | 3.89    | 2.63    | 1.43    | 0.99    | 1.03    | 0.89    |

#### Strong indications that:

- Load balance desirable/needed
- Extra computing power, i.e. more FLPs (24), needed
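A rough way to see the imbalance is to compare the per-RU rates across layers and against the RU uplink capacity. The sketch below uses the simulated rates above and assumes, as a simplification, that the full 3.2 Gb/s of each of the 3 uplinks is available for data.

```python
LINK_GBPS = 3.2
UPLINKS_PER_RU = 3

# Simulated data rate per RU (Gbit/s), layers 0..6, keyed by Pb-Pb rate in kHz.
RATES = {
    50:  [1.86, 1.11, 0.76, 0.51, 0.40, 0.52, 0.48],
    100: [3.47, 2.06, 1.41, 0.81, 0.59, 0.69, 0.61],
    200: [6.59, 3.89, 2.63, 1.43, 0.99, 1.03, 0.89],
}

ru_capacity_gbps = LINK_GBPS * UPLINKS_PER_RU

summary = {}
for rate_khz, per_layer in RATES.items():
    imbalance = per_layer[0] / per_layer[6]          # layer 0 vs layer 6
    utilisation = max(per_layer) / ru_capacity_gbps  # busiest RU vs its uplinks
    summary[rate_khz] = (imbalance, utilisation)
    print(f"{rate_khz} kHz: layer 0 / layer 6 = {imbalance:.1f}, "
          f"busiest RU uses {utilisation:.0%} of its uplink capacity")
```

A CRU serving only layer 0 staves therefore receives several times more data than one serving only layer 6 staves, which is the motivation for mixing layers on a CRU.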



No balance: number of staves (RUs) per CRU

|       |         | CRU 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total staves (RU) |
|-------|---------|-------|---|---|---|---|---|---|---|---|---|----|----|-------------------|
| IB    | Layer 0 | 8     | 4 |   |   |   |   |   |   |   |   |    |    | 48                |
|       | Layer 1 |       | 4 | 8 | 4 |   |   |   |   |   |   |    |    |                   |
|       | Layer 2 |       |   |   | 4 | 8 | 8 |   |   |   |   |    |    |                   |
| MB    | Layer 3 | 8     | 8 | 8 |   |   |   |   |   |   |   |    |    | 54                |
|       | Layer 4 |       |   |   | 8 | 8 | 8 | 6 |   |   |   |    |    |                   |
| OB    | Layer 5 | 8     | 8 | 8 | 8 | 8 | 2 |   |   |   |   |    |    | 90                |
|       | Layer 6 |       |   |   |   |   |   | 8 | 8 | 8 | 8 | 8  | 8  |                   |
| Total |         |       |   |   |   |   |   |   |   |   |   |    |    | 192               |

Mixing only in the IB: number of staves (RUs) per CRU

|       |         | CRU 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total staves (RU) |
|-------|---------|-------|---|---|---|---|---|---|---|---|---|----|----|-------------------|
| IB    | Layer 0 | 2     | 2 | 2 | 2 | 2 | 2 |   |   |   |   |    |    | 48                |
|       | Layer 1 | 2     | 2 | 3 | 3 | 3 | 3 |   |   |   |   |    |    |                   |
|       | Layer 2 | 4     | 4 | 3 | 3 | 3 | 3 |   |   |   |   |    |    |                   |
| MB    | Layer 3 | 8     | 8 | 8 |   |   |   |   |   |   |   |    |    | 54                |
|       | Layer 4 |       |   |   | 8 | 8 | 8 | 6 |   |   |   |    |    |                   |
| OB    | Layer 5 | 8     | 8 | 8 | 8 | 8 | 2 |   |   |   |   |    |    | 90                |
|       | Layer 6 |       |   |   |   |   |   | 8 | 8 | 8 | 8 | 8  | 8  |                   |
| Total |         |       |   |   |   |   |   |   |   |   |   |    |    | 192               |
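The effect of mixing layers in the IB can be estimated by summing the per-RU rates for each CRU. The sketch below is an illustration under two assumptions: the IB rows of the two mappings are read as six IB CRUs (0 to 5), and the 50 kHz Pb-Pb rates per RU from Adam's simulation are used for layers 0 to 2.

```python
# 50 kHz Pb-Pb data rate per RU (Gbit/s) for the IB layers.
RATE_GBPS = {0: 1.86, 1: 1.11, 2: 0.76}

# Staves (RUs) per IB CRU, one list entry per CRU 0..5.
NO_BALANCE = {0: [8, 4, 0, 0, 0, 0],
              1: [0, 4, 8, 4, 0, 0],
              2: [0, 0, 0, 4, 8, 8]}
MIXED_IB = {0: [2, 2, 2, 2, 2, 2],
            1: [2, 2, 3, 3, 3, 3],
            2: [4, 4, 3, 3, 3, 3]}


def cru_loads(mapping):
    """Input data rate (Gbit/s) seen by each CRU for a given stave mapping."""
    n_crus = len(mapping[0])
    return [sum(mapping[layer][cru] * RATE_GBPS[layer] for layer in mapping)
            for cru in range(n_crus)]


loads_no_balance = cru_loads(NO_BALANCE)
loads_mixed = cru_loads(MIXED_IB)

print("no balance:", [round(x, 2) for x in loads_no_balance])
print("mixed IB:  ", [round(x, 2) for x in loads_mixed])
```

The total IB data volume is the same in both mappings; mixing only redistributes it, bringing the most loaded CRU down from about 15 Gbit/s to about 9 Gbit/s under these assumptions.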





Number of cores needed to process 50 TF/s



Extrapolated from measurements on a 2x10 cores Intel Silver 4114 CPU @ 2.20GHz

DELL R740 – as used for CRU readout in O2 lab



### **Executive summary**

- The FLPs will likely have 25-30 cores (50-core machines are uncommon)
- Balance by mapping helps but does not solve the problem unless we add more CRUs & FLPs (expensive)
- Balancing via network possible but does not help if the FLP has ~ 30 cores unless we add FLPs without CRUs
- Most promising solution is to perform/complete cluster finding in the EPN

#### Outlook



- All basic building blocks available
- Proof of principle of the full chain up to the detector
   Ready to include the detector (IB HIC) in Bergen & CERN
- Strategy to perform complex operations still being defined (T monitoring, configuration, threshold scan etc.)
   Needs experimental input (time); test bench(es) available
- Supervision/coordination layer needed to integrate DCS/O2



# Back-up



## **Architecture DCS**



#### From the RU to WinCC

- ALF (Alice Low level Frontend)
  - General purpose for all detectors: communication interface to/from CRU (writing/reading)
- **FRED** (Front End Device)
  - Detector specific customization
  - Dedicated server
  - Commands from WinCC to ALF
  - Data from ALF to WinCC
- One RU board available to start working @ ITS communication protocols
  - Setup under installation @ DCS group lab at CERN (Kosice)



**GBT** 

## **Architecture ITS**



#### ITS readout overview

#### From ALPIDEs through RU to CRU ···

- What we want to monitor/control?
  - Temperature, Voltages and Currents
- From?
  - ALPIDE / Power Unit / Readout Unit
- To?

|       | ALPIDES             | Staves | Readout<br>Units | Power<br>Boards |
|-------|---------------------|--------|------------------|-----------------|
| IB    | 9x48                | 48     | 48               | 24              |
| OB    | 8x14x54<br>14x14x90 | 144    | 144              | 144             |
| Total | 24120               | 192    | 192              | 168             |

CRU via GBT links and CAN bus (backup/emergency) → First Level Processing (FLP)
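The totals in the table can be verified from the per-layer factors. A minimal sketch, using only the numbers in the table (the dictionary keys are illustrative labels):

```python
# ALPIDE counts as products of the factors given in the table.
alpides = {
    "IB": 9 * 48,
    "OB (54 staves)": 8 * 14 * 54,
    "OB (90 staves)": 14 * 14 * 90,
}
staves = {"IB": 48, "OB": 144}
power_boards = {"IB": 24, "OB": 144}

total_alpides = sum(alpides.values())
total_staves = sum(staves.values())
total_power_boards = sum(power_boards.values())

print(f"ALPIDEs: {total_alpides}")
print(f"staves / readout units: {total_staves}")
print(f"power boards: {total_power_boards}")
```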





# (Filippo Costa)



## **ROC** bench tool performance



Superpage size

Channel GB/s



# 1 CRU/FLP IB, MB, OB mixed load equalized, twice computing power (cluster finding, Sylvain's simulation)



https://indico.cern.ch/event/698929/contributions/2866596/attachments/1620721/2596687/WP10\_PRR\_RU\_Firmware\_v0.pdf **Jo**, **RU PRR** 



Bandwidth is a (potential) issue only in the IB

If the number of FLPs is a free parameter (<= 24) in the fit then
different optimizations are possible:

- keep 2 CRUs/FLP for the MB, OB, add machines only for the IB
- if we free also the number of CRUs we could further optimize connecting only 4 RU/CRU

#### N.B.

So far simulation suggests that computing power for <u>cluster finding</u> requires 24 FLPs (equally loaded)

# (Personal) Conclusions



- (Too) many ways to optimize the layout
- Jo's solution most elegant (all machines equal), optimization of computing power for cluster finding
- Layer-oriented layout more intuitive, matches power tree (which can change)
- Fibers from the RU will be connected to a patch panel in CR4, then to the CRUs. Can any RU be connected to any CRU?
- Can we design the patch panel to allow for "universal" configuration?
- Check cluster finding simulation (50 kHz vs 100 kHz)





- In spite of its peculiar feature it seems possible to use the built-in sensor to monitor (variation of) the temperature.
- Optimal algorithm to be defined while refining the measurements
- This applies to IB HIC, OB HIC to be checked
- It would be good to give more HICs to Rune
- Temperature reading concurrently to R/O to be checked
- Important to move to RU + PB

# Frontend developments



RU delivered to the DCS team

(Matteo, Piero & Jo)

- Card installed in the lab
- Software prepared by DCS team (credits to Ombretta)
   can access all functionality of the SCA chip using WINCCOA and ALF
- Verified CANbus communication
- Software framework ready (FRED)

SWT not yet implemented (needs also modification of ALF)

ITS RU low-level access panels









## **OUTLOOK**



- DCS architecture taking shape, although not yet frozen
- Temperature measurement to be clarified, next week?
- 2-6 CRUs and 2 FLPs available from June for commissioning
- Urgent to provide the DCS colleagues with the latest version of the HW (RU + PB) and FW
- Define sequence/commands to operate the detector
- Dedicated slot in Tuesday WP10 meeting?
- A WP10 dedicated person would also help

# Alpide Monitor Spec

#### **Johan**




- Task: Read I, V, T ++ from ALPIDEs during operation
- Challenge: Using ALPIDE control bus shared with trigger distribution
- Specs (short form):
  - Support 3 modes of operation:
    - Idle mode
    - Continuous mode
    - Triggered mode (most challenging)
  - Deterministic timing for ALPIDE control bus access
  - Flexible solution with user specified ALPIDE registers to monitor
  - Monitored values available via Wishbone bus for pulled DCS access
    - Should also be easily adaptable for push mechanisms

20/11/2018

# **Proposed Architecture**

#### Johan



- Using a sequencer to set up which registers to monitor and the actions needed to monitor them
  - Based on broad discussion in WP10 group to ensure flexibility for future needs
- A result memory for storage of the returned values.
  - Challenge:
    - Large memory for outer barrel (196 chips)
    - Need indirect wishbone addressing
- Note: The overall structure is more or less defined, but not all details are decided yet
- Continuous mode:
  - Issue slow control command after strobe trigger
- Triggered mode:
  - Issue slow control command upon receiving abort gap trigger
  - Slowest: Worst case refresh cycle: ~3 Hz (outer layer with 196 ALPIDEs)
- Idle mode:
  - Running continuously or upon command from software



For more details see the DCS talk (Simon Voigt Nesbø)




#### Values to be monitored per Power Board (2 PU)

#### Status words (5):

Total: 3962 clk, 80 ms @ 50kHz

Same exercise needed for ALPIDE & RU

#### Some (relevant) numbers



Parameters:

**Bandwidth:** data from RU to FLP, from FLP to EPN

Computing power available in each FLP node for cluster finding


#### Reading T during collisions



- Temperature reading is incompatible with the presence of the trigger (they share the same control line)
- The temperature can be read out only while the trigger is not issued
- To read out T (and AVDD) we need:

| Step | Action                             | Duration |
|------|------------------------------------|----------|
| 1.   | Set the ADCCtrlRegister            | 60 BC    |
| 2.   | Send the measurement command       | 60 BC    |
|      | Wait 5 ms (ADC conversion time)    |          |
| 3.   | Read the output register           | 105 BC   |

#### No difference between IB/OB, Master/Slave

(calibration required after power-on, perhaps at SOR, omitted)
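The durations of the three control-bus transactions can be compared with the 119 BC interval without collisions. A minimal sketch, assuming 25 ns per bunch crossing (BC); the step labels are shortened from the list above.

```python
BC_NS = 25                 # assumed duration of one bunch crossing
GAP_BC = 119               # interval without collisions, in BC

# Control-bus transactions needed for one temperature reading, in BC.
STEPS_BC = {
    "set ADC control register": 60,
    "send measurement command": 60,
    "read output register": 105,
}

gap_ns = GAP_BC * BC_NS
margins_ns = {}
for name, n_bc in STEPS_BC.items():
    duration_ns = n_bc * BC_NS
    margins_ns[name] = gap_ns - duration_ns
    print(f"{name}: {duration_ns} ns, margin {margins_ns[name]} ns")

fits_in_gap = all(margin >= 0 for margin in margins_ns.values())
print(f"every transaction fits in one gap: {fits_in_gap}")
```

Each transaction fits in a single gap, but not two of them together, so one reading needs at least three gaps plus the ADC conversion time in between.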





Figure 2.1: Proton bunches in the PS, SPS and one LHC ring. Note the partial filling of the SPS (3/11 or 4/11) and the voids due to kicker rise-time. One LHC ring is filled in  $\sim 3$  min.

( https://edms.cern.ch/ui/file/445762/3/Vol3 Chap2 v4.pdf )

The LHC filling scheme has an interval of 119 BC w/o collisions (only B1 or B2 @ P2 during the same BC)




# ITS topology

|       | ALPIDES             | Staves | Readout<br>Units | Power<br>Boards |
|-------|---------------------|--------|------------------|-----------------|
| IB    | 9x48                | 48     | 48               | 24              |
| OB    | 8x14x54<br>14x14x90 | 144    | 144              | 144             |
| Total | 24120               | 192    | 192              | 168             |

