## The development and application of digital BPM signal processors at SSRF

Longwei Lai\*, Yongbin Leng, Yingbin Yan

lailongwei@sinap.ac.cn

BI, SSRF



### Outline

- Introduction
- Applications on SSRF
- Applications on FEL
- New processors for SHINE

## SSRF introduction



- SSRF synchrotron radiation facility @ phase II
- soft x-ray FEL @ user facility
- hard x-ray FEL: SHINE @ tunnel construction



## **BPM Signal Processor Milestones**

### Project Start

### 2010.12 Principle Prototype: RF front-end and Digital Signal Processing

- Lai Longwei, Leng Yongbin\*, Yi Xing, et al. DBPM signal processing with field programmable gate arrays[J], NST 22(2011), 129-133.
- Yi Xing, Leng Yongbin\*, Lai Longwei, et al. RF front-end for digital beam position monitor signal processor[J], NST 22(2011), 65-69.
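- Lai Longwei, Leng Yongbin\*, Yan Yingbin, Yang Guisen, et al. Research on digital BPM signal processing algorithms, Nuclear Techniques, 2010, 33(10), 734-739 (in Chinese).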

### 2011.6 Version-I and Beam Tests

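- Yi Xing, Leng Yongbin\*, Lai Longwei, et al. A new digital beam position processor based on software-defined radio[J], Nuclear Techniques, 2012, 35(5) (in Chinese).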
- X. D. Sun, Y. B. Leng\*, An DBPM Calibration Method Implemented on FPGA, IBIC 2012, Tsukuba, Japan
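- Leng Yongbin, Yi Xing, Lai Longwei, et al. Online Evaluation of New DBPM Processor at SINAP[C]// Proc. of ICALEPCS2011.
- Leng Yongbin, Yi Xing, Lai Longwei, et al. Development progress of a new digital BPM signal processor[J], Nuclear Techniques, 2011, 33(5), 326-330 (in Chinese).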
- X.D. Sun, Y.B. Leng. Implementation and integration of a systematic DBPM calibration [J], NST 25(2014), 020401-1-6.
- X.D. Sun, Y.B. Leng. MATLAB Simulation of DBPM Digital Down Conversion [J] AMM, 333(2013), 680-683.
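- Lai Longwei, Leng Yongbin, Yi Xing, et al. Optimization of digital beam position signal processing algorithms[J], High Power Laser and Particle Beams, 2013, 25(1), 109-113 (in Chinese).

### 2014.6 Small-batch tests (5 units)

### 2015.6 Optimization, Intelligent Trigger Application Development

- L.W. Lai, Y.B. Leng, AN INTELLIGENT TRIGGER ABNORMAL BEAM OPERATION MONITORING PROCESSOR AT THE SSRF, IPAC2015
- Lai Longwei, Leng Yongbin, et al. Development progress of the digital BPM signal processor[J], Atomic Energy Science and Technology, 2015 (in Chinese).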

### 2016.6 Version-II, Volume Application on SXFEL, DCLS and Sirius LINAC

- L.W. Lai, Y.B. Leng, BATCH APPLICATIONS OF DIGITAL BPM PROCESSORS FROM THE SINAP, IBIC2016
- L.W. Lai, Y.B. Leng, DESIGN AND PERFORMANCE OF DIGITAL BPM PROCESSOR FOR DCLS AND SXFEL, IPAC2017

### 2017.12 Firmware and software upgrade for SXFEL

- L.W. Lai, et al., UPGRADE OF DIGITAL BPM PROCESSOR AT DCLS AND SXFEL, IPAC2018

### 2017 Ideas on direct RF sampling BPM processor for C band cavity BPM

- L.W. Lai, et al., THE APPLICATION OF DIRECT RF SAMPLING SYSTEM ON CAVITY BPM SIGNAL PROCESSING, IBIC2017

### 2018.1 Firmware and software upgrade for SSRF

- L.W. Lai, et al., THE DEVELOPMENT AND APPLICATIONS OF THE DIGITAL BPM SIGNAL PROCESSOR AT SINAP, FLS2018
- L.W. Lai, et al., THE DEVELOPMENT AND APPLICATIONS OF DIGITAL BPM SIGNAL PROCESSOR ON SSRF, IBIC2018

### 2019.4 10 units in on-line operation on SSRF

### 2019 Start new processor design for SHINE

### **Processor Overview**





## **Applications on SSRF**


A small number of processors are installed on SSRF (installed/total):

- LINAC 1/3
- LTB 1/3
- Booster 3/30
- BTS **4**/5
- Storage ring 1/140

More will be installed gradually...









Version-I DBPM Beam Tests on SR@2012



#### Version-II DBPM Beam Tests on SR@Jan. 2018

[Figure: test setup (pickup RF signal, power divider, DBPM); signal spectrum, amplitude /dB vs. frequency /MHz; position histograms]

- TBT: RMS = 0.34 µm (total number = 16384)
- FA 50 kHz: RMS = 0.15 µm (total number = 1171)
- FA 10 kHz: RMS = 0.09 µm (total number = 235)

K = 10 mm, turn-by-turn resolution: 0.34 µm
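The resolution figure can be related to the raw channel amplitudes through a difference-over-sum position estimate. The following is a minimal sketch, not the processor's actual algorithm: the button layout, the Δ/Σ formula, the function names, and the synthetic noise level are all assumptions. Only K = 10 mm and the 16384-point record length come from the slide.

```python
import numpy as np

K_MM = 10.0  # position scale factor from the slide (K = 10 mm)

def delta_over_sum(a, b, c, d, k_mm=K_MM):
    """Return (x, y) in mm from four button amplitudes.

    Assumed (hypothetical) button layout: A top-right, B top-left,
    C bottom-left, D bottom-right.
    """
    total = a + b + c + d
    x = k_mm * ((a + d) - (b + c)) / total
    y = k_mm * ((a + b) - (c + d)) / total
    return x, y

def rms_resolution_um(pos_mm):
    """RMS spread of a position record about its mean, in micrometres."""
    pos_mm = np.asarray(pos_mm, dtype=float)
    return 1e3 * np.sqrt(np.mean((pos_mm - pos_mm.mean()) ** 2))

# Synthetic record: equal amplitudes on all four channels plus small
# independent noise (noise level chosen for illustration only).
rng = np.random.default_rng(0)
n_turns = 16384
amps = 1.0 + 5e-5 * rng.standard_normal((4, n_turns))
x_mm, y_mm = delta_over_sum(*amps)
x_res_um = rms_resolution_um(x_mm)
```

With equal amplitudes the estimate sits at the center, and the RMS of the record plays the role of the resolution quoted on the slide.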

# Control panel and data





### Turn-by-turn data during injection



# SSRF Beam Test—Check With Brilliance





The BPM pickup sum signal is split into 8 channels and fed into the DBPM and the Brilliance processor. This is equivalent to a beam passing through the BPM center, so the output position should stay stable.

The DBPM output drifts as the beam current decays from 260 mA to 200 mA. The Brilliance output is stable when the crossbars are switched off.

The main cause is the inconsistency among the four DBPM channels.



Fit a polynomial to the data: `P = POLYFIT(X,Y,N)`, with N = 3.

- X: SA channel readout
- Y: current, mA

$$Y = P(1)X^3 + P(2)X^2 + P(3)X + P(4)$$
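The MATLAB `POLYFIT` call has a direct counterpart in NumPy. The sketch below uses made-up data: the readout range and the cubic coefficients are purely illustrative and not measured values. It shows that `numpy.polyfit` returns the coefficients in the same order, highest power first, so `p[0]` corresponds to `P(1)`.

```python
import numpy as np

# Synthetic stand-in data (hypothetical): SA channel readout X vs. beam current Y.
x = np.linspace(0.5, 1.0, 20)                    # SA channel readout, arbitrary units
true_p = np.array([40.0, -30.0, 270.0, -20.0])   # illustrative cubic coefficients
y = np.polyval(true_p, x)                        # current, mA

# Third-order fit, coefficients highest power first:
# Y = p[0]*X**3 + p[1]*X**2 + p[2]*X + p[3]
p = np.polyfit(x, y, 3)

# Evaluate the fitted polynomial and check the residual.
y_fit = np.polyval(p, x)
max_err = np.max(np.abs(y_fit - y))
```

In practice one such polynomial would be fitted per channel, so that each channel's readout maps onto the same current scale.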

## SSRF Beam Test—Correction





X fits well after correction.

Y does not fit as well.

The correction effect is obvious during injection.

## Beam test on booster

Streaming data or capture data for:

- ADC raw data
- turn-by-turn data
- 7.9 kHz data (TBT/210), covering the ramping period within 2000 points
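The numbers in the last item can be checked with a few lines of arithmetic. The sketch below also decimates a synthetic turn-by-turn record by 210; plain block averaging is an assumption here, since the slides do not describe the decimation filter that is actually used, and the function name is hypothetical.

```python
import numpy as np

DECIMATION = 210        # from the slide: 7.9 kHz data = TBT / 210
SLOW_RATE_HZ = 7.9e3    # from the slide
N_POINTS = 2000         # from the slide

# Implied turn-by-turn rate and time span covered by 2000 slow points.
tbt_rate_hz = SLOW_RATE_HZ * DECIMATION   # about 1.66 MHz
span_s = N_POINTS / SLOW_RATE_HZ          # about 0.25 s

def decimate_by_averaging(tbt, factor=DECIMATION):
    """Average consecutive blocks of `factor` turns (assumed boxcar filter)."""
    tbt = np.asarray(tbt, dtype=float)
    n_blocks = tbt.size // factor
    return tbt[: n_blocks * factor].reshape(n_blocks, factor).mean(axis=1)

# Synthetic turn-by-turn record, just long enough for 2000 slow points.
tbt = np.arange(N_POINTS * DECIMATION, dtype=float)
slow = decimate_by_averaging(tbt)
```

So 2000 points at 7.9 kHz span roughly a quarter of a second, which is the window available for the ramping period.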







Bunch Charge Monitor: 4 ADCs on the wideband RF board perform bunch-by-bunch charge measurement with interleaved sampling. Optimization is ongoing.



Interleaved Sampling
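Interleaved sampling with several ADCs can be pictured as merging phase-shifted sample streams into one stream at a multiple of the single-ADC rate. The sketch below illustrates that idea only: the sampling rate, the test tone, and the ideal, evenly spaced clock phases are assumptions and are not taken from the slides.

```python
import numpy as np

def interleave(channels):
    """Merge M equal-length ADC records into one record at M times the rate.

    channels[k][n] is assumed to be taken at time (n*M + k) * T / M,
    where T is the single-ADC sampling period.
    """
    stacked = np.asarray(channels)   # shape (M, N)
    return stacked.T.reshape(-1)     # order: ch0[0], ch1[0], ..., ch0[1], ...

# Synthetic check: a sine sampled by 4 phase-shifted ADCs vs. one fast ADC.
M, N = 4, 64
fs_single = 100e6   # hypothetical single-ADC rate
f_sig = 30e6        # hypothetical test tone
t_fast = np.arange(M * N) / (M * fs_single)
reference = np.sin(2 * np.pi * f_sig * t_fast)

# What each of the 4 ADCs would record with ideal, evenly spaced clock phases.
channels = [reference[k::M] for k in range(M)]
merged = interleave(channels)
```

Real hardware also needs offset, gain, and timing-skew matching between the ADCs, which this sketch leaves out.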



## Applications on FEL/LINAC

- Stripline BPM processor
- Cavity BPM processor
- BAM processor



DCLS



Sirius LINAC

# Mass Application on SXFEL/DCLS



## SBPM Evaluation



The vertical displacement grows along the beam direction.

Charge measurement accuracy degrades as the beam moves away from the center.



Phase calibration tests.

K calibration tests.



## **Processor for SHINE**

- Cold button BPM processor
- Stripline BPM processor
- Cavity BPM processor



.....

## Processor Overview

- standalone structure based on Xilinx SOC
- common platform for beam signal processing...
- FMC ADC and timing mezzanine cards
- IF sampling / RF sampling ADC card, 1MHz repetition rate



## Hardware

- Design based on the Xilinx ZCU102 evaluation board
- Zynq UltraScale+ MPSoC ZU19EG
- ≥500 MSPS, ≥14-bit IF sampling processor (AC & DC); can also be used as a BxB processor at synchrotron facilities
- RF direct sampling (bandwidth >5 GHz, ≥14 bits) is also being studied for the C-band cavity BPM signal processor: from a high-IF superheterodyne receiver to a direct RF-sampling receiver
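With direct RF sampling, the C-band signal is digitized without an analog mixing stage. If the converter runs below twice the signal frequency (an assumption; the slides give only the input bandwidth), the signal appears at an alias frequency in the first Nyquist zone. The sketch below computes that alias. The 4.7 GHz tone and the 1 GS/s rate are illustrative numbers only, and the function names are hypothetical.

```python
def alias_frequency(f_signal_hz, f_sample_hz):
    """Frequency at which a tone appears in the first Nyquist zone (0..fs/2)."""
    f = f_signal_hz % f_sample_hz
    return f if f <= f_sample_hz / 2 else f_sample_hz - f

def nyquist_zone(f_signal_hz, f_sample_hz):
    """1-based index of the Nyquist zone that contains the input tone."""
    return int(f_signal_hz // (f_sample_hz / 2)) + 1

# Illustrative numbers only (not from the slides).
f_alias = alias_frequency(4.7e9, 1.0e9)   # 300 MHz
zone = nyquist_zone(4.7e9, 1.0e9)         # zone 10
```

A tone in an even-numbered zone shows up with an inverted spectrum, which the downstream digital processing has to take into account.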



## FPGA

### Table 1: Virtex-5 FPGA Family Members

| Device    | CLB Array (Row x Col) | Virtex-5 Slices | Max Distributed RAM (Kb) | DSP48E Slices | Block RAM 18 Kb Blocks | Block RAM 36 Kb Blocks | Block RAM Max (Kb) | CMTs | PowerPC Processor Blocks | Endpoint Blocks for PCI Express | Ethernet MACs | Max RocketIO GTP | Max RocketIO GTX | Total I/O Banks | Max User I/O |
|-----------|-----------------------|-----------------|--------------------------|---------------|------------------------|------------------------|--------------------|------|--------------------------|---------------------------------|---------------|------------------|------------------|-----------------|--------------|
| XC5VSX50T | 120 x 34              | 8,160           | 780                      | 288           | 264                    | 132                    | 4,752              | 6    | N/A                      | 1                               | 4             | 12               | N/A              | 15              | 480          |
| XC5VSX95T | 160 x 46              | 14,720          | 1,520                    | 640           | 488                    | 244                    | 8,784              | 6    | N/A                      | 1                               | 4             | 16               | N/A              | 19              | 640          |

#### Table 13: Zynq UltraScale+ MPSoC: EG Device Feature Summary

|                                         | ZU2EG                                                                                                                 | ZU3EG   | ZU4EG   | ZU5EG   | ZU6EG   | ZU7EG   | ZU9EG   | ZU11EG  | ZU15EG  | ZU17EG     | ZU19EG    |
|-----------------------------------------|-----------------------------------------------------------------------------------------------------------------------|---------|---------|---------|---------|---------|---------|---------|---------|------------|-----------|
| Application Processing Unit             | Quad-core Arm Cortex-A53 MPCore with CoreSight; NEON & Single/Double Precision Floating Point; 32KB/32KB L1 Cache, 1MB L2 Cache |         |         |         |         |         |         |         |         |            |           |
| Real-Time Processing Unit               | Dual-core Arm Cortex-R5 with CoreSight; Single/Double Precision Floating Point; 32KB/32KB L1 Cache, and TCM           |         |         |         |         |         |         |         |         |            |           |
| Embedded and External<br>Memory         | 256KB On-Chip Memory w/ECC; External DDR4; DDR3; DDR3L; LPDDR4; LPDDR3;<br>External Quad-SPI; NAND; eMMC              |         |         |         |         |         |         |         |         |            |           |
| General Connectivity                    | 214 PS I/O; UART; CAN; USB 2.0; I2C; SPI; 32b GPIO; Real Time Clock; WatchDog Timers; Triple Timer Counters           |         |         |         |         |         |         |         |         |            |           |
| High-Speed Connectivity                 | 4 PS-GTR; PCIe Gen1/2; Serial ATA 3.1; DisplayPort 1.2a; USB 3.0; SGMII                                               |         |         |         |         |         |         |         |         |            |           |
| Graphic Processing Unit                 | Arm Mali-400 MP2; 64KB L2 Cache                                                                                       |         |         |         |         |         |         |         |         |            |           |
| System Logic Cells                      | 103,320                                                                                                               | 154,350 | 192,150 | 256,200 | 469,446 | 504,000 | 599,550 | 653,100 | 746,550 | 926,194    | 1,143,450 |
| CLB Flip-Flops                          | 94,464                                                                                                                | 141,120 | 175,680 | 234,240 | 429,208 | 460,800 | 548,160 | 597,120 | 682,560 | 846,806    | 1,045,440 |
| CLB LUTs                                | 47,232                                                                                                                | 70,560  | 87,840  | 117,120 | 214,604 | 230,400 | 274,080 | 298,560 | 341,280 | 423,403    | 522,720   |
| Distributed RAM (Mb)                    | 1.2                                                                                                                   | 1.8     | 2.6     | 3.5     | 6.9     | 6.2     | 8.8     | 9.1     | 11.3    | 8.0        | 9.8       |
| Block RAM Blocks                        | 150                                                                                                                   | 216     | 128     | 144     | 714     | 312     | 912     | 600     | 744     | 796        | 984       |
| Block RAM (Mb)                          | 5.3                                                                                                                   | 7.6     | 4.5     | 5.1     | 25.1    | 11.0    | 32.1    | 21.1    | 26.2    | 28.0       | 34.6      |
| UltraRAM Blocks                         | 0                                                                                                                     | 0       | 48      | 64      | 0       | 96      | 0       | 80      | 112     | 102        | 128       |
| UltraRAM (Mb)                           | 0                                                                                                                     | 0       | 13.5    | 18.0    | 0       | 27.0    | 0       | 22.5    | 31.5    | 28.7       | 36.0      |
| DSP Slices                              | 240                                                                                                                   | 360     | 728     | 1,248   | 1,973   | 1,728   | 2,520   | 2,928   | 3,528   | 1,590      | 1,968     |
| CMTs                                    | 3                                                                                                                     | 3       | 4       | 4       | 4       | 8       | 4       | 8       | 4       | 11         | 11        |
| Max. HP I/O <sup>(1)</sup>              | 156                                                                                                                   | 156     | 156     | 156     | 208     | 416     | 208     | 416     | 208     | 572        | 572       |
| Max. HD I/O <sup>(2)</sup>              | 96                                                                                                                    | 96      | 96      | 96      | 120     | 48      | 120     | 96      | 120     | 96         | 96        |
| System Monitor                          | 2                                                                                                                     | 2       | 2       | 2       | 2       | 2       | 2       | 2       | 2       | 2          | 2         |
| GTH Transceiver 16.3Gb/s <sup>(3)</sup> | 0                                                                                                                     | 0       | 16      | 16      | 24      | 24      | 24      | 32      | 24      | 44         | 44        |
| GTY Transceivers 32.75Gb/s              | 0                                                                                                                     | 0       | 0       | 0       | 0       | 0       | 0       | 16      | 0       | 28         | 28        |
| Transceiver Fractional PLLs             | 0                                                                                                                     | 0       | 8       | 8       | 12      | 12      | 12      | 24      | 12      | 36         | 36        |
| PCIe Gen3 x16                           | 0                                                                                                                     | 0       | 2       | 2       | 0       | 2       | 0       | 4       | 0       | 4          | 5         |
| 150G Interlaken                         | 0                                                                                                                     | 0       | 0       | 0       | 0       | 0       | 0       | 1       | 0       | 2          | 4         |
| 100G Ethernet w/ RS-FEC                 | 0                                                                                                                     | 0       | 0       | 0       | 0       | 0       | 0       | 2       | 0       | 2          | 4         |

### ZU19EG VS. XC5VSX50T

- $\geq$ 16 times LUTs
- 6.8 times DSP
- 7.28 times block RAM
- **1.4** times user I/O
- Quad-core Arm
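The ratios above can be reproduced from the two tables. The sketch below makes three assumptions that are not stated on the slide: the Virtex-5 LUT count is taken as four LUTs per slice, the ZU19EG user I/O is taken as HP plus HD I/O, and 1 Mb is treated as 1000 Kb.

```python
# Values copied from the two device tables in this section.
xc5vsx50t = {
    "slices": 8_160,     # Virtex-5 slices
    "dsp": 288,          # DSP48E slices
    "bram_kb": 4_752,    # max block RAM, Kb
    "user_io": 480,      # max user I/O
}
zu19eg = {
    "luts": 522_720,     # CLB LUTs
    "dsp": 1_968,        # DSP slices
    "bram_mb": 34.6,     # block RAM, Mb
    "hp_io": 572,        # max HP I/O
    "hd_io": 96,         # max HD I/O
}

LUTS_PER_V5_SLICE = 4    # each Virtex-5 slice holds four 6-input LUTs

ratios = {
    "luts": zu19eg["luts"] / (xc5vsx50t["slices"] * LUTS_PER_V5_SLICE),
    "dsp": zu19eg["dsp"] / xc5vsx50t["dsp"],
    # Assumption: 1 Mb = 1000 Kb, which reproduces the 7.28 on the slide.
    "bram": zu19eg["bram_mb"] * 1000 / xc5vsx50t["bram_kb"],
    # Assumption: user I/O on the ZU19EG counted as HP + HD.
    "user_io": (zu19eg["hp_io"] + zu19eg["hd_io"]) / xc5vsx50t["user_io"],
}
```

Under these assumptions the results round to 16.01, 6.8, 7.28, and 1.4, matching the list.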

#### Notes:

1. HP = High-performance I/O with support for I/O voltage from 1.0V to 1.8V.

2. HD = High-density I/O with support for I/O voltage from 1.2V to 3.3V.

3. GTH transceivers in the SFVC784 package support data rates up to 12.5Gb/s. See Table 14.

Mession overview

[Block diagram: PL and PS modules]

- PL (programmable logic): ADC interface; signal processing module (algorithm config.); data flow control manager; timing interface; clock config.; DMA; trigger config. (delay control, mode config.); system monitor; interlock module; SFP interface; SDRAM interface; register control manager; user IO & LED
- PS (processing system): APP, SD, QSPI, UART, ETH, USB, HDMI, driver, OS





| Facility            | Number |
|---------------------|--------|
| Brazil Sirius LINAC | 2+     |
| DCLS                | 19+    |
| SXFEL test facility | 60     |
| SXFEL user facility | 50     |
| SSRF                | 10+    |
| SHINE               | 300+   |



## Thanks for your attention