

# BWS motion control electronics

BI technical board – 09.06.2016

J. Emery & P. Andersson

BWS motion control electronics - J. Emery - 09.06.2016



### Overview: Control and Acquisition electronics




### **Overview:** External systems connections





- A Beam Wire Scanner meeting was held on 19.05.2016 to present the options for the digital system of the "Intelligent drive"
- The following slides are extracted from that meeting for this presentation
- It was decided to go for the VME board + mezzanine solution, as other BI projects use this solution



# Three options for the digital platform

- Option 1: Standalone VME board + custom mezzanine
  - Use of the VFC in standalone mode
  - Dedicated analog/optical mezzanine
- Option 2: Combined analogue-digital board
  - Use the FPGA reference design (Altera)
  - Add dedicated analog/optical circuits
  - Combine the 2 boards we have today
- Option 3: Starter-kit + mezzanine
  - Use Arria V SoC Dev kit
  - Dedicated analog/optical mezzanine







### Summary table

| Criteria               | Option 1:<br>VFC modified | Option 2:<br>Custom | Option 3:<br>DevKit |
|------------------------|---------------------------|---------------------|---------------------|
| Board size             | 3                         | 1                   | 2                   |
| Powering               | 3                         | 1                   | 2                   |
| EMI                    | 3                         | 1                   | 2                   |
| FPGA resources         | 2                         | 1                   | 1                   |
| Board interfaces       | 1                         | 2                   | 2                   |
| External Memory        | 2                         | 1                   | 1                   |
| Testability            | 1                         | 1                   | 1                   |
| Code reuse             | 1                         | 1                   | 1                   |
| Methodology            | 2                         | 1                   | 1                   |
| FW readiness           | 3                         | 2                   | 1                   |
| HW readiness           | 2                         | 3                   | 1                   |
| Hardware design effort | 2                         | 3                   | 1                   |
| Firmware design effort | 3                         | 2                   | 1                   |





## Scanner control architecture

- One intelligent drive (ID) per scanner
  - avoids multiplexing
  - constant monitoring/control
  - allows parallel scans
  - but implies more control and acquisition systems
- Processing deported from VME to ID
- Local monitoring and fault diagnostics
- One VME crate for multiple scanners
  - Number depends on CPU-memory load
- So we try to minimize the ID size
- Will still require more space than the current installation (PSB: 3 racks instead of 1)









# Space availability for the board in ID

<u>Board realisation</u>
1) Boards size
2) Powering requirements
3) Cabling and EMI protection










- 1. Standard VFC + standard mezzanine
  - Missing board space for components





### Option 1: VFC + MEZZANINE




- 2. Standard VFC + maximized mezzanine
  - Missing space for the optical components



# Option 1: VFC + MEZZANINE


- 3. Customized VFC + maximized mezzanine
  - Remove the lower SFP cages
  - Check power supply VADJ
  - VFC in standalone mode
  - Stack-up of VFC and mezzanine
  - Dedicated analogue/optical mezzanine
  - Non-standard FMC mezzanine shape due to the numerous components
  - VME connectors for powering the boards

Boards integration and design W. Vigano & P. Andersson













# Option 2: Combined analog-digital board

- Use the FPGA reference design (Altera)

# Option 3: Starter-kit + mezzanine


- Same configuration as today
- FPGA platform Arria V SoC Dev kit
- Modification of the wire-scanner mezzanine to fit the box





Boards combinations & integration: P. Andersson





### Powering requirements

VFC powering the BWS analog board:

- 12 [V] -> OK
- 3.3 [V] -> OK
- 1.8 [V] -> to be checked

1.8 [V] is needed to operate the fast ADC; the whole board uses this voltage.





VFC power scheme



=> Confirmation from Andrea: 1.8 V is possible

=> Is this functionality already tested on the board for another project?





Cablings:

- Similar cabling philosophy for all 3 solutions (use of top connectors)
- Other connections slightly worse on the DevKit since it uses all sides

EMI:

- All 3 options will use the same box => same shielding from external sources
- Only perturbations between the analog and digital parts remain

| Options   | Electrical coupling |
|-----------|---------------------|
| 1. VFC    | -                   |
| 2. Custom | +                   |
| 3. DevKit | -                   |

=> We are preparing a new mezzanine which will sit next to the VFC to overcome potential issues

Ethernet – optical link










## Digital architecture related criteria

**Digital architecture**
1) FPGA internal resources
2) Board interconnects
3) External memory

- FPGA logic elements:
  - OK for today's implementation with the 3 options
  - Future implementation: depends on processing complexity; more flexibility using ARM CPU
- VFC TCP/IP data transfer will probably be 4 to 10x slower than today (400 Mbits today, 100 to 40 Mbits with the VFC)
- External memory: potential limitation

|                                            | Custom design Analog-Digital           | VFC                                |
|--------------------------------------------|----------------------------------------|------------------------------------|
| Code status (for 2016)                     | 95%                                    | 50% (6 months)                     |
| Evolutivity                                |                                        | Neutral                            |
| Use of IP developed for VFC                | Same FPGA & transceiver                |                                    |
| FPGA use as today, ALM [%]                 | 14                                     | 21                                 |
| Memory [%]                                 | 10                                     | 16                                 |
| DSP Blocks [%]                             | 13                                     | 16                                 |
| FPGA type                                  | ARRIA V - SOC                          | ARRIA V                            |
| Type                                       | 5ASXFB3H4F40C5N\*                      | 5AGXMB1G4F40C4N                    |
| Nbr Gates                                  | 362K                                   | 300K                               |
| ALM (adaptive logic module)                | 136880                                 | 113208                             |
| Memory (M10k)                              | 17260                                  | 15100                              |
| DSP Blocks                                 | 1045                                   | 920                                |
|                                            | 2x ARM processor at 1 GHz              | Software NIOS II at 200 MHz\*\*    |
| Logic use for soft CPUs                    |                                        | 4753                               |
| TCP/IP transfer (as today)                 | >400 Mbits tested point-to-point       | 40 - 100 Mbits max                 |
| Memory controller                          | 3 hard memory controllers              | 2 hard memory controllers          |
| Processor side 2                           | 2x 256 x 16bits + 1 x 256 x 16bits ECC | -                                  |
| DDR3 SDRAM type                            | MT41K256M16HA-125:E                    | MT41K512M16HA-125:E                |
| Organisation                               | 256 M x 16                             | 512 M x 16                         |
| FPGA side                                  | 4x 256M x 16 bits                      | 2x 512M x 16 bits                  |
| Memory total in bytes                      | 2048 Mbytes                            | 2048 Mbytes                        |
| Shared with program RAM                    | no                                     | yes                                |
| Nbr of measurements saved (worst case SPS) | 2048/336 = 6                           | 2048/336 = 6                       |
| Maximum theoretical transfer               | 4 x 1600 Mword = 12.8 Gbyte            | 2 x 1600 Mword = 6.4 Gbyte         |
| Implemented interface tested               | 4 x 800 Mword = 6.4 Gbyte/s            | Extrapolation: 2x800/4 = 0.8 Gbit/ |
| SPS: 336 Mbytes burst read time [s]        | 0.0525                                 | 0.105                              |


With NIOS **Softcores** 

\*As today on the starter kit

\*\*Altera "Nios II Performance Benchmarks" 16.12.2015
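As a rough cross-check of the burst read times in the last table row, the figures can be reproduced from the tested interface rates, assuming 16-bit words (2 bytes per word) and 1 Gbyte = 1000 Mbytes. The function names are illustrative only, and pairing the 0.105 s entry with 2 x 800 Mword/s is a reading of the table rather than something the slide states explicitly.

```python
def ddr3_bandwidth_mbytes_per_s(chips: int, mwords_per_s: float, word_bits: int = 16) -> float:
    """Aggregate bandwidth of `chips` parallel DDR3 devices, in Mbytes/s."""
    return chips * mwords_per_s * word_bits / 8


def burst_read_time_s(record_mbytes: float, bandwidth_mbytes_per_s: float) -> float:
    """Time to read one record back in a single burst, in seconds."""
    return record_mbytes / bandwidth_mbytes_per_s


SPS_WORST_CASE_MBYTES = 336  # worst-case SPS record size quoted in the table

# Tested interface: 800 Mword/s per device (assumed 16-bit words).
custom_bw = ddr3_bandwidth_mbytes_per_s(chips=4, mwords_per_s=800)
vfc_bw = ddr3_bandwidth_mbytes_per_s(chips=2, mwords_per_s=800)

custom_time = burst_read_time_s(SPS_WORST_CASE_MBYTES, custom_bw)
vfc_time = burst_read_time_s(SPS_WORST_CASE_MBYTES, vfc_bw)
print(f"custom: {custom_time:.4f} s, VFC: {vfc_time:.4f} s")
```

With these assumptions the custom board comes out at 6.4 Gbyte/s, matching the "Implemented interface tested" entry.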



### Depends on the use cases

- Scan duration changes a lot with speed:
  - 20 [m/s] -> 48 [ms] (767 pts)
  - 1 [m/s] -> 570 [ms] (9119 pts)
- Time between IN and OUT (we will limit this time to about 1 s)
- Number of scans per user:
  - min. 2 if we limit the INOUT time, to get the same functionality
  - max. determined by memory depth, mode of operation (expert/op), required repetition rate (can we fix it?)
- For SPS, with a time between IN and OUT of 1 s:
  - Worst case: 2048/336 = 6 scans (full record of OPS and resolver data)
  - Best case: 2048/5 = 409 scans (no offline processing of OPS/Res)

External memory depth requirement: the VFC and the custom options have the same memory depth (2048 Mbytes).
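The worst-case and best-case scan counts quoted for the SPS are integer divisions of the memory depth by the per-scan record size. A minimal sketch, with an illustrative function name and the rounded per-scan sizes (336 and 5 Mbytes) taken from the slide:

```python
def scans_that_fit(memory_mbytes: int, scan_mbytes: int) -> int:
    """Number of complete scans that fit in the external memory."""
    return memory_mbytes // scan_mbytes


MEMORY_MBYTES = 2048  # same depth for the VFC and the custom options

worst_case = scans_that_fit(MEMORY_MBYTES, 336)  # full OPS and resolver record
best_case = scans_that_fit(MEMORY_MBYTES, 5)     # processed data only
print(worst_case, best_case)
```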











- 355 ms to transfer 101 Mbytes => 298 Mbit/s
- Not possible with TCP/IP and VFC
- Needs full implementation using VME (Phase 2)



### Memory depth for the PSB

| Mode                                    | Tangential speed [m/s] | Angular speed (PSB) | movement duration [s] | max. INOUT [s] | feedback + wire data [Mbits] | Optical encoder [Mbits] | resolver raw [Mbits] | total [Mbits] | total [Mbyte] |
|-----------------------------------------|------------------------|---------------------|-----------------------|----------------|------------------------------|-------------------------|----------------------|---------------|---------------|
| Expert (detailed data)                  | 20                     | 133                 | 0.04                  | 0.51           | 20.33                        | 360.11                  | 360.11               | 740.54        | 92.57         |
| Expert (detailed data)                  | 15                     | 100                 | 0.053                 | 0.504          | 21.02                        | 372.31                  | 372.31               | 765.65        | 95.71         |
| Expert (detailed data)                  | 10                     | 67                  | 0.0785                | 0.491          | 22.33                        | 395.51                  | 395.51               | 813.34        | 101.67        |
| Expert (detailed data)                  | 1                      | 6.7                 | 0.485                 | 0.245          | 41.86                        | 741.58                  | 741.58               | 1525.02       | 190.63        |
| Operational (OPS & RESOLVER processed)  | 20                     | 133                 | 0.04                  | 0.51           | 20.33                        | 0.58                    | 0.29                 | 21.20         | 2.65          |
| Operational (OPS & RESOLVER processed)  | 15                     | 100                 | 0.053                 | 0.504          | 21.02                        | 0.60                    | 0.30                 | 21.92         | 2.74          |
| Operational (OPS & RESOLVER processed)  | 10                     | 67                  | 0.0785                | 0.491          | 22.33                        | 0.64                    | 0.32                 | 23.28         | 2.91          |
| Operational (OPS & RESOLVER processed)  | 1                      | 6.7                 | 0.485                 | 0.245          | 41.86                        | 1.20                    | 0.60                 | 43.66         | 5.46          |

- 1) One measurement cycle (one IN, one OUT)
- 2) Data recorded continuously between in and out movement
- 3) INOUT time calculated for IN=275ms, OUT=805ms
- 4) Expert mode => Motion data and raw encoders data storage. Will be used until we have the optical position sensor digitalised in the VME
- 5) Op Mode => Motion data and processed encoders data storage

IN-OUT delay used for the calculation:

- 20 m/s => 805-20-275 = 510 ms
- 15 m/s => 805-53/2-275 = 504 ms
- 10 m/s => 805-79/2-275 = 491 ms
- 1 m/s => 805-570/2-275 = 245 ms
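The IN-OUT delays listed above all follow the same pattern: OUT time minus half the movement duration minus IN time. A small sketch of that arithmetic, assuming the IN=275 ms and OUT=805 ms trigger times from note 3; the function and parameter names are illustrative:

```python
def inout_delay_ms(movement_ms: float, in_ms: float = 275.0, out_ms: float = 805.0) -> float:
    """IN-OUT delay in ms: OUT time minus half the movement duration minus IN time."""
    return out_ms - movement_ms / 2 - in_ms


# Movement durations in ms, as used in the slide's own calculation.
for speed, movement_ms in [(20, 40), (15, 53), (10, 79), (1, 570)]:
    print(f"{speed} m/s -> {inout_delay_ms(movement_ms)} ms")
```

The slide rounds the 15 m/s and 10 m/s results to the nearest millisecond (504 and 491 ms).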







- Multiple users with different durations, 1.2 s to >20 s
- The challenge arises when managing multiple scanning cycles per user
- <u>Clear use cases must be given by OP/ABP</u> to calculate the amount of produced data and what data reduction we need to apply:
  - at the FPGA level (ID and/or AS)
  - at the VME CPU level





### Memory depth for the SPS

| Mode                                   | Tangential speed [m/s] | Angular speed (SPS) | movement duration [s] | INOUT [s] | feedback + wire data [Mbits] | Optical encoder [Mbits] | resolver raw [Mbits] | total [Mbits] | total [Mbyte] |
|----------------------------------------|------------------------|---------------------|-----------------------|-----------|------------------------------|-------------------------|----------------------|---------------|---------------|
| Expert (detailed data)                 | 20                     | 110                 | 0.048                 | 1         | 37.76                        | 668.95                  | 668.95               | 1375.65       | 171.96        |
| Expert (detailed data)                 | 15                     | 82                  | 0.064                 | 1         | 38.87                        | 688.48                  | 688.48               | 1415.82       | 176.98        |
| Expert (detailed data)                 | 10                     | 55                  | 0.0933                | 1         | 40.88                        | 724.24                  | 724.24               | 1489.37       | 186.17        |
| Expert (detailed data)                 | 1                      | 5.5                 | 0.57                  | 1         | 73.73                        | 1306.15                 | 1306.15              | 2686.04       | 335.75        |
| Operational (OPS & RESOLVER processed) | 20                     | 110                 | 0.048                 | 1         | 37.76                        | 1.08                    | 0.54                 | 39.38         | 4.92          |
| Operational (OPS & RESOLVER processed) | 15                     | 82                  | 0.064                 | 1         | 38.87                        | 1.11                    | 0.56                 | 40.53         | 5.07          |
| Operational (OPS & RESOLVER processed) | 10                     | 55                  | 0.0933                | 1         | 40.88                        | 1.17                    | 0.58                 | 42.64         | 5.33          |
| Operational (OPS & RESOLVER processed) | 1                      | 5.5                 | 0.57                  | 1         | 73.73                        | 2.11                    | 1.05                 | 76.89         | 9.61          |
| Expert (detailed data)                 | 20                     | 110                 | 0.048                 | 10        | 347.86                       | 6162.11                 | 6162.11              | 12672.08      | 1584.01       |
| Expert (detailed data)                 | 15                     | 82                  | 0.064                 | 10        | 348.96                       | 6181.64                 | 6181.64              | 12712.24      | 1589.03       |
| Expert (detailed data)                 | 10                     | 55                  | 0.0933                | 10        | 350.98                       | 6217.41                 | 6217.41              | 12785.80      | 1598.22       |
| Expert (detailed data)                 | 1                      | 5.5                 | 0.57                  | 10        | 383.83                       | 6799.32                 | 6799.32              | 13982.47      | 1747.81       |
| Operational (OPS & RESOLVER processed) | 20                     | 110                 | 0.048                 | 10        | 347.86                       | 9.94                    | 4.97                 | 362.77        | 45.35         |
| Operational (OPS & RESOLVER processed) | 15                     | 82                  | 0.064                 | 10        | 348.96                       | 9.97                    | 4.99                 | 363.92        | 45.49         |
| Operational (OPS & RESOLVER processed) | 10                     | 55                  | 0.0933                | 10        | 350.98                       | 10.03                   | 5.01                 | 366.02        | 45.75         |
| Operational (OPS & RESOLVER processed) | 1                      | 5.5                 | 0.57                  | 10        | 383.83                       | 10.97                   | 5.48                 | 400.28        | 50.04         |

- 1) Only one measurement cycle (one IN, one OUT)
- 2) Data recorded continuously between in and out movement
- 3) Large data growth due to INOUT delay => Can we limit INOUT to a maximum of 1 s and do multi-scans per cycle?
- 4) Expert mode => Motion data and raw encoders data storage. Will be used until we have the optical position sensor digitalised in the VME
- 5) Op Mode => Motion data and processed encoders data storage
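Each row of the SPS table sums the three data streams and converts the total from Mbits to Mbytes by dividing by 8. A minimal sketch using the expert-mode row at 1 m/s with INOUT = 1 s; the function name is illustrative, and small differences against the table come from rounding of the per-stream values:

```python
def record_size(feedback_mbits: float, optical_mbits: float, resolver_mbits: float):
    """Return (total in Mbits, total in Mbytes) for one measurement cycle."""
    total_mbits = feedback_mbits + optical_mbits + resolver_mbits
    return total_mbits, total_mbits / 8


# Expert mode, 1 m/s, INOUT = 1 s (values copied from the table)
total_mbits, total_mbytes = record_size(73.73, 1306.15, 1306.15)
print(f"{total_mbits:.2f} Mbits = {total_mbytes:.2f} Mbytes")
```

This row is the source of the 336 Mbytes worst-case figure used for the memory depth estimate.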



### External memory access organisation

Could the external memory connections be a limitation?







Will the FPGA selection determine the system and firmware testability?

<u>Simulation level:</u>

Most of the final code is written in VHDL => verification (VHDL, SystemVerilog, Simulink) on simulator for all options.

<u>Component level:</u>

JTAG (and JTAG link) probing will be available on all options in the lab and on field prototypes.

<u>System level:</u>

Lab debugging and field validation: the same method could be used (TCP/IP access to large internal data with expert application); the transfer rate will vary.



### Hardware link to VME

- Transport and integrity of data
- Transparent links between SoC domain
- memory mapping between the FPGAs
- JTAG link between FPGAs

Implementation Sep 2016-January 2017

=> THIS COULD BE REUSED FOR OTHER PROJECTS



## Code reuse between options?

Will the FPGA selection determine code reusability?

- Yes, partially, because we already have a large working code base (options 2 and 3)
- No, because all main functionalities will be in VHDL (reusable for all options)
- Not really: not much reuse of existing VFC code for any option (no need for VME, BST, etc.)





### Condition monitoring: Survey all system variables





- Condition monitoring and decision in real time.
- To react to unexpected events during a movement
- Large number of parameters to take into account
- Target reaction time within one feedback period of 62.5 us:
  - 12k instructions on Nios
  - 62k instructions on ARM
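The instruction budgets quoted for one feedback period can be approximated from the clock frequencies given in the platform comparison (Nios II at 200 MHz, ARM at 1 GHz). The sketch below assumes one instruction per clock cycle, which is a simplification and not a figure from the slides; the names are illustrative.

```python
FEEDBACK_PERIOD_NS = 62_500  # one feedback period of 62.5 us, in nanoseconds


def cycles_per_period(clock_mhz: int, period_ns: int = FEEDBACK_PERIOD_NS) -> int:
    """Clock cycles available in one period (MHz * ns / 1000 = cycles)."""
    return clock_mhz * period_ns // 1000


# Assumption: roughly one instruction retired per clock cycle.
nios_budget = cycles_per_period(200)   # soft-core Nios II at 200 MHz
arm_budget = cycles_per_period(1000)   # hard ARM core at 1 GHz
print(nios_budget, arm_budget)
```

Under that assumption the budgets come out at about 12.5k and 62.5k cycles, in line with the 12k and 62k instructions on the slide.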





## Challenges of the ID processing

- Precise position, speed and torque control
  - Fully written in VHDL
  - First version operational for 2016
  - Second version foreseen in 2017 (improve precision and flexibility)
- On-line data processing and fault detection
  - Will be used to survey system conditions (mechanical, electrical, controls)
  - Prototyping foreseen in Simulink/MatLab and C in the drive
  - Implemented in VHDL for critical ones, leave
- Special functionalities
  - Processing/area to foresee for future functionalities: tails measurement procedure, <u>delayed multi-scans (reconstruct small beams)</u>, on-line vibration compensation.

=> I will start detailed work on this subject in September 2016 (MSE)



# Additional slides




### Design related criteria

Will the FPGA selection drive the design methodology?

- Yes, hardware processors allow on-line prototyping of the processing, running in real time.

Methodology:

- 1) Prototyping of the data processing in MatLab on existing raw data
- 2) Simulink modelling of the algorithms and test on dSPACE
- 3) Implementation in C
  - Fast to write in C and test on the real system
  - Needs fast processing units
  - Needs memory access to data being recorded
- 4) Final version must be in VHDL:
  - Parallel processing independent of any OS or other running tasks
  - Powerful verification in simulation
  - Powerful tools to do verification on the FPGA
  - But: long development & verification time



# Profiles: online calculation vs pre-calculated

- Operate at different top speeds:
  - 20 [m/s] -> 48 [ms] (767 pts)
  - 1 [m/s] -> 570 [ms] (9119 pts)
- Today pre-calculated into 3 tables included in the FPGA as ROM (safe).
- Needs 3 tables for each preset
- Alternative: online calculation based on system properties (3 parameters to play with: Jmax, duration, ratio acc/cst speed).
  - Optimised iterative algorithm to be written in C and monitored by the FPGA
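The table lengths quoted for the two top speeds are consistent with one profile point per 62.5 us feedback period. That sampling period is an assumption here, borrowed from the condition-monitoring slide, and the function name is illustrative. A rough check using integer arithmetic:

```python
FEEDBACK_PERIOD_TENTH_US = 625  # 62.5 us expressed in units of 0.1 us


def profile_points(duration_ms: int) -> int:
    """Number of whole feedback periods in a movement of `duration_ms`."""
    return duration_ms * 10_000 // FEEDBACK_PERIOD_TENTH_US


fast = profile_points(48)   # 20 m/s movement
slow = profile_points(570)  # 1 m/s movement
print(fast, slow)
```

Both results land within one point of the 767 and 9119 points given on the slide, which suggests how the ROM depth per table scales with the movement duration.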



## Feedback implementation in VHDL




