

# Investigation on the digital controller for the BWS

Beam wire scanner meeting - 19.05.2016

J. Emery & P. Andersson





# Introduction

Context of this presentation:
 BI management requested to have technical details on implementation

options for the "intelligent drive"

- Goals of this presentation:
- 1) Provide a sensible overview of options to build final electronics
- 2) Discuss the pro and cons for each of them
- 3) List some considerations on FPGA & boards
- 4) Recommend options based on various constraints









- Focus on the Intelligent drive electronics
- Technical presentation
- Target installation of these drives
- Space in the drive, Power requirement, EMI and measurements
- Options investigated
- FPGA resources, CPU power, design process, DDR
- BWS Firmware architecture and status



# Three options for the digital platform

- Option 1:
  - Standalone VME board + custom mezzanine
  - Use of the VFC in standalone mode
  - Dedicated analog/optical mezzanine
- Option 2:
- Combined analogue-digital board
  Use the FPGA reference design (Altera)
  Add dedicated analog/optical circuits
  Combine the 2 boards we have today

- Option 3: Starter-kit + mezzanine
  - Use Arria V SoC Dev kit
  - Dedicated analog/optical mezzanine







# Scanner control architecture

- One intelligent drive (ID) per scanner
  - avoids multiplexing
  - constant monitoring/control
  - allow parallel scans
  - But imply more control and acquisition systems
- Deported processing from VME to ID
- Local monitoring and fault diagnostics
- One VME crate for multiple scanners
   Number depends on CPU-Memory load
- So we try to minimize the ID size
- Will still require more space than current installation: (PSB: 3 racks instead of 1)











# List of selection criteria

- **Board realisation** ullet
  - 1) Boards size

  - 2) Powering requirement3) Cabling and EMI protection
- Digital architecture 1) FPGA internal resources **FPGA** interconnects
  - 3) External memory
- Design process ulletTestability Code reuse 2 Methodology







# Space availability for the board in ID

<u>Board realisation</u>
1) Boards size
2) Powering requirements
3) Cabling and EMI protection







<u>Board realisation</u>1) Boards size2) Powering requirements3) Cabling and EMI protection



- 1. Standard VFC + standard mezzanine
- Missing board space for components





# Option 1: VFC + MEZZANINE

<u>Board realisation</u>
1) Boards size
2) Powering requirements
3) Cabling and EMI protection



# 2. Standard VFC + maximized mezzanine

• Missing space for the optical components



# Option 1: VFC + MEZZANINE

### Board realisation

Boards size
 Powering requirements
 Cabling and EMI protection

- 3. Customized VFC + maximized mezzanine
- Remove the lower SPF cages
- Check power supply VADJ
- VFC in standalone mode
- Stack-up of VMC and mezzanine
- Dedicated analogue/optical mezzanine
- None standard FMC mezzanine shape due to the numerous components
- VME connectors for powering the boards

Boards integration and design W. Vigano & P. Andersson







- Combination of the 2 boards we have today
- Use the FPGA reference design (Altera)



**Board realisation** 2) Powering requirements 3) Cabling and EMI protection

Boards combinations & integration: P. Andersson



# Option 3: Starter-kit + mezzanine

- Board realisation 1) Boards size
- Powering requirements
   Cabling and EMI protection

- Same configuration as today
- FPGA platform Arria V SoC Dev kit
- Modification of the wire-scanner mezzanine to fit the box





Boards combinations & integration: P. Andersson





# Powering requirements

VFC powering the BWS analog board: 12 [V] -> OK 3.3 [V] -> OK 1.8 [V] -> potential problem, limited to 2.5V

1.8[V] is needed to operate the fast ADC, the whole board is using this voltage.







VFC Vadj. Powering schematics





Cablings:

- Similar philosophy of the cablings for all 3 solutions (use of top connectors)
- Other connections slightly worst on the DevKit since uses all sides

EMI interferences:

- All 3 options will use same box => same shielding from external sources
- Only remains perturbations between analog and digital part

| Options                | Electrical coupling |
|------------------------|---------------------|
| 1. VFC                 | -                   |
| 2. Custom              | +                   |
| 3. DevKit              | -                   |
|                        |                     |
|                        |                     |
| - <u>(****</u> ) er er |                     |



**Beams Department** 



BWS discussion on the FPGA platform - J.Emery - 19.05.2016

**BE** Beams Department

# Digital architecture related criteria

Digital architecture 1) FPGA internal resources 2) Board interconnects 3) External memory

|                                           | Custom design Analog-Digital           | VFC                                 |
|-------------------------------------------|----------------------------------------|-------------------------------------|
| Code status (for 2016)                    | 95%                                    | 50% (6 months)                      |
| Evolutivity                               |                                        | Neutral                             |
| Use of IP developped for VFC              | Same FPGA & transceiver                |                                     |
|                                           |                                        |                                     |
| FPGA use as today ALM [%]                 | 14                                     | 21                                  |
| Memory [%]                                | 10                                     | 16                                  |
| DSP Blocks [%]                            | 13                                     | 16                                  |
|                                           |                                        |                                     |
| FPGA type                                 | ARRIA V - SOC                          | ARRIA V                             |
| Туре                                      | 5ASXFB3H4F40C5N*                       | 5AGXMB1G4F40C4N                     |
| Nbr Gates                                 | 362K                                   | 300K                                |
| ALM (adaptive logic module)               | 136880                                 | 113208                              |
| Memory (M10k)                             | 17,260                                 | 15100                               |
| DSP Blocks                                | 1045                                   | 920                                 |
|                                           | 2x ARM processor at 1GHz               | Software NIOS II at 200 MHz**       |
| Logic use for soft CPUs                   |                                        | 4753                                |
| Transfert TCP/IP (as today)               | >400 Mbits tested point-to-point       | 40 -100 Mbits max 🦳                 |
|                                           |                                        |                                     |
| memory controller                         | 3 hard memory controllers              | 2 hard memory controllers           |
| Processor side                            | 2x 256 x 16bits + 1 x 256 x 16bits ECC | -                                   |
|                                           |                                        |                                     |
| DDR3 SDRAM type                           | MT41K256M16HA-125:E                    | MT41K512M16HA-125:E                 |
| Organisation                              | 256 M x 16                             | 512 M x 16                          |
| FPGA side                                 | 4x 256M x 16bits                       | 2x 512M x 16 bits                   |
| memory total in bytes                     | 2048 Mbytes                            | 2048 Mbytes                         |
| shared with program RAM                   | no                                     | yes                                 |
| Nbr of measurement saved (worst case SPS) | 2048/336 = 6                           | 2048/336= 6                         |
| Maximum theoritical transfert             | 4 x 1600 Mword = 12.8 Gbyte            | 2 x 1600 Mword = 6.4 Gbyte          |
| Implemented interface tested              | 4 x 800 Mword = 6.4 Gbyte/s            | Extrapolation: 2x800/4 = 0.8 Gbit/s |
|                                           |                                        |                                     |
| SPS: 336 Mbytes burst read time           | 0.0525                                 | 0.105                               |
| PSB: 190 Mbyte burst read time            | 0.0296875                              | 0.059375                            |
|                                           |                                        |                                     |

With NIOS Softcores

\*As today on the started kit \*\*Altera "Nios II Performance Benchmarks" 16.12.2015



slides

- FPGA logic elements:
   ok for today's implementation: with the 3 options
  - Future implementation:

Depends on processing complexity

- More flexibility using ARM CPU
- External memory potential limitation
- VFC TCP/IP Data transfer will probably be 4 to 10x slower than today (400 Mbits, 100 to 40 Mbits)



# **Overview:** External systems connections

Digital architecture 1) FPGA internal resources 2) Board interconnects 3) External memory





# External memory depth requirement

# Depends on the use cases

- Scans duration change a lot with speeds 20 [m/s] -> 48 [ms] (767 pts) 1 [m/s] -> 570 [ms] (9119 pts)
- Time between IN and OUT (can we limit this time to 1s for exemple?)
- Number of scans per user: min. 2 if we limit INOUT time to get same functionality max. determined by memory depth, mode of operation (expert/op), required repetition rate (can we fix it?)
- For SPS:

With time between IN and OUT of 1s Worst Case: 2048/336 = 6 scans (full record OPS and resolver data) Best Case: 2048/ 5 = 409 scans (no offline processing of OPS/Res)

• VFC or for custom options have the same memory depth (2048 Mbytes)









355ms to transfer 101 Mbytes => 298 Mbit/s Not possible with TCP/IP and VFC Needs full implementation using VME (Phase 2)





# Memory depth for the PSB

# Expert mode (detailed data)

| Tangential speed [m/s]Angular speed (PSB)movement duration [s]max. INOUtfeedback + wire data [Mbits]Optical encoder [Mbits]resolver raw [Mbits]201330.040.5120.33360.11360.11 | total [Mbits]<br>740.54<br>765.65 | 92.57  |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------|--------|
|                                                                                                                                                                               |                                   |        |
|                                                                                                                                                                               | 765.65                            | 05 71  |
| 15 100 0.053 0.504 21.02 372.31 372.31                                                                                                                                        |                                   | 95.71  |
| 10         67         0.0785         0.491         22.33         395.51         395.51                                                                                        | 813.34                            | 101.67 |
| 1 6.7 0.485 0.245 41.86 741.58 741.58                                                                                                                                         | 1525.02                           | 190.63 |
| 20 133 0.04 0.51 20.33 0.58 0.29                                                                                                                                              | 21.20                             | 2.65   |
| 15 100 0.053 0.504 21.02 0.60 0.30                                                                                                                                            | 21.92                             | 2.74   |
| 10 67 0.0785 0.491 22.33 0.64 0.32                                                                                                                                            | 23.28                             | 2.91   |
| 1 6.7 0.485 0.245 41.86 1.20 0.60                                                                                                                                             | 43.66                             | 5.46   |

- 1) One measurement cycle (one IN, one OUT)
- 2) Data recorded continuously between in and out movement
- 3) INOUT time calculated for IN=275ms, OUT=805ms
- 4) Expert mode => Motion data and raw encoders data storage Will be used until we have the optical position sensor digitalised in the VME
- 5) Op Mode => Motion data and processed encoders data storage

Operational mode (OPS & RESOLVER processed)

> IN-OUT delay used for the calculation: 20 m/s => 805-20-275=510ms 15m/s => 805-53/2-275=504ms 10m/s => 805-79/2-275=491ms 1m/s => 805-570/2-275=245ms







- Multiple users with different durations 1.2s to >20s
- Challenge arises when managing multiple scanning cycle per user
- <u>Clear use cases must be given by</u> <u>OP/ABP</u> to calculate the among of produced data and what data reduction we need to apply:
  - at the FPGA levels (ID and/or AS)
  - at the VME CPU levels







# Memory depth for the SPS

**Digital architecture** 1) FPGA internal resources 2) Board interconnects (detailed data)

| Tangantial speed [m/s]    | Angular spood (SDS)  | movement duration [s] | INOUT | feedback + wire data [Mbits] | Optical encoder [Mbits]  | resolver raw [Mbits]   | total [Mbits]  | [Mbyte]  |
|---------------------------|----------------------|-----------------------|-------|------------------------------|--------------------------|------------------------|----------------|----------|
| Taligential speed [III/s] | Aliguiai speeu (SPS) | movement duration [s] | INCOT | reeuback + wire data [wbits] | Optical encoder [wibits] | Tesolvel Taw [IVIDITS] | total [wibits] |          |
| 20                        | 110                  | 0.048                 | 1     | 37.76                        | 668.95                   | 668.95                 | 1375.65        | 171.96   |
| 15                        | 82                   | 0.064                 | 1     | 38.87                        | 688.48                   | 688.48                 | 1415.82        | 176.98   |
| 10                        | 55                   | 0.0933                | 1     | 40.88                        | 724.24                   | 724.24                 | 1489.37        | 186.17   |
| 1                         | 5.5                  | 0.57                  | 1     | 73.73                        | 1306.15                  | 1306.15                | 2686.04        | 335.75   |
| 20                        | 110                  | 0.048                 | 1     | 37.76                        | 1.08                     | 0.54                   | 39.38          | 4.92     |
| 15                        | 82                   | 0.064                 | 1     | 38.87                        | 1.11                     | 0.56                   | 40.53          | 5.07     |
| 10                        | 55                   | 0.0933                | 1     | 40.88                        | 1.17                     | 0.58                   | 42.64          | 5.33     |
| 1                         | 5.5                  | 0.57                  | 1     | 73.73                        | 2.11                     | 1.05                   | 76.89          | 9.61     |
|                           |                      |                       |       |                              |                          |                        |                | 5 at 1 a |
| Tangential speed [m/s]    | Angular speed (SPS)  | movement duration [s] |       | feedback + wire data [Mbits] | Optical encoder [Mbits]  | resolver raw [Mbits]   | total [Mbits]  | [Mbyte]  |
| 20                        | 110                  | 0.048                 | 10    | 347.86                       | 6162.11                  | 6162.11                | 12672.08       | 1584.01  |
| 15                        | 82                   | 0.064                 | 10    | 348.96                       | 6181.64                  | 6181.64                | 12712.24       | 1589.03  |
| 10                        | 55                   | 0.0933                | 10    | 350.98                       | 6217.41                  | 6217.41                | 12785.80       | 1598.22  |
| 1                         | 5.5                  | 0.57                  | 10    | 383.83                       | 6799.32                  | 6799.32                | 13982.47       | 1747.81  |
| 20                        | 110                  | 0.048                 | 10    | 347.86                       | 9.94                     | 4.97                   | 362.77         | 45.35    |
| 15                        | 82                   | 0.064                 | 10    | 348.96                       | 9.97                     | 4.99                   | 363.92         | 45.49    |
| 10                        | 55                   | 0.0933                | 10    | 350.98                       | 10.03                    | 5.01                   | 366.02         | 45.75    |
| 1                         | 5.5                  | 0.57                  | 10    | 383.83                       | 10.97                    | 5.48                   | 400.28         | 50.04    |

- Only one measurement cycle (one IN, one OUT) 1)
- Data recorded continuously between in and out movement 2)

**Operational mode** (OPS & RESOLVER processed)

Expert mode

- Large data grow due to INOUT delay => Can we limit INOUT to a maximum of 1s and do multi-scans per cycle? 3)
- Expert mode => Motion data and raw encoders data storage 4) Will be used until we have the optical position sensor digitalised in the VME
- Op Mode => Motion data and processed encoders data storage or - J.Emery - 19.05.2016 5)



# External memory access organisation

<u>Digital architecture</u>
1) FPGA internal resources
2) Board interconnects
3) External memory

Is the external memories connections could be a limitation?







Is the FPGA selection will determine the system and Firmware testability?

<u>Simulation level:</u>

Most of the final code written in VHDL => Verification (VHDL, SystemVerilog, Simulink) on simulator for all options.

<u>Component level:</u>

JTAG (and JTAG link) probing will be available on all options in the lab and on fields prototypes.

• <u>System level</u>: lab debugging and field validations: Same method could be used (TCP/IP access to large internal data with expert application), transfer rate will vary.





# Code reuse between options?

Is the FPGA selection will determine code reusability?

- Yes partially, because we have already large working code (option 2 and 3)
- No, because all main functionalities will be in VHDL (reusable for all options)
- Not really, not much reuse of existing VFC code for all options (no need of VME, BST, etc ...)



Design process 1) Testability 2) Code reuse 2) Methodology



# Design related criteria

Is the FPGA selection will drive design methodology?

- Yes, hardware processors can allow on-line prototype of processing to run in real time.

Methodology:

- Prototyping data processing in MatLab on existing raw data 1)
- 2) Simulink modelling of the algorithms and test on Dspace
- 3) Implementation in C => Fast to write in C and test on real system Needs fast processing units Needs memory access to data being recorded
- Final version must be in VHDL: 4)
  - Parallel processing independent to any OS or other running tasks
     Powerful verification in simulation

  - Powerful tools to do verification on the FPGA
  - But: Long development & verification time





Design process 1) Testability 2) Code reuse 2) Methodology

- Position, speed and torque precise controlling fully written in VHDL first version operational for 2016 Second version foreseen in 2017 (improve precision and flexibility)
- On-line data processing and fault detection
   Will be used for survey system conditions (mechanical, electrical, controls)
   Prototyping foreseen in Simulink/MatLab and C in the drive
   Implemented in VHDL for critical ones, leave
- Special functionalities

Processing/Area to foresee for future functionalities: Tails measurements procedure, <u>Delayed multi-scans (reconstruct small beams)</u>, vibrations on-line compensation.





- The 3 options where compared for their compatibility with existing hardware, firmware and future needs.
- The table next slide benchmark the 3 options between them.
- This doesn't mean we can't use one of the 3 to build the scanner.

We will adapt the hardware, firmware and specification based on the selected option by BI management.







# Summary table

| Criteria | Option 1:    | Option 2: | Option 3: |
|----------|--------------|-----------|-----------|
|          | VFC modified | Custom    | DevKit    |





- Possible to run in parallel with different options
- Evaluate VFC, then decide
- ?





# Additional slides

BWS discussion on the FPGA platform - J.Emery - 19.05.2016





https://www.altera.com/products/boards\_and\_kits/dev-kits/altera/kit-arria-v-soc.html



https://www.altera.com/products/boards and kits/dev-kits/altera/arria-10-soc-development-kit.html

# Profiles: online calculation vs pre-calculated

- Operate at different top speed 20 [m/s] -> 48 [ms] (767 pts) 1 [m/s] -> 570 [ms] (9119 pts)
- Today pre-calculated into 3 tables included in the FPGA as ROM (safe).
- Needs 3 tables for each preset
- Alternative: Online calculation based on system properties. (3 parameters to play with: Jmax, duration, ratio acc/cst speed).

Optimised iterative Algorithm to be written in C and monitored by FPGA





# Condition monitoring: Survey all system variables





- Condition monitoring and decision in real time.
- To react to unexpected even during a movement
- Large number of parameters to take into account
- Target reaction time within one feedback period 62.5us: 12k instructions Nios 62k instructions ARM



# Feedback implementation in VHDL

CÉRN





| Connection                            | ns Name                                        | Description                          | Export                    | Clock             | Base                 | End         | IRQ | 2 |
|---------------------------------------|------------------------------------------------|--------------------------------------|---------------------------|-------------------|----------------------|-------------|-----|---|
|                                       | ■ hps_0                                        | Arria V/Cyclone V Hard Processor Sy. |                           |                   |                      |             |     |   |
| Ŷ                                     | -D- f2h_cold_reset_req                         | Reset Input                          | hps_0_f2h_cold_reset_req  |                   |                      |             |     |   |
| ¢                                     | -D- f2h_debug_reset_req                        | Reset Input                          | hps_0_f2h_debug_reset_req |                   |                      |             |     |   |
| ¢                                     | −⊃− f2h_warm_reset_req                         | Reset Input                          | hps_0_f2h_warm_reset_req  |                   |                      |             |     |   |
|                                       |                                                | Conduit                              | hps_0_f2h_stm_hw_events   |                   |                      |             |     |   |
|                                       | memory                                         | Conduit                              | memory                    |                   |                      |             |     |   |
| 0                                     |                                                | Conduit                              | hps_io                    |                   |                      |             |     |   |
|                                       | h2f_reset                                      | Reset Output                         | Double-click to export    |                   |                      |             |     |   |
|                                       | h2f_axi_dock                                   | Clock Input                          | Double-click to export    | pll_50_outclk0    |                      |             |     |   |
|                                       | ──── h2f_axi_master                            | AXI Master                           | Double-click to export    | [h2f_axi_clock]   |                      |             |     |   |
|                                       | f2h_axi_dock                                   | Clock Input                          | Double-click to export    | pll_50_outclk0    |                      |             |     |   |
| ↓ \$-\$-                              | → f2h_axi_slave                                | AXI Slave                            | Double-click to export    | [f2h_axi_clock]   | =                    |             |     |   |
|                                       | h2f_lw_axi_dock                                | Clock Input                          | Double-click to export    | pll_50_outclk0    |                      |             |     |   |
|                                       | ── h2f_lw_axi_master                           | AXI Master                           | Double-click to export    | [h2f_lw_axi_dock] |                      |             |     |   |
| ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ | $\rightarrow$ $\blacksquare$ clk_100           | Clock Source                         |                           | exported          |                      |             |     |   |
|                                       | → 🕀 pll_50                                     | Altera PLL                           |                           | clk_100           |                      |             |     |   |
| ↓ ↓ ↓                                 | → <b> </b>                                     | Clock Source                         |                           | exported          |                      |             |     |   |
| │   <b>◇</b> → <b>◆</b> ─             | →      Motion_Controller_0                     | Motion Controller                    |                           | pll_50_outclk0    | multiple             | multiple    |     |   |
|                                       | DMA_DDR3_Motion_Controller                     | DMA_DDR3                             |                           | pll_50_outclk0    |                      |             |     |   |
|                                       | → 🕀 Info_0                                     | Info                                 |                           | pll_50_outclk0    |                      |             |     |   |
|                                       | →                                              | DDR3 SDRAM Controller with UniPHY    |                           | multiple          | multiple             | multiple    |     |   |
|                                       | →                                              | DDR3 SDRAM Controller with UniPHY    |                           | multiple          | multiple             | multiple    |     |   |
|                                       | → <b> </b>                                     | AD9257                               |                           | pll_50_outclk0    |                      |             |     |   |
|                                       | →                                              | DMA_DDR3_Fast_ADC                    |                           | multiple          |                      | 0x0000_01c3 |     |   |
|                                       | →      DMA_DDR3_Fast_ADC_ch3                   | DMA_DDR3_Fast_ADC                    |                           | multiple          |                      | 0x0000_01c7 |     |   |
|                                       | →      DMA_DDR3_Fast_ADC_ch4                   | DMA_DDR3_Fast_ADC                    |                           | multiple          |                      | 0x0000_01cb |     |   |
|                                       | Resolver_to_digital_converter                  | SPI_interface_AD2S1210               |                           | pll_50_outclk0    |                      |             |     |   |
| • • • •                               | →      Current_acquisition                     | current_acquisition_top              |                           | pll_50_outclk0    |                      |             |     |   |
|                                       | →                                              | ops_laser_mon_top                    |                           | pll_50_outclk0    |                      |             |     |   |
|                                       | →                                              | wire_meas_top                        |                           | pll_50_outclk0    |                      |             |     |   |
| │ │ � ─ � ─ <b>●</b> ──               | $\rightarrow$ $\blacksquare$ DMA_DDR3_ADC161_0 | DMA_DDR3_ADC161                      |                           | pll_50_outclk0    |                      | 0x0000_01bf |     |   |
| ↓ ↓ ↓                                 | →      DAC_resolver                            | Resolver_top                         |                           | pll_50_outclk0    |                      |             |     |   |
| ∲∲                                    | →      DAC_daisy_chain_0                       | DAC_daisy_chain                      |                           | pll_50_outclk0    | 0x0000_01d0          | 0x0000_01df |     |   |
|                                       | →      Power_up_sequencer_0                    | Power_up_sequencer                   |                           | pll_50_outclk0    |                      |             |     |   |
| ☆☆∳                                   | → 🕀 Token                                      | slave_template                       |                           | pll_50_outclk0    | <b>■ 0x0000_0200</b> | 0x0000_023f |     |   |
|                                       | DMA_ADC1615626                                 | DMA_ADC                              |                           | unconnected       |                      |             |     |   |
|                                       | onchip_memory_data_ADC1615626                  | On-Chip Memory (RAM or ROM)          |                           | unconnected       | =                    |             |     |   |
|                                       | DMA_SPI_interface_AD2S1210                     | DMA_ADC                              |                           | unconnected       |                      |             |     |   |
|                                       | onchip_memory_data_AD251210                    | On-Chip Memory (RAM or ROM)          |                           | unconnected       | =P                   |             |     |   |
|                                       |                                                | DMA_PWM                              |                           | unconnected       |                      |             |     |   |
|                                       | onchip_memory_data_PWM                         | On-Chip Memory (RAM or ROM)          |                           | unconnected       | <b>■</b>             |             |     |   |
|                                       | serial flash loader 0                          | Altera Serial Flash Loader           |                           |                   |                      |             |     |   |

| hrd_5astfd5k3 > mem_if_ddr3_emif_1                    |                     |                                 | ghr    | d_5astfd5k3 > mem_if_ddr3_emif_1                                       |                                                                 | ghrd_5astfd5k3 >  | > mem_if_ddr3_em  | nif_1                    |             |
|-------------------------------------------------------|---------------------|---------------------------------|--------|------------------------------------------------------------------------|-----------------------------------------------------------------|-------------------|-------------------|--------------------------|-------------|
| DR3 SDRAM Controller with Un<br>tera_mem_if_ddr3_emif | iPHY                |                                 |        | DR3 SDRAM Controller Wi<br>ra_mem_if_ddr3_emif                         | ith UniPHY                                                      | DDR3 SDR          |                   | ller with UniP           | 'HΥ         |
|                                                       |                     |                                 |        | Parameters                                                             |                                                                 |                   |                   |                          |             |
|                                                       |                     |                                 |        |                                                                        | niPHY produces unencrypted PHY and Controller HDL,              | Parameters        |                   |                          |             |
| r Parameters                                          |                     |                                 | co     | onstraint scripts, an example design and                               | a testbench for simulation.                                     |                   |                   | er with UniPHY produc    |             |
| Generation of the DDR3 Controller with UniPHY proc    |                     | PHY and Controller HDL,         | -      | Interface Type                                                         |                                                                 | constraint script | s, an example de  | sign and a testbench     | for simulat |
| constraint scripts, an example design and a testber   | ch for simulation.  |                                 | F      | Enable Hard External Memory Interfa                                    | ce                                                              | ▼ Interface Ty    | ne                |                          |             |
|                                                       |                     |                                 |        | PHY Settings Memory Parameters Me                                      | mory Timing   Board Settings   Controller Settings   Diagnostic |                   | d External Memory | v Interface              |             |
| Interface Type                                        |                     |                                 |        | Apply memory parameters from the ma                                    | anufacturer data sheet                                          |                   |                   |                          |             |
| Enable Hard External Memory Interface                 |                     |                                 |        | Apply device presets from the preset li                                |                                                                 | PHY Settings      | Memory Paramet    | ters Memory Timing       | Board Se    |
| PHY Settings Memory Parameters Memory Timi            | Read Calling        | Controller Cottings Discourse   |        | Memory vendor:                                                         | Micron 👻                                                        | Apply timing      | parameters from   | the manufacturer da      | ata sheet   |
| Memory Parameters Memory Timi                         | ng   Board Settings | Controller Settings Diagnostics |        | Memory format:                                                         | Discrete Device 👻                                               |                   |                   | e preset list on the rig |             |
| General Settings                                      |                     |                                 |        | Memory device speed grade:                                             | 800.0 V MHz                                                     | tIS (base):       | 170               | ps                       |             |
| Speed Grade:                                          | 3 👻                 |                                 |        | Total interface width:                                                 | 32                                                              | tIH (base):       | 120               | ps                       |             |
|                                                       |                     |                                 |        | DQ/DQS group size:                                                     | 8 🗸                                                             | tDS (base):       | 10                | DS                       |             |
| Generate PHY only                                     |                     |                                 |        | Number of DQS groups:<br>Number of chip selects:                       | 4                                                               | tDH (base):       |                   | ps                       |             |
| ▼ Clocks                                              |                     |                                 |        | Number of clocks:                                                      | 1 -                                                             | tDQSQ:            | 100               | ].                       |             |
| Memory clock frequency:                               |                     |                                 |        | Row address width:                                                     | 1 -                                                             | tQH:              |                   | ps                       |             |
|                                                       | 400.0               | MHz                             |        | Column address width:                                                  | 9                                                               |                   | 0.38              | cycles                   |             |
| Achieved memory clock frequency:                      | 400.0               | MHz                             |        | Bank-address width:                                                    | 3                                                               | tDQSCK:           | 255               | ps                       |             |
| PLL reference clock frequency:                        | 100.0               | MHz                             |        | Enable DM pins                                                         |                                                                 | tDQSS:            | 0.27              | cycles                   |             |
| Rate on Avalon-MM interface:                          | Full 👻              |                                 |        |                                                                        |                                                                 | tQSH:             | 0.4               | cycles                   |             |
|                                                       |                     |                                 |        | Memory Initialization Options<br>Mirror Addressing: 1 per chip select: |                                                                 | tDSH:             | 0.18              | cycles                   |             |
| Achieved local clock frequency:                       | 400.0               | MHz                             |        | Address and command parity                                             | 0                                                               | tDSS:             | 0.18              | cycles                   |             |
| Enable AFI half rate clock                            |                     |                                 |        |                                                                        |                                                                 | tINIT:            | 500               | us                       |             |
|                                                       |                     |                                 |        | Mode Register 0<br>Read Burst Type:                                    | Securatial                                                      | tMRD:             | 4                 | cycles                   |             |
| Advanced PHY Settings                                 |                     |                                 |        | DLL precharge power down:                                              | Sequential   DLL off                                            | tRAS:             | 35.0              | ns                       |             |
| Advanced clock phase control                          |                     |                                 |        | Memory CAS latency setting:                                            | 7 -                                                             | tRCD:             | 13.75             | ns                       |             |
| Additional address and command clock phase:           | 0.0                 | Degrees                         |        | Mode Register 1                                                        |                                                                 | tRP:              | 13.75             | ns                       |             |
| Supply Voltage:                                       | 1.5V DDR3 -         | 2-3,                            |        | Output drive strength setting:                                         | RZQ/6 👻                                                         | tREFI:            | 7.8               | us                       |             |
| I/O standard:                                         |                     |                                 |        | Memory additive CAS latency setting                                    |                                                                 | tRFC:             | 260.0             | ns                       |             |
|                                                       | SSTL-15 👻           |                                 |        | ODT Rtt nominal value:                                                 | RZQ/4 ▼                                                         | tWR:              | 15.0              | ns                       |             |
| PLL sharing mode:                                     | No sharing 👻        |                                 |        | Mode Register 2<br>Auto selfrefresh method:                            |                                                                 | tWTR:             | 4                 | cycles                   |             |
| DLL sharing mode:                                     | No sharing 👻        |                                 |        | Selfrefresh temperature:                                               | Manual 👻                                                        | tFAW:             | 35.0              | ns                       |             |
| OCT sharing mode:                                     | Slave 👻             |                                 |        | Memory write CAS latency setting:                                      |                                                                 | tRRD:             | 6.0               | ns                       |             |
| Reconfigurable PLL Location:                          | Top_Bottom 👻        | BWS discussion on               | the FP | GA platform - J.Emery                                                  |                                                                 | tRTP:             | 7.5               | ns                       |             |
|                                                       |                     |                                 |        | 1                                                                      |                                                                 |                   |                   |                          |             |

Q

#### ghrd\_5astfd5k3 > mem\_if\_ddr3\_emif\_1

## DDR3 SDRAM Controller with UniPHY

altera\_mem\_if\_ddr3\_emif

### Parameters

Generation of the DDR3 Controller with UniPHY produces unencrypted PHY and Controller HDL, constraint scripts, an example design and a testbench for simulation.

### Interface Type

Enable Hard External Memory Interface

PHY Settings Memory Parameters Memory Timing Board Settings Controller Settings Diagnostics

Use the Board Settings to model the board-level effects in the timing analysis.

The wizard supports single- and multi-rank configurations. Altera has determined the effects on the output signaling of these configurations and has stored the effects on the output slew rate and the channel uncertainty within the UniPHY MegaWizard.

These values are representative of specific Altera boards. You must change the values to account for the board level effects for your board. You can use HyperLynx or similar simulators to obtain values that are representative of your board.

### Setup and Hold Derating

You can specify the slew rate of the output signals to refer to their effect on the setup and hold times of both the address and command signals and the DQ signals, or specify the setup and hold times directly.

Derating method:

Ose Altera's default settings

Specify slew rates to calculate setup and hold times

| ťly |
|-----|
|     |

| CK/CK# slew rate (Differential):   | 2.0   | V/ns |
|------------------------------------|-------|------|
| Address and command slew rate:     | 1.0   | V/ns |
| DQS/DQS# slew rate (Differential): | 2.0   | V/ns |
| DQ slew rate:                      | 1.0   | V/ns |
| tIS:                               | 0.32  | ns   |
| tIH:                               | 0.22  | ns   |
| tDS:                               | 0.16  | ns   |
| tDH:                               | 0.145 | ns   |

### Channel Signal Integrity

Channel Signal Integrity is a measure of the distortion of the eye due to intersymbol interference or crosstalk or other effects. Typically when going from a single-rank configuration to a multi-rank configuration there is an increase in the channel loss as there are multiple stubs causing reflections. Please perform your channel signal integrity simulations and enter the extra channel uncertainty as compared to Altera's reference eye diagram.

| Dera | ting | Metho | od: |
|------|------|-------|-----|
| Dera | unq  | PIEUK | Ju. |

Use Altera's default settings

|                                            | Specify channel uncertainty value |    |  |
|--------------------------------------------|-----------------------------------|----|--|
| Address and command eye reduction (setup): | 0.0                               | ns |  |
| Address and command eye reduction (hold):  | 0.0                               | ns |  |
| Write DQ eye reduction:                    | 0.0                               | ns |  |
| Write Delta DQS arrival time:              | 0.0                               | ns |  |
| Read DQ eye reduction:                     | 0.0                               | ns |  |
| Read Delta DQS arrival time:               | 0.0                               | ns |  |
|                                            |                                   |    |  |

### Board Skews

PCB traces can have skews between them that can cause timing margins to be reduced. Furthermore skews between different ranks can further reduce the timing margin in multi-rank topologies.

| Restore default values                                       |       |    |
|--------------------------------------------------------------|-------|----|
| Maximum CK delay to DIMM/device:                             | 0.6   | ns |
| Maximum DQS delay to DIMM/device:                            | 0.6   | ns |
| Minimum delay difference between CK and DQS:                 | -0.01 | ns |
| Maximum delay difference between CK and DQS:                 | 0.01  | ns |
| Maximum skew within DQS group:                               | 0.02  | ns |
| Maximum skew between DQS groups:                             | 0.02  | ns |
| Average delay difference between DQ and DQS:                 | 0.0   | ns |
| Maximum skew within address and command bus:                 | 0.02  | ns |
| Average delay difference between address and command and CK: | 0.0   | ns |

| ghrd 5astfd5k3 > mem if ddr3 emif | emif 1 | ddr3 | if | > mem | 5astfd5k3 | ahrd |
|-----------------------------------|--------|------|----|-------|-----------|------|
|-----------------------------------|--------|------|----|-------|-----------|------|

### DDR3 SDRAM Controller with UniPHY

altera\_mem\_if\_ddr3\_emif

#### Parameters

Generation of the DDR3 Controller with UniPHY produces unencrypted PHY and Controller HDL, constraint scripts, an example design and a testbench for simulation.

### Interface Type

Enable Hard External Memory Interface

PHY Settings Memory Parameters Memory Timing Board Settings Controller Settings Diagnostics

cycles

### \* Avalon Interface

Generate power-of-2 data bus widths for Qsys or SOPC Builder

Generate SOPC Builder compatible resets

Maximum Avalon-MM burst length: 4 🗸

Enable Avalon-MM byte-enable signal

#### \* Low Power Mode

Enable Self-Refresh Controls

Enable Auto Power-Down

Auto Power-Down Cycles:

### Efficiency

Enable User Auto-Refresh Controls

Command Queue Look-Ahead Depth: 4

Enable Auto-Precharge Control

Local-to-Memory Address Mapping: CHIP-ROW-BANK-COL 🗸

Enable Reordering

Starvation limit for each command: 10 commands

Internal (JTAG) 🚽

4 🗸

### Configuration, Status and Error Handling

Enable Configuration and Status Register Interface

CSR port host interface:

Enable Error Detection and Correction Logic

Enable Auto Error Correction

### Multiple Port Front End

### Export bonding interface

### Expand Avalon MM data for ECC

Number of ports:

| ead-Data FIFO |
|---------------|
|               |
|               |
|               |
|               |
|               |

ghrd\_5astfd5k3 > mem\_if\_ddr3\_emif\_1

### DDR3 SDRAM Controller with UniPHY altera\_mem\_if\_ddr3\_emif

### Parameters

Generation of the DDR3 Controller with UniPHY produces unencrypted PHY and Controller HDL, constraint scripts, an example design and a testbench for simulation.

### Interface Type

### Enable Hard External Memory Interface

PHY Settings Memory Parameters Memory Timing Board Settings Controller Settings Diagnostics

### Simulation Options

Auto-calibration mode:

Skip Memory Initialization Delays

Enable verbose memory model output

Enable support for Nios II ModelSim flow in Eclipse

### Debugging Options

Debugging feature set:

Option 1 🚽

Full calibration 🚽

| Feature Set  | Included Debugging Features                                                                                     | Additional<br>Utilization               |
|--------------|-----------------------------------------------------------------------------------------------------------------|-----------------------------------------|
| No Debugging | None                                                                                                            | None                                    |
| Option 1     | Connectivity to the EMIF toolkit allowing you to display information about your interface and generate reports. | +600 Registers<br>+700 ALUTs<br>+8 M9Ks |

### Enable EMIF On-Chip Debug Toolkit

EMIF On-Chip Debug Toolkit interface type: Avalon-MM Slave 👻

### Efficiency Monitor and Protocol Checker Settings

The Efficiency Monitor and Protocol Checker is used to measure efficiency on the Avalon interface between the Traffic Generator and the Controller. It will also perform protocol checking on the bus.

Enable the Efficiency Monitor and Protocol Checker on the Controller Avalon Interface

BWS discussion on the FPGA platform - J.Emery - 19.05.2016