# Design of an FPGA-based USB 3.0 device controller

Zhe Ning, Yunhua Sun

*Abstract*—Traditional FPGA-based USB 3.0 communication relies on an external chip, either a standalone USB PHY or a USB controller that integrates a PHY. This paper implements a USB 3.0 controller entirely with FPGA resources: FPGA logic implements the serial interface engine, and an internal FPGA transceiver serves as the USB PHY. After implementation, the design occupies 4.59% of the slices of a Kintex-7 325T. Tests show that both bulk-in and bulk-out transfers exceed 320 MB/s.

Index Terms—FPGA, Transceivers, USB 3.0

## I. INTRODUCTION

BECAUSE FPGA-based instruments use high-precision, high-speed components, they require high-speed data transmission. Standard transmission schemes include 125 MB/s (Gigabit) Ethernet and USB 3.0. A widely used example of 125 MB/s Ethernet is SiTCP [1], which is built on a hardware protocol stack plus an external PHY. USB 3.0, however, has a raw signaling rate of 5 Gbit/s (625 MB/s before 8b/10b encoding overhead), well above the 125 MB/s of Ethernet. Readout electronics integrated with photomultiplier tubes (PMTs) [2] [3] are increasingly popular, but the printed circuit board (PCB) area is limited by the size of the PMT and by power dissipation. At the same time, if the PMT dark rate, which is nearly 50 kHz [4], is to be studied in depth, the required transmission bandwidth is high. Because the waveform sampling window is generally 1000 points, the data rate is 1000 points × 2 bytes/point × 50 kHz = 100 MB/s, which is very close to the 125 MB/s limit of Ethernet. USB is therefore the more attractive option, but most USB schemes rely on an external USB chip.
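The bandwidth estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are illustrative, the input figures are the ones quoted in the text, and the 8b/10b payload ceiling is derived from the 5 Gbit/s signaling rate rather than taken from the paper.

```python
# Back-of-the-envelope bandwidth check for PMT dark-rate readout.
# Input figures come from the text; all names are illustrative.
POINTS_PER_WINDOW = 1000      # samples per waveform window
BYTES_PER_POINT = 2           # sample width in bytes
DARK_RATE_HZ = 50_000         # PMT dark rate, about 50 kHz

GIGABIT_ETHERNET_MBPS = 125   # 1 Gbit/s expressed in MB/s
USB3_RAW_MBPS = 625           # 5 Gbit/s expressed in MB/s
USB3_AFTER_8B10B_MBPS = USB3_RAW_MBPS * 8 // 10  # 8b/10b line coding


def required_mb_per_s(points, bytes_per_point, rate_hz):
    """Return the sustained data rate in MB/s (1 MB = 10**6 bytes)."""
    return points * bytes_per_point * rate_hz / 1e6


need = required_mb_per_s(POINTS_PER_WINDOW, BYTES_PER_POINT, DARK_RATE_HZ)
print(f"required: {need:.0f} MB/s")
print(f"Ethernet headroom: {GIGABIT_ETHERNET_MBPS - need:.0f} MB/s")
print(f"USB 3.0 headroom after 8b/10b: {USB3_AFTER_8B10B_MBPS - need:.0f} MB/s")
```

Protocol overhead is ignored on both links, so the real headroom is smaller than the numbers printed here.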

As shown in Figure 1, there are several typical schemes for USB on FPGA boards. In Architecture A, the PHY, a serial interface engine and drivers are packaged into one chip, such as the Cypress CYUSB3014 [5] or the FTDI FT600 [6], so that user logic in the FPGA can exchange data with the chip quickly. These chips are also called USB-FIFO chips, but they lack scalability and cannot act as a USB host. Architecture B, whose external chip contains only a PHY, is more scalable but also more complicated, because the other layers, such as the serial interface engine (SIE), are implemented in FPGA logic. The most common external chip in Architecture B is the TUSB1310A [7] from Texas Instruments. Unfortunately, this chip has been discontinued and there is no replacement. This paper discusses Architecture C, an upgrade of Architecture B that uses an FPGA internal transceiver as the USB PHY in place of the TUSB1310A. The resulting smaller PCB area makes the readout electronics easier to fit to a PMT.

Manuscript received 19 May 2024. This project was supported by National Natural Science Foundation of China (No. 12375192), Beijing Natural Science Foundation (Grant No. 1214029) and the State Key Laboratory of Particle Detection and Electronics, SKLPDE-ZZ(KF)-202308.

Zhe Ning and Yunhua Sun are with State Key Laboratory of Particle Detection and Electronics, 100049, Beijing, China, and also with Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China (ningzhe@ihep.ac.cn).



Figure 1 USB schemes for FPGA

## II. SYSTEM DESIGN

According to Xilinx documentation and support, the internal transceivers of Xilinx FPGAs are not officially supported for use as a USB 3.0 PHY [8], so the primary functions of a USB PHY must be implemented separately. The design therefore has two parts. One is a module that transmits Low-Frequency Periodic Signaling (LFPS), during which the gigabit transceivers are turned off. The other is an SIE module that carries regular USB 3.0 data, during which the gigabit transceivers are turned on.

### A. LFPS definition

The USB 3.0 PHY is similar to the PHYs of PCIe and SATA. It must transmit Low-Frequency Periodic Signaling (LFPS) to initiate a link or to wake up a link partner that is in a low-power link state.

As shown in Figure 2, LFPS has three parameters: tPeriod, tBurst, and tRepeat. LFPS is a square wave, and tPeriod is its period, with a minimum of 20 ns and a maximum of 100 ns. tBurst is the duration of one continuous LFPS burst, and its value depends on the LFPS type, as shown in Table 1 [9]. tRepeat is the interval from the start of one burst to the start of the next: it consists of tBurst plus an electrical idle time, during which the two differential transmission wires are held at the same voltage, and its value also depends on the LFPS type.

The most common LFPS type is polling, whose nominal tBurst is 1.0 µs and nominal tRepeat is 10.0 µs.
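To make the three timing parameters concrete, the following Python sketch models a polling LFPS pattern as one symbol per clock cycle. It is an illustration only: the 125 MHz clock, the half-period of two clock cycles, and the function name `lfps_samples` are assumptions chosen for the example, not values taken from the design.

```python
def lfps_samples(clk_hz, half_period_cycles, t_burst_s, t_repeat_s, n_repeats=1):
    """Yield one symbol per clock cycle.

    'P' and 'N' are the two polarities of the differential square wave
    during a burst; 'I' is electrical idle (both wires at the same voltage).
    """
    burst_cycles = round(t_burst_s * clk_hz)
    repeat_cycles = round(t_repeat_s * clk_hz)
    for _ in range(n_repeats):
        for cycle in range(repeat_cycles):
            if cycle < burst_cycles:
                # Toggle polarity every half_period_cycles clock cycles.
                yield "P" if (cycle // half_period_cycles) % 2 == 0 else "N"
            else:
                yield "I"


CLK_HZ = 125e6          # assumed logic clock, 8 ns per cycle
HALF_PERIOD_CYCLES = 2  # 2 cycles high + 2 cycles low gives tPeriod = 32 ns

samples = list(lfps_samples(CLK_HZ, HALF_PERIOD_CYCLES, 1.0e-6, 10.0e-6))
t_period_ns = 2 * HALF_PERIOD_CYCLES / CLK_HZ * 1e9

print("".join(samples[:8]))                    # start of the burst
print(len(samples), samples.count("I"))        # cycles per tRepeat, idle cycles
print(f"tPeriod = {t_period_ns:.0f} ns")
```

With these assumed values one tRepeat spans 1250 clock cycles, of which the first 125 form the burst, and the resulting tPeriod lies inside the 20 ns to 100 ns window.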



Figure 2 LFPS signaling

Table 1 LFPS Transmitter Timing for SuperSpeed Design

| LFPS type | tBurst minimum | tBurst normal | tBurst maximum | Minimum number of LFPS cycles | tRepeat minimum | tRepeat normal | tRepeat maximum |
|-----------|----------------|---------------|----------------|-------------------------------|-----------------|----------------|-----------------|
| Polling   | 0.6 µs         | 1.0 µs        | 1.4 µs         |                               | 6 µs            | 10 µs          | 14 µs           |
| Ping      | 40 ns          |               | 200 ns         | 2                             | 160 ms          | 200 ms         | 240 ms          |
| tReset    | 80 ms          | 100 ms        | 120 ms         |                               |                 |                |                 |
| U1 Exit   | 600 ns         |               | 2 ms           |                               |                 |                |                 |
| U2 Exit   | 80 µs          |               | 2 ms           |                               |                 |                |                 |
| U3 Wakeup | 80 µs          |               | 10 ms          |                               |                 |                |                 |
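A timing table of this kind maps naturally onto a small lookup structure. The sketch below encodes only the Polling, Ping and tReset rows, in integer nanoseconds to avoid floating-point comparisons, and checks a measured burst against them. The names `TABLE1_NS` and `within_table1` are hypothetical, and the minimum-cycle requirement for Ping is not modelled.

```python
# (tBurst min, tBurst max), (tRepeat min, tRepeat max) in nanoseconds.
# None means that the table gives no tRepeat limits for that row.
TABLE1_NS = {
    "Polling": ((600, 1_400), (6_000, 14_000)),
    "Ping": ((40, 200), (160_000_000, 240_000_000)),
    "tReset": ((80_000_000, 120_000_000), None),
}


def within_table1(kind, t_burst_ns, t_repeat_ns=None):
    """Return True if the measured timing fits the limits for `kind`."""
    burst_limits, repeat_limits = TABLE1_NS[kind]
    ok = burst_limits[0] <= t_burst_ns <= burst_limits[1]
    if repeat_limits is not None and t_repeat_ns is not None:
        ok = ok and repeat_limits[0] <= t_repeat_ns <= repeat_limits[1]
    return ok


print(within_table1("Polling", 1_000, 10_000))   # nominal polling LFPS
print(within_table1("Polling", 1_500, 10_000))   # burst too long
print(within_table1("Ping", 100, 200_000_000))   # nominal ping LFPS
```

A check like this is useful in a testbench or when post-processing oscilloscope captures; it is not part of the controller itself.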

### B. Transmitting and receiving LFPS

As discussed, LFPS must be transmitted in cases such as link initiation or waking up a link partner. This is done by driving certain transmitter ports, shown in Figure 3. When initiating a link, LFPS is generated when txpd = 2'b0 && rxpd = 2'b0 && TXDETECTRX = 1 && TXELECIDLE = 1 [10]. When waking up a link partner, LFPS is generated when txpd = 2'b1 && rxpd = 2'b1 && TXELECIDLE = 0. On the receive side, RXELECIDLE = 0 means that LFPS is detected, and RXELECIDLE = 1 means that no LFPS is detected [11].
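The two generation conditions and the detection rule can be written as simple predicates. The Python sketch below is a behavioural model for checking testbench vectors, not the authors' logic. The function names are hypothetical, and the Verilog literal `2'b1` is read as the integer value 1.

```python
def lfps_tx_mode(txpd, rxpd, txdetectrx, txelecidle):
    """Return which LFPS generation condition from the text is met, if any.

    txpd and rxpd are 2-bit port values; the other arguments are single bits.
    """
    if txpd == 0 and rxpd == 0 and txdetectrx == 1 and txelecidle == 1:
        return "link_initiation"
    if txpd == 1 and rxpd == 1 and txelecidle == 0:
        return "wakeup"
    return None


def lfps_detected(rxelecidle):
    """RXELECIDLE = 0 indicates that LFPS is being received."""
    return rxelecidle == 0


print(lfps_tx_mode(txpd=0, rxpd=0, txdetectrx=1, txelecidle=1))
print(lfps_tx_mode(txpd=1, rxpd=1, txdetectrx=0, txelecidle=0))
print(lfps_tx_mode(txpd=0, rxpd=0, txdetectrx=0, txelecidle=1))
print(lfps_detected(0), lfps_detected(1))
```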

First, a square wave should be generated repeatedly during the tPeriod stages by toggling the wires between a differential 

Figure 4 The tPeriod of LFPS signal captured by an oscilloscope

### C. SIE design

As shown in Figure 5, the SIE comprises three modules: the PIPE interface module, the link module, and the protocol module. The PIPE interface module scrambles and descrambles data. The link module has two parts. One handles link commands and link management, generating link command packets such as LGOOD and LCRD. The other is the Link Training and Status State Machine (LTSSM), which manages transitions from one link state to another, such as between U0 and U1. The protocol module generates the token or data packets that respond to commands from a USB host when the controller acts as a device, or that are sent to a USB device when it acts as a host. For a USB device, an endpoint management module is needed to manage Endpoint 0, which is used for enumeration, and Endpoints 1 and 2, which are used for bulk-in and bulk-out transfers. For a USB host, an enumeration management module is necessary to generate enumeration commands after a device is attached. A device driver module is also needed, depending on the type of device attached to the host. Resource utilization is shown in Figure 6; the design uses 4.59% of the slices of the Kintex-7 325T.
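As an illustration of the scrambling performed in the PIPE interface module, the sketch below models the 16-bit LFSR that the USB 3.0 specification [9] defines for SuperSpeed data, with polynomial X^16 + X^5 + X^4 + X^3 + 1 and seed 0xFFFF. It assumes that the module follows this standard scrambler; the paper does not describe its internal structure, and the class name is hypothetical. The handling of control symbols, which are not scrambled, and of SKP ordered sets is left out.

```python
class Scrambler:
    """Bit-serial LFSR scrambler, G(X) = X^16 + X^5 + X^4 + X^3 + 1."""

    SEED = 0xFFFF
    TAPS = 0x0039  # feedback into LFSR bits 0, 3, 4 and 5

    def __init__(self):
        self.lfsr = self.SEED

    def reset(self):
        """Reload the seed, as is done when a COM symbol is seen."""
        self.lfsr = self.SEED

    def _next_bit(self):
        out = (self.lfsr >> 15) & 1
        self.lfsr = (self.lfsr << 1) & 0xFFFF
        if out:
            self.lfsr ^= self.TAPS
        return out

    def process(self, data: bytes) -> bytes:
        """Scramble or descramble data bytes (the operation is symmetric)."""
        result = bytearray()
        for byte in data:
            key = 0
            for bit in range(8):  # data bit 0 meets the first LFSR output
                key |= self._next_bit() << bit
            result.append(byte ^ key)
        return bytes(result)


payload = b"USB 3.0 bulk data"
scrambled = Scrambler().process(payload)
restored = Scrambler().process(scrambled)
print(Scrambler().process(bytes(4)).hex())  # keystream for all-zero data
print(restored == payload)
```

Because scrambling is an XOR with the LFSR output, the same routine descrambles the data as long as both sides reset their LFSR at the same point in the stream.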



Figure 6 Resource utilization of the USB controller

### D. Initiation process

As shown in Figure 7, in the Rx.Detect state the PHY is told to begin a receiver detection operation by asserting TXDETECTRX while the phystatus signal is low. The PHY then asserts phystatus high to indicate that receiver detection has completed [12], and the state machine enters the Polling state. In this state the transceiver keeps transmitting LFPS while checking whether the LFPS received from the link partner is a polling signal. If it is, the gigabit transceiver is turned on and transmits TSEQ ordered sets to the link partner for a fixed period to train the equalizer. Following this, TS1 and TS2 ordered sets are exchanged as a handshake to finalize the link training and exchange link configuration information. When all of this is done, the state machine moves from Polling to U0. In U0, the USB controller first responds to the enumeration requests from the host and then begins bulk-in or bulk-out transfers.
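The sequence described above can be summarised as a small transition table. The Python sketch below is a deliberately simplified model of the path from Rx.Detect to U0. The state and event names are hypothetical, and timeouts, error recovery and the other LTSSM states of the USB 3.0 specification are omitted.

```python
from enum import Enum, auto


class State(Enum):
    RX_DETECT = auto()
    POLLING_LFPS = auto()   # exchanging polling LFPS bursts
    POLLING_TSEQ = auto()   # sending TSEQ to train the equalizer
    POLLING_TS1 = auto()    # TS1 handshake
    POLLING_TS2 = auto()    # TS2 handshake
    U0 = auto()             # link up, normal operation


# (current state, event) -> next state; any other event leaves the state as is.
TRANSITIONS = {
    (State.RX_DETECT, "receiver_detected"): State.POLLING_LFPS,
    (State.POLLING_LFPS, "polling_lfps_received"): State.POLLING_TSEQ,
    (State.POLLING_TSEQ, "tseq_done"): State.POLLING_TS1,
    (State.POLLING_TS1, "ts1_exchanged"): State.POLLING_TS2,
    (State.POLLING_TS2, "ts2_exchanged"): State.U0,
}


def step(state, event):
    """Return the next state for one event."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.RX_DETECT):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state


bring_up = [
    "receiver_detected",
    "polling_lfps_received",
    "tseq_done",
    "ts1_exchanged",
    "ts2_exchanged",
]
print(run(bring_up))
print(run(["tseq_done"]))  # out-of-order event: stays in RX_DETECT
```

Keeping the transitions in a table rather than nested conditionals makes it straightforward to compare the model against the state diagram one edge at a time.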



Figure 7 LTSSM High-Level States

## III. EXPERIMENT RESULTS

### A. Hardware setup

As shown in Figure 8, a KC705 board [13] is used for verification, and a mezzanine card named HiTech USB [14] adds a USB interface that connects directly to the GTX transceivers of the Kintex-7. More details of this connection are shown in Figure 9. Note that a 0.1 µF capacitor is needed on the RX lines, just as is common on the TX lines. The clock of the USB controller is the output of a phase-locked loop (PLL) whose input is driven by the TXOUTCLK port of the transceiver, and the transceiver reference clock comes from a 200 MHz differential oscillator. Because user logic and the USB controller therefore run in different clock domains, a FIFO is necessary between them.







Figure 9 The FPGA USB 3.0 connector schematics

### B. Speed tests

The speed tests cover bulk-in and bulk-out transfers. They can be run with host software such as the Cypress Streamer program or with the software of a USB analyzer such as the Teledyne LeCroy Advisor T3 [15]. For the bulk-in tests shown in Figure 10 and Figure 11, both the Cypress Streamer program and the Advisor T3 show that the USB speed is more than 320 MB/s. The USB analyzer also reports no bit errors after more than 3 TB of data, so the estimated bit error rate is less than $10^{-13}$, which is below the $10^{-12}$ required by the USB specification. The bulk-out tests shown in Figure 12 and Figure 13 give a similar result.
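The bit error rate estimate can be checked from the analyzer counters. The sketch below uses the byte count and elapsed time reported by the analyzer for the bulk-in test (3.015 TB and 02:29:15), assuming that both refer to the same capture and that 1 TB means 10^12 bytes. It computes the simple estimate of one error over all transferred bits and, as an additional and more conservative figure that is not taken from the paper, the 95% confidence upper bound for a run with zero observed errors.

```python
import math

BYTES_TRANSFERRED = 3.015e12            # analyzer "Bytes" field, 3.015 TB
ELAPSED_S = 2 * 3600 + 29 * 60 + 15     # analyzer "Time From Start", 02:29:15
SPEC_BER = 1e-12                        # USB specification requirement

bits = BYTES_TRANSFERRED * 8

# Simple estimate: a single error would have produced this error rate.
ber_point = 1 / bits

# With zero errors in n bits, BER < -ln(1 - C) / n at confidence level C.
ber_95 = -math.log(1 - 0.95) / bits

avg_mb_per_s = BYTES_TRANSFERRED / ELAPSED_S / 1e6

print(f"bits transferred:  {bits:.3e}")
print(f"simple estimate:   {ber_point:.2e}")
print(f"95% upper bound:   {ber_95:.2e}")
print(f"average rate:      {avg_mb_per_s:.1f} MB/s")
```

Under these assumptions the simple estimate is below 10^-13, in line with the figure quoted above, and the stricter 95% bound still meets the 10^-12 requirement. The average rate over the whole run also exceeds 320 MB/s.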

[Screenshot of the Cypress C++ Streamer window. Endpoint: BULK IN, 16384 bytes, 15 MaxBurst (0x81); packets per xfer: 32; xfers to queue: 16; timeout per xfer: 1500 ms; successes: 2208; failures: 0; transfer rate: 336500 KBps.]

Figure 10 Bulk-in transfer test based on Cypress Streamer program

[Screenshot of the LeCroy Advisor T3 statistics view after 02:29:15 of capture. Endpoint 1 IN: throughput 345.394 MB/s; bytes 3.015 TB; ACK 2958221903; NRDY 0; ERDY 0; DP 2953077047; DP errors 0; TP 2958310096; TP errors 0. Link statistics: LBAD 0; CRC-5 0; CRC-16 0; invalid symbols 0; DP errors 0.]

Figure 11 Bulk-in transfer test based on LeCroy Advisor T3

[Screenshot of the Cypress C++ Streamer window. Endpoint: BULK OUT, 16384 bytes, 15 MaxBurst (0x02); packets per xfer: 32; xfers to queue: 16; timeout per xfer: 1500 ms; failures: 0; transfer rate: 339300 KBps.]

Figure 12 Bulk-out transfer test based on Cypress Streamer program.



Figure 13 Bulk-out transfer test based on LeCroy Advisor T3.

## IV. SUMMARY

This paper demonstrates that a USB 3.0 controller, including both the PHY and the SIE, can be built from FPGA internal transceivers and FPGA logic. The test results show that bulk-in and bulk-out speeds exceed 320 MB/s and that the bit error rate is below 10<sup>-13</sup>. The proposed solution handles USB 3.0 communication with host computers well. However, if USB 2.0 compatibility is needed, an external USB 2.0 chip has to be added to the circuit.

## REFERENCES

- [1] Tomohisa Uchida, "SiTCP Manual," Electronics system group, IPNS, KEK. Japan, [Online]. Available: https://www.sitcp.net/doc/SiTCP\_eng.pdf
- [2] Bandini, A. Brigatti, A. Barresi, et al., "Embedded readout electronics R&D for the large PMTs in the JUNO experiment", Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, Volume 985, 2021, 164600
- [3] Gao, Feng, Kiel, Florian, Kuhlbusch, Tim, Stahl, Achim, Steinmann, Jochen, Wiebusch, Christopher, & Wysotzki, Christian, "The conventional PMT system for OSIRIS," Verhandlungen der Deutschen Physikalischen Gesellschaft, (Aache2019issue), 1.
- [4] Zhonghua Qin, "The 20-inch PMT system for the JUNO experiment", Institute of High Energy Physics, CAS. Beijing, China, [Online]. Available: https://indico.cern.ch/event/686555/contributions/2972208/attachments/1682285/2703107/The20inchPMTSystem-ICHEP2018-V2.pdf
- [5] Infineon. USA, [Online]. Available: https://www.infineon.com/cms/en/product/universal-serial-bus/usb-peripheral-controllers-for-superspeed/ez-usb-fx3-usb-5gbps-peripheral-controller/cyusb3014-bzxi/?redirId=231959&utm\_medium=referral&utm\_source=cypress&utm\_campaign=202110\_globe\_en\_all\_integration-product\_types
- [6] Future Technology Devices International Ltd. USA, [Online]. Available: https://www.ftdichip.com/old2020/Products/ICs/FT600.html
- [7] "TUSB1310A USB 3.0 Transceiver datasheet (Rev. G)," Texas Instruments. USA, [Online]. Available: http://www.ti.com/product/TUSB1310A
- [8] https://www.reddit.com/r/FPGA/comments/mgesks/comment/gstys94/?utm\_source=share&utm\_medium=web2x&context=3
- [9] "Universal Serial Bus 3.0 Specification," USB Implementers Forum, Inc., USA, [Online]. Available: https://www.usb.org/documents
- [10] "7 Series FPGAs GTX/GTH Transceivers," AMD. USA, [Online]. Available: https://www.xilinx.com/support/documentation/user\_guides/ug476\_7Series\_Transceivers.pdf
- [11] https://www.truechip.net/articles-details/low-frequency-periodic-signaling-lfps-in-usb-3-x/1557760454
- [12] "PHY Interface for PCI Express, SATA, USB 3.1, DisplayPort, and Converged IO Architectures," Intel. USA, [Online]. Available: https://www.intel.com/content/dam/www/public/us/en/documents/whitepapers/phy-interface-pci-express-sata-usb30-architectures-3.1.pdf
- [13] Kintex-7 FPGA KC705 Evaluation Kit Getting Started Guide (UG883), AMD. USA, [Online]. Available: https://china.xilinx.com/products/boards-and-kits/ek-k7-kc705-g.html
- [14] HiTech Global. USA, [Online]. Available: http://www.hitechglobal.com/FMCModules/FMC\_USB3.htm
- [15] "USB Protocol Suite User Manual," Teledyne LeCroy. USA, [Online]. Available: https://teledynelecroy.com/protocolanalyzer/usb/advisor-t3