# A Low-Complexity MLSE Algorithm for the NRZ High-Speed Transceivers



Dongwei Zou (dwzou@mail.ustc.edu.cn), Kezhu Song\* (skz@ustc.edu.cn), Xiangshi Zhong, Chengyang Zhu

State Key Laboratory of Particle Detection and Electronics, University of Science and Technology of China

Abstract

This paper proposes a low-complexity maximum likelihood sequence estimation (MLSE) algorithm tailored for non-return-to-zero (NRZ) high-speed transceivers. In particle physics experiments, data transmission volumes are continually increasing, and transceivers play a pivotal role. MLSE has attracted significant attention because it removes inter-symbol interference (ISI) effectively and can replace decision feedback equalizers (DFEs). However, MLSE complexity grows exponentially with traceback length and equalizer order, so reducing it while preserving performance is imperative. This paper simplifies the MLSE transition metric calculations, removing the need for intricate state computations and result storage. A configurable and highly adaptable transceiver simulation system is developed on a field programmable gate array (FPGA), and the proposed algorithm is evaluated on this system. Synthesis results show that the proposed algorithm significantly reduces resource utilization while maintaining performance.

### **Traditional MLSE**

In communication systems, the feed-forward equalizer (FFE) is a linear equalizer whose complexity scales linearly with the channel order. The decision feedback equalizer (DFE) and MLSE are both nonlinear equalizers, and DFE has the lower complexity of the two. However, DFE suffers from significant error propagation. In terms of equalization performance, MLSE is currently considered the optimal choice at the receiver.

$$\begin{split}
tm_{1,1}[n] &= \alpha - x \left(1 + \alpha\right) \\
tm_{1,-1}[n] &= -\alpha - x \left(-1 + \alpha\right) \\
tm_{-1,1}[n] &= -\alpha - x \left(1 - \alpha\right) \\
tm_{-1,-1}[n] &= \alpha + x \left(1 + \alpha\right) \\
sm_{1}[n+1] &= \min\left(sm_{1}[n] + tm_{1,1}[n],\ sm_{-1}[n] + tm_{-1,1}[n]\right) \\
sm_{-1}[n+1] &= \min\left(sm_{1}[n] + tm_{1,-1}[n],\ sm_{-1}[n] + tm_{-1,-1}[n]\right)
\end{split}$$



Fig. 2. Trellis diagram for NRZ signaling.

Fig. 3. Block diagram of the MLSE hardware circuit.

As shown in Fig. 3, the MLSE hardware circuit consists of the transition metric unit (TMU), the add-compare-select unit (ACSU), the overflow protection unit (OPU), the survivor memory unit (SMU), and the channel estimation unit (CEU).

For each state, the Viterbi algorithm adds each candidate transition metric (*tm*) to the state metric (*sm*) of the state that transition departs from, and keeps the minimum of these sums as that state's next state metric.
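The add-compare-select recursion can be sketched in Python as follows. This is a minimal illustration, not the hardware design of Fig. 3. It assumes that `x` is the received sample at time *n*, that `alpha` is the channel's ISI coefficient, that both state metrics start at zero, and that traceback runs over the whole block rather than over a fixed traceback length. The function names are hypothetical.

```python
def transition_metrics(x, alpha):
    """Transition metrics tm[(prev, cur)] for one received sample x."""
    return {
        (1, 1): alpha - x * (1 + alpha),
        (1, -1): -alpha - x * (-1 + alpha),
        (-1, 1): -alpha - x * (1 - alpha),
        (-1, -1): alpha + x * (1 + alpha),
    }


def acs_step(sm, x, alpha):
    """One add-compare-select step.

    sm maps each state (+1 or -1) to its state metric.
    Returns (new_sm, survivors); survivors[cur] is the chosen predecessor.
    """
    tm = transition_metrics(x, alpha)
    new_sm, survivors = {}, {}
    for cur in (1, -1):
        # Add: candidate metric through each predecessor state.
        candidates = {prev: sm[prev] + tm[(prev, cur)] for prev in (1, -1)}
        # Compare and select: keep the smaller sum and remember its origin.
        best_prev = min(candidates, key=candidates.get)
        new_sm[cur] = candidates[best_prev]
        survivors[cur] = best_prev
    return new_sm, survivors


def viterbi_nrz(samples, alpha):
    """Decode a block of samples into +1/-1 decisions (full-block traceback)."""
    sm = {1: 0.0, -1: 0.0}  # assumption: no prior knowledge of the start state
    history = []
    for x in samples:
        sm, survivors = acs_step(sm, x, alpha)
        history.append(survivors)
    # Trace back from the best final state.
    state = min(sm, key=sm.get)
    decisions = []
    for survivors in reversed(history):
        decisions.append(state)
        state = survivors[state]
    decisions.reverse()
    return decisions
```

For a noise-free test signal built as `x[n] = a[n] + alpha * a[n-1]`, `viterbi_nrz` should return the transmitted symbols `a[n]`.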

### **Proposed MLSE**

Based on the principle of MLSE, this paper derives a new update expression. Its base variable is $\Delta T$, and *J* denotes $\Delta T[n-1] + \alpha x[n]$.

$$\Delta T[n] = \begin{cases} x[n] - \alpha & \alpha \le J \\ -\Delta T[n-1] + (1-\alpha)x[n] & -\alpha \le J < \alpha \\ x[n] + \alpha & J < -\alpha \end{cases}$$

This approach significantly reduces the cumulative computations: each symbol requires only comparing *J* against the thresholds $\pm\alpha$ and evaluating one short expression.
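The recursion can be checked numerically against the traditional state metric update. The sketch below rests on an interpretation the text does not state explicitly: that $\Delta T$ equals half the difference between the two state metrics, $(sm_{-1} - sm_{1})/2$, and that $\alpha \ge 0$. Under those assumptions, substituting the traditional equations yields the three cases above. The function names are hypothetical.

```python
def delta_t_step(dt_prev, x, alpha):
    """One update of the proposed recursion; returns Delta T[n]."""
    j = dt_prev + alpha * x
    if j >= alpha:
        return x - alpha
    if j >= -alpha:
        return -dt_prev + (1 - alpha) * x
    return x + alpha


def traditional_step(sm_p, sm_m, x, alpha):
    """Reference update of the state metrics for states +1 (sm_p) and -1 (sm_m)."""
    new_p = min(sm_p + alpha - x * (1 + alpha), sm_m - alpha - x * (1 - alpha))
    new_m = min(sm_p - alpha - x * (-1 + alpha), sm_m + alpha + x * (1 + alpha))
    return new_p, new_m


def max_mismatch(samples, alpha):
    """Largest gap between Delta T and (sm_m - sm_p) / 2 over a sample block."""
    sm_p = sm_m = 0.0  # assumption: equal initial state metrics, so Delta T starts at 0
    dt = 0.0
    worst = 0.0
    for x in samples:
        sm_p, sm_m = traditional_step(sm_p, sm_m, x, alpha)
        dt = delta_t_step(dt, x, alpha)
        worst = max(worst, abs(dt - (sm_m - sm_p) / 2))
    return worst
```

If the interpretation holds, `max_mismatch` stays at floating-point rounding level for any input block. The single running value `dt` then replaces the two accumulated state metrics and the four transition metrics.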





TABLE I. Resource utilization from Quartus.

| Resource               | Proposed MLSE | Traditional MLSE | Reduction |
|------------------------|---------------|------------------|-----------|
| ALMs                   | 904.2         | 1490.3           | 39.32 %   |
| Combinational ALUTs    | 630           | 1114             | 43.45 %   |
| Registers              | 1415          | 2304             | 38.59 %   |
| DSP Blocks             | 4             | 12               | 66.67 %   |
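The Reduction column appears to follow (traditional − proposed) / traditional. The short sketch below recomputes it from the two utilization columns, assuming that definition. Because the table values are already rounded, a recomputed figure may differ from the published one in the last digit.

```python
def reduction_percent(proposed, traditional):
    """Relative saving of the proposed design, in percent."""
    return 100.0 * (traditional - proposed) / traditional


# (proposed, traditional) pairs copied from Table I.
TABLE_I = {
    "ALMs": (904.2, 1490.3),
    "Combinational ALUTs": (630, 1114),
    "Registers": (1415, 2304),
    "DSP Blocks": (4, 12),
}

reductions = {
    name: reduction_percent(proposed, traditional)
    for name, (proposed, traditional) in TABLE_I.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.2f} %")
```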

A customizable transceiver simulation system was used to validate the low-complexity MLSE algorithm. It runs on the Intel Agilex 7 FPGA I Series Transceiver-SoC Development Kit (model number DK-SI-AGI027FB), whose FPGA is the Intel Agilex 7 AGIB027R31B1E1VAA.

### **Conclusion**

The proposed low-complexity MLSE substantially reduces resource utilization, which eases practical implementation: depending on the resource type, utilization falls by 38.59 % to 66.67 % (Table I). Power consumption is also halved relative to the traditional MLSE. The test results show that these savings come with no loss of performance and no additional bit errors.



### **References and Acknowledgments**

[1] J. G. Proakis et al., Digital Communications. McGraw-Hill, 2008.

[2] Q. Chen et al., "A 14-Gb/s VCSEL driver in 65-nm CMOS with a power-efficient driving structure for particle physics experiments," IEEE Transactions on Nuclear Science, vol. 70, no. 6, pp. 1001–1006, 2023.

[3] X. Niu et al., "Design and simulation of HiGBt, a 5 Gb/s SerDes for heavy-ion physics experiments," IEEE Transactions on Nuclear Science, vol. 70, no. 6, pp. 1083–1089, 2023.

[4] B. Deng et al., "GBT20, a 20.48 Gbps PAM4 optical transmitter module for particle physics experiments," Journal of Instrumentation, vol. 18, no. 2, p. C02065, 2023.

[5] D. Zeolla et al., "DFE versus MLSE electronic equalization for gigabit/s SI-POF transmission systems," IEEE Photonics Technology Letters, vol. 23, no. 8, pp. 510–512, 2011.

[6] G. D. Forney, "The Viterbi algorithm," Proceedings of the IEEE, vol. 61, no. 3, pp. 268–278, 1973.

This work is supported by the China National Key Research and Development Program under Grant 2022YFF0706800 and Grant 2022YFF0706804.