High energy physics has a constant demand for random number generators (RNGs) with high statistical quality. In this paper, we present ROOT's implementation of the RANLUX++ generator. We discuss the choice of relying only on standard C++ for portability reasons. Building on an initial implementation, we describe a set of optimizations to increase generator speed. This allows to reach performance very close to the original assembler version. We test our implementation on an Apple M1 and Nvidia GPUs to demonstrate the advantages of portable code.