Conveners
Algorithms and Machines
- Kate Clark (NVIDIA)
- Patrick Steinbrecher (Brookhaven National Laboratory)
- Ming Gong
- Andrei Alexandru (The George Washington University)
I discuss motivations for generalizing QCD simulations for heterogeneous clusters, and identify possible models of LQCD simulations suitable for a modular supercomputing environment. The Jureca cluster at the Juelich Supercomputing Centre, with Haswell, KNL, and GPU-enabled compute nodes, serves as a test bed for modular supercomputing strategies. I describe initial tests with the MILC code...
We present a preliminary code package designed for the Sunway infrastructure of the TaihuLight supercomputer. Meta-programming and a genetic algorithm on a newly designed virtual machine layer are adopted to investigate the feasibility of a general automatic optimization scheme.
The open-source ROCm platform for GPU computing provides a uniform framework that supports both NVIDIA and AMD GPUs, and makes it possible to port CUDA code to a ROCm-compatible form. We will present progress on porting the overlap fermion inverter (GWU-code), which is based on Thrust, and a general inverter package, QUDA.
With the latest generation of leadership-class machines, lattice QCD simulations are able to probe multi-scale physics with unprecedented resolution. These advancements come with super-linear increases in the costs of modern simulations due to the phenomenon of critical slowing down. In the case of linear solvers for LQCD, the only robust solution to this challenge is the development and...
The ability to strong scale is crucial for Lattice QCD simulations. Since the creation of the QUDA library for Lattice QCD on NVIDIA GPUs, this has always been a key development goal. Techniques like GPUDirect RDMA and NVLink allow for fast intra-node and inter-node data transfer and QUDA makes extensive use of them. However, API overheads and necessary synchronizations between GPU and CPU are...
Hadrons is a free C++ framework based on the high-performance Grid library to implement lattice QCD measurement workflows. It is based on a modular dataflow programming approach to accommodate the heterogeneity of lattice measurements. The different measurement steps (inversions, contractions, I/O …) are implemented as individual modules with inputs and outputs, and a measurement workflow...
We have evaluated perturbative coefficients of Wilson loops up to $O(g^8)$ for the four-dimensional twisted Eguchi-Kawai model using numerical stochastic perturbation theory (NSPT) in arXiv:1902.09847. In this talk we present a progress report on the higher-order calculation up to $O(g^{63})$, for which we apply the fast Fourier transform (FFT) based convolution algorithm to the...
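As an illustration of the FFT-based convolution idea mentioned above: multiplying two truncated power series in the coupling is a convolution of their coefficient lists, which costs $O(N^2)$ when done directly and $O(N \log N)$ with an FFT. The sketch below is a minimal version under simplifying assumptions, not the implementation used in the talk: the coefficients are real scalars rather than matrices, and the function names `fft`, `series_product_fft`, and `series_product_direct` are hypothetical.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT. len(values) must be a power of two.

    With invert=True the conjugate transform is applied; the caller
    is responsible for dividing by the length afterwards.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1.0 if invert else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + twiddle * odd[k]
        out[k + n // 2] = even[k] - twiddle * odd[k]
    return out


def series_product_fft(a, b):
    """Coefficients of a(g) * b(g), truncated at the order of the inputs."""
    order = len(a)
    assert len(b) == order, "both series must have the same truncation order"
    # Zero-pad to a power of two that holds the full linear convolution,
    # so that the cyclic FFT product does not wrap around.
    size = 1
    while size < 2 * order - 1:
        size *= 2
    fa = fft([complex(x) for x in a] + [0j] * (size - order))
    fb = fft([complex(x) for x in b] + [0j] * (size - order))
    product = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # Keep only the orders below the truncation and drop rounding noise
    # in the imaginary part.
    return [(p / size).real for p in product[:order]]


def series_product_direct(a, b):
    """Reference O(N^2) Cauchy product, truncated at the same order."""
    order = len(a)
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(order)]
```

Calling `series_product_fft(a, b)` and `series_product_direct(a, b)` on the same pair of coefficient lists should agree up to floating-point rounding, which gives a simple cross-check of the FFT path.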
The quantum link (or QCD abacus) Hamiltonian was introduced as a classical algorithm representing both gauge and matter fields by single-bit fermion operators in an extra dimension. This formalism is recast for quantum computing, as a Hamiltonian in Minkowski space for real-time qubit simulations. The advantages of pseudo-fermions to implement the Jordan-Wigner transformation and the...
The RBC and UKQCD Collaborations continue to produce 2+1 flavor domain wall fermion ensembles, currently focusing on an ensemble with a $96^3 \times 192$ volume on SUMMIT at ORNL with $1/a = 2.8$ GeV, and smaller ensembles at stronger couplings. The $1/a = 2.8$ GeV ensemble uses the Exact One Flavor Algorithm for the strange quark, along with the Multisplitting Preconditioned Conjugate...
In the papers [Fukuma, Matsumoto, Umeda, arXiv:1705.06097, arXiv:1806.10915], we defined, for a given Markov chain Monte Carlo (MCMC) algorithm, a distance between two configurations that quantifies the difficulty of transition from one configuration to the other. In this talk, we discuss its application to the optimization of parameters in various tempering algorithms. Examples...
Questions about quantum field theories at non-zero chemical potential and/or real-time correlators are often impossible to investigate numerically due to the sign problem. A possible solution to this problem is to deform the integration domain for the path integral in the complex plane. Sampling configurations on these manifolds is challenging. In this talk I will discuss some of these...
Markov Chain Monte Carlo (MCMC) allows efficient estimation of observables in many lattice theories. However, as a critical point in parameter space is approached, typical MCMC algorithms suffer from critical slowing-down: autocorrelation lengths in the chain diverge for all observables, demanding increasingly more computational cost to achieve the same statistical power. In lattice QCD, for...
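To make the notion of critical slowing down concrete, the integrated autocorrelation time $\tau_{\mathrm{int}} = \tfrac{1}{2} + \sum_{t \ge 1} \rho(t)$ measures how many Markov chain steps are needed per effectively independent sample. The sketch below estimates it with a fixed summation window on a toy AR(1) chain, whose exact value is $(1+\rho)/(2(1-\rho))$ and so grows without bound as $\rho \to 1$. The AR(1) chain stands in for a real lattice observable, and the names `ar1_chain` and `integrated_autocorr_time` are hypothetical.

```python
import random


def ar1_chain(rho, length, seed=0):
    """Toy correlated chain x_{t+1} = rho * x_t + Gaussian noise."""
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(length):
        x = rho * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def integrated_autocorr_time(series, max_lag):
    """Estimate tau_int = 1/2 + sum of rho(t) for t = 1..max_lag.

    A fixed window is used for simplicity; production analyses
    usually choose the window automatically.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered) / n
    tau = 0.5
    for lag in range(1, max_lag + 1):
        covariance = sum(
            centered[i] * centered[i + lag] for i in range(n - lag)
        ) / (n - lag)
        tau += covariance / variance
    return tau
```

For example, comparing `integrated_autocorr_time(ar1_chain(0.5, 20000), 50)` with the same call at `rho = 0.9` shows the estimate rising as the chain becomes more strongly correlated, which is the behaviour that drives up the cost per independent configuration.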
We present a method for accelerating topological transitions in the 2D Schwinger model with a compact U(1) gauge field and Wilson fermions by coupling the 2D lattice via a third dimension with open boundary conditions. The fermions live on the central slice. This allows topological charge to flow into the central slice, which maintains an integer-valued winding. The resulting effective action on...
To compute disconnected quark loop operators, stochastic noise methods are generally used. In order to strengthen the physical signal projected out from these noisy methods, various subtraction techniques may be employed. We use the GMRES-DR and MINRES-DR algorithms to solve the linear equations for the non-Hermitian Wilson and Hermitian Wilson matrices, while simultaneously calculating low...
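The stochastic noise methods referred to above rest on the identity $\mathrm{Tr}\,A = \langle \eta^\dagger A \eta \rangle$ for noise vectors with $\langle \eta_i \eta_j^* \rangle = \delta_{ij}$. The following is a minimal sketch with $Z_2$ noise, assuming a small explicit matrix in place of the inverse Dirac operator (in practice $A\eta$ would come from a linear solve) and omitting any subtraction; the names `z2_noise`, `matvec`, and `stochastic_trace` are hypothetical.

```python
import random


def z2_noise(n, rng):
    """Vector of independent +1/-1 entries."""
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def matvec(matrix, vec):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def stochastic_trace(matrix, n_noise, seed=0):
    """Average of eta^T A eta over n_noise Z2 noise vectors."""
    rng = random.Random(seed)
    n = len(matrix)
    total = 0.0
    for _ in range(n_noise):
        eta = z2_noise(n, rng)
        total += sum(e * y for e, y in zip(eta, matvec(matrix, eta)))
    return total / n_noise
```

With $Z_2$ noise the diagonal terms are reproduced exactly because $\eta_i^2 = 1$, so all of the statistical error comes from the off-diagonal elements of $A$. This is why subtracting an approximation of the off-diagonal part can reduce the variance.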
In this talk I will discuss the recently introduced frequency-splitting estimators of quark-line disconnected diagrams in lattice QCD. The evaluation of these diagrams is required for many phenomenologically interesting observables, but suffers from large statistical errors due to the vacuum and the random-noise contributions to their variances. Multi-level integration has the potential to...
In recent years multigrid algorithms have dramatically reduced the cost of generating gauge field ensembles and quark propagators for lattice simulations including light quarks described by the Wilson and Wilson-clover fermion actions. As a result, we have observed in recent calculations of nuclear physics at the physical pion mass that assembling correlation functions from quark propagators...
We investigate the power of machine learning for lattice QCD problems in three setups. First, we used bare gauge field configurations and trained an ML model to calculate the Polyakov loop: trained at two values of beta, it predicts the correct critical value. Second, we used a set of Wilson loops for classification of phases: an ML model trained in SU(2) gives some signal in SU(3). And third, with spatial...
We employ machine learning techniques to estimate the topological charge $Q$ of gauge configurations in SU(3) Yang-Mills theory. As a first trial, four-dimensional convolutional neural networks are trained to estimate the topological charge from the topological charge density on gauge configurations. The value of $Q$ measured by the gradient flow is used for the definition of the correct...
OpenMP is a programming model that has been widely used for multi-threaded computations on multicore and many-core CPUs. However, its support for GPU accelerated computing was not available until OpenMP 4.0. Since then, many new features and capabilities have been added to the OpenMP standard to enable GPU offloading in response to the popularity of GPU computing. In this presentation, we will...