Progress on the code:
Documentation and GitHub workflow
- Since a few potential users showed interest in wakis, a strong effort was put into documentation, available here. We also added a preliminary logo!
- A new GitHub workflow was established to always keep a stable version in the main branch. Features are developed in separate branches and then merged into the develop branch, which should always contain the latest changes. Bugfixes and small changes that should be included in main will be added via cherry-pick.
Discussion: Gianni mentions that in Xsuite and PyHEADTAIL they don't use a develop branch because it tends to become messy. Users should not fork from GitHub unless they are developers. We should deploy to PyPI so we have a tagged, stable version for users to install.
GPU acceleration and memory optimization
- Implemented GPU acceleration seamlessly using cupy and cupyx. This provided a speedup of a factor of ~120 vs CPU for a case of 20 million cells: on GPU it takes 20 minutes, while on CPU it would take 40 h. We found that the new bottleneck for heavier simulations is memory allocation on the GPU.
- To reduce memory fluctuations, an entire refactor of the Field class was made, changing the allocation of the electromagnetic fields and material tensors from 3D matrices to 1D arrays. This refactor does not affect the rest of the code, since Python magic methods let us customize how the 1D data is accessed (setters, getters, mul, div...); a minimal illustrative sketch is shown below. This big effort paid off: memory allocation is now constant during the time-stepping sequence, which considerably reduces GPU and CPU crashes and also improves speed by 12%.
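A minimal sketch of the two points above (CPU/GPU backend selection with numpy/cupy, and 1D storage behind 3D-style access via magic methods); the class name Field1D and its interface are hypothetical illustrations, not the actual wakis implementation:

```python
import numpy as np
try:
    import cupy as cp          # optional GPU backend
except ImportError:
    cp = None

class Field1D:
    """Illustrative field container: the (Nx, Ny, Nz) field is stored as one
    flat 1D array, on CPU (numpy) or GPU (cupy), while magic methods keep the
    familiar 3D-style access."""

    def __init__(self, Nx, Ny, Nz, use_gpu=False):
        self.xp = cp if (use_gpu and cp is not None) else np
        self.shape = (Nx, Ny, Nz)
        # one flat allocation -> constant memory footprint during time-stepping
        self.data = self.xp.zeros(Nx * Ny * Nz)

    def __getitem__(self, key):
        # reshape of a contiguous buffer returns a view, so no copy is made
        return self.data.reshape(self.shape)[key]

    def __setitem__(self, key, value):
        self.data.reshape(self.shape)[key] = value

    def __mul__(self, other):
        out = Field1D(*self.shape, use_gpu=self.xp is cp)
        out.data = self.data * (other.data if isinstance(other, Field1D) else other)
        return out

# usage: 3D-style indexing backed by the 1D buffer
Ez = Field1D(10, 10, 10)
Ez[2, 3, 4] = 1.0
print(Ez[2, 3, 4], Ez.data.sum())   # -> 1.0 1.0
```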
Discussion: Gianni mentions that for especially heavy simulations there are GPUs with a much bigger memory capacity (A series) available in HTCondor. These could be requested for special cases, but normal simulations must run on CPU, since GPUs are a scarce resource. Lorenzo suggests we should maybe think about multi-processing, dividing the domain into several subdomains to send to different CPU nodes. Programming-wise it is affordable, but one needs to figure out how to terminate the domain for the submatrices and how to divide the curl operator. Elena mentions this is also related to symmetries, which haven't been implemented yet in the code; we need to understand how these affect the matrix operators. Carlo adds that 20M cells is already a quite heavy simulation, but FCC simulations may get even heavier for complicated geometries.
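As a reference for the multi-processing idea discussed above, a very rough sketch of a 1D domain split along z with ghost-layer exchange via mpi4py; all names and sizes are illustrative, and the open questions raised above (terminating the subdomains, splitting the curl operator) are not addressed:

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# split the longitudinal (z) direction into equal slabs, one per rank
Nx, Ny, Nz_total = 64, 64, 256
Nz = Nz_total // size
Ez = np.zeros((Nx, Ny, Nz + 2))          # +2 ghost layers (z-lo, z-hi)

up   = rank + 1 if rank < size - 1 else MPI.PROC_NULL   # neighbour at higher z
down = rank - 1 if rank > 0        else MPI.PROC_NULL   # neighbour at lower z

def exchange_ghosts(F):
    """Fill the ghost layers with the neighbours' interior layers so that the
    finite differences of the curl can be evaluated at the slab edges."""
    recv_lo, recv_hi = np.empty((Nx, Ny)), np.empty((Nx, Ny))
    # send my top interior layer up, receive the lower neighbour's into my z-lo ghost
    comm.Sendrecv(np.ascontiguousarray(F[:, :, -2]), dest=up,
                  recvbuf=recv_lo, source=down)
    # send my bottom interior layer down, receive the upper neighbour's into my z-hi ghost
    comm.Sendrecv(np.ascontiguousarray(F[:, :, 1]), dest=down,
                  recvbuf=recv_hi, source=up)
    if down != MPI.PROC_NULL:
        F[:, :, 0] = recv_lo
    if up != MPI.PROC_NULL:
        F[:, :, -1] = recv_hi

# inside the time-stepping loop one would call exchange_ghosts(Ez) (and likewise
# for the other field components) before applying the curl update on interior cells
```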
Testing low relativistic beta simulations
- To understand the basics of space charge, we simulated a square PEC pipe in both CST and wakis. The pipe is below cutoff, so PML and PEC boundaries give the same results. The impedance should be ~0 for beta = 1, and purely reactive (imaginary) for beta < 1.
- For beta = 1, when we inject the beam (a line current Jz in the z direction), we see field perturbations at the boundaries of the domain. These perturbations affect the wake potential, creating a delta in the wake potential that gives an undesired positive real impedance. They can be removed from the wake potential calculation by enlarging the domain or by skipping cells inside the computational domain, which is controlled in wakis with the parameter add_space: add_space = 30 gives results similar to CST, and add_space = 50 gives a result closer to 0 impedance, as desired.
- For beta < 1, contrary to the previous case, increasing the add_space parameter makes the result differ more and more from CST; for an almost perfect agreement with CST, add_space = 0 is needed. If add_space is used, the error in the wake potential is equal to the error in impedance, proving that the wakis algorithm is sound. The injection time in CST changes with beta, following the empirically derived formula $t_{inj} = \frac{8.54\,\sigma_z}{c\,\beta^{3/2}}$ (a small numerical check is given after this list). When applying this injection time in wakis, the results match perfectly.
- With these findings, a model to automatically subtract the indirect space charge from the impedance could be built, avoiding the lengthy simulation of the bounding box.
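A tiny numerical check of the injection-time formula quoted above; the bunch length value is illustrative:

```python
import numpy as np

c = 299_792_458.0          # speed of light [m/s]

def t_injection(sigma_z, beta):
    """Empirical CST injection time from the bullet above: t_inj = 8.54*sigma_z / (c*beta**1.5)."""
    return 8.54 * sigma_z / (c * beta**1.5)

sigma_z = 10e-2            # 10 cm bunch length (illustrative value)
for beta in (1.0, 0.5, 0.1):
    print(f"beta={beta:4.2f}  t_inj={t_injection(sigma_z, beta)*1e9:8.3f} ns")
```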
Discussion: Carlo mentions that this is a very important result that deepens our understanding of space charge and low-beta simulations. Lorenzo wonders why the perturbation occurs in the first place, since we are using PEC cells and the field at the boundaries should be zero. Elena clarifies that it comes from the continuity equation: since we don't correct the divergence, the perturbation starts as soon as the current is > 0. Gianni asks if the current is injected by forcing a value on the line, and Elena confirms. Gianni wonders if a different approach could be followed that gives zero perturbation. Elena mentions that no Gaussian line current can propagate into the domain from the boundary; this works for plane waves and Gaussian wave packets, which are only initialized at the boundary, but the current does not propagate and has to be enforced. Lorenzo mentions that in CST there are probably some initial values at the boundary that diminish the perturbation. An approach like the current damper can be applied if we inject at the PML. Carlo mentions that for now we can skip the cells in the domain, but this is a temporary solution that can become computationally costly for more complicated simulations. An idea is to break the simulation in two: first the source excitation in a larger domain, and then the time-stepping without the current in a shorter domain, copying the field results over. To be investigated.
First PML implementation
- The PML formulation can be derived entirely from the Fresnel and Snell laws. Elena has worked on the mathematical derivation, available here. In summary, the implemented PML is an anisotropic conductivity at the boundary that grows inside the PML following a graded profile. Each layer of the PML has to be matched impedance-wise to avoid reflections, and an increasingly high conductivity is desired to prevent the field from reflecting at the last layer (PEC) back into the domain. This is implemented in the code by modifying the conductivity tensor $\sigma$ at the last 10 cells of the domain where PML is chosen; a new $\sigma^*$ tensor needs to be added for the magnetic field. It does not affect the computation time, since everything is pre-computed. The conductivity profile chosen (e.g., for the x direction) is given below, followed by a short illustrative sketch:
$\sigma_x = \frac{\varepsilon_0}{2\Delta t}\left(\frac{x_{N_{pml}} - x_{0_{pml}}}{N_{pml}\,\Delta x}\right)^n$
$\sigma_x^{*} = \mu_0\,\sigma_x$
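A short illustrative sketch of how such a graded conductivity could be filled in for the last cells at, e.g., the x-high boundary; the grading and all numerical values are assumptions for illustration, not the actual wakis implementation:

```python
import numpy as np

eps_0 = 8.8541878128e-12   # vacuum permittivity
mu_0  = 4e-7 * np.pi       # vacuum permeability

Nx, N_pml, n = 100, 10, 2      # cells in x, PML layers, grading exponent (illustrative)
dx, dt = 1e-3, 1e-12           # illustrative cell size and timestep
x = np.arange(Nx) * dx

sigma_x = np.zeros(Nx)
pml = slice(Nx - N_pml, Nx)    # last N_pml cells of the domain (x-high PML)

# graded profile growing towards the outer (PEC) layer; the prefactor is the
# stability limit eps_0/(2*dt) mentioned in the Berenger test below
depth = (x[pml] - x[Nx - N_pml]) / (N_pml * dx)   # 0 at the PML entrance -> ~1 at the last layer
sigma_x[pml] = eps_0 / (2 * dt) * depth**n

# matched magnetic conductivity for the H-field update
sigma_x_star = mu_0 * sigma_x
```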
- To test the PML, the Berenger test was used. The test consists of a reference simulation of a Harris pulse that only has a positive Ez field, and a test simulation with the domain halved, so the pulse collides with the x-low boundary. Subtracting the test from the reference simulation, one can obtain the reflection from the boundary (a minimal sketch of this estimate is given after this list). Originally this is a 2D simulation, and it was adapted to a 3D wakis simulation by changing the Harris pulse from a point source to a line source extending in z and using PEC boundaries at z-lo and z-hi.
- All the available boundary conditions were tested: PEC gives 99.9% reflection, ABC gives 30% reflection, and periodic gives no reflection but the wave re-enters from x-high. A bug was found in the ABC at x-hi and y-hi: they do not work as expected and give >100% reflection. A first 10-layer PML was tested, giving 70% reflection. While trying to improve its performance, a stability limit for the PML was found: in the PML region, the conductivity cannot be higher than $\varepsilon/(2\Delta t)$, where $\Delta t$ is the simulation timestep. To increase the conductivity further, one should increase epsilon in the PML region. The performance of the PML has to be further studied and optimized.
- Besides the Berenger test, the PML was tested in a simulation of a PEC cubic cavity with modes above cutoff. Comparing the wakis results with PEC and PML boundaries against CST, the results almost match the values of the modes above cutoff. Looking at the E-field, the field goes to 0 at the boundaries thanks to the PML, but the injection perturbation gets worse. The PML was further tested on the worst-case scenario of a step-out transition, in which the PEC boundaries create an artificial resonating structure. With the PML, the results match the CST amplitude, but a frequency shift can be observed, possibly coming from the geometry. To be further optimized and tested.
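A minimal sketch of the reference-minus-test reflection estimate used in the Berenger test above, assuming the Ez field has been recorded at a probe point in both runs; array names are illustrative:

```python
import numpy as np

def reflection_coefficient(Ez_test, Ez_ref):
    """Relative reflection from the boundary: amplitude of the spurious field
    (test minus reference) normalised to the incident pulse amplitude."""
    spurious = np.asarray(Ez_test) - np.asarray(Ez_ref)
    return np.max(np.abs(spurious)) / np.max(np.abs(Ez_ref))

# illustrative usage with two recorded probe signals:
# Ez_ref  = ...  # probe signal from the large reference domain
# Ez_test = ...  # probe signal from the halved domain with the boundary under test
# print(f"reflection: {100 * reflection_coefficient(Ez_test, Ez_ref):.1f} %")
```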
Discussion: Lorenzo comments that it is very good that the perturbation vanishes to zero and does not stay inside the PML. One could try to apply the current damping algorithm used in WarpX. The fact that we understood the stability limit is also good. Gianni comments that this is not in a production state, but it is going in the right direction. He also points out that the bug of the ABC at the high boundaries should be studied under the same conditions as the low ones, to narrow down the origin of the bug. Carlo mentions that analytic formulas are available for the transition impedance and could be a further benchmark for wakis. Elena comments that she found a paper with the full analytical formula, not only the asymptotic behavior, but it is quite convoluted to implement. Carlo mentions that the case of a step-in transition should be studied too.
Future plans and discussion:
Still many things to do:
- Good conductors with the Leontovich condition (recalled after this list)
- Staircased geometry, grid refinement with a non-uniform grid
- Moving window for short bunches
- Frequency-dependent materials
- Frequency-domain monitors
- TESTS...
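For reference, the Leontovich (surface impedance) boundary condition mentioned in the first item above relates the tangential electric field to the magnetic field at the surface of a good conductor:

$\mathbf{E}_t = Z_s\,(\hat{n}\times\mathbf{H}), \qquad Z_s = (1+i)\sqrt{\frac{\omega\mu}{2\sigma}} = \frac{1+i}{\sigma\delta}, \qquad \delta = \sqrt{\frac{2}{\omega\mu\sigma}},$

where $\hat{n}$ is the surface normal pointing from the conductor into the computational domain and $\delta$ is the skin depth.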
Discussion: Carlo comments that, from the strategic point of view, the moving window could be the best option to implement, since it is being requested for short-range, short-bunch simulations. People are switching to ECHO3D, a code by Igor Zagorodnov, that includes the moving window to speed up the simulation. This is ideal for collimators and FCC impedances. Gianni states that for him the priority is to have a stable version deployed on PyPI with automatic tests. He explains the advantages of adopting the pragmatic programming approach: it slows down development in the early stages but ensures good-quality, manageable code that allows continuous growth in the later stages. With the approach followed until now, without integrated tests, the code will become unmanageable when it grows bigger, hampering the development of new features. The strategy adopted means stopping developments for a few months to write the tests. As for the moving window implementation, Lorenzo says that it should not be a big programming effort, but the speed advantage is not guaranteed, since we have to modify all the material tensors every timestep. Elena mentions that the moving window changes the reference frame from t to s, and the wake potential calculation algorithm will need modifications. Carlo adds that a good idea to show that wakis can already be used is to have a complex structure simulated; for example, the wire scanner could be a candidate. Gianni comments that the idea is to start having users that use the code independently, only needing the documentation.
Next steps: write tests, restructure the code into a package and deploy it to PyPI as a priority, while continuing to optimize the PML and investigating new areas of development. Ideally, before ICAP, the package should be deployed and we should have a complex simulation (e.g., the wire scanner) to show at the conference.