Georg Weidenspointner (MPE Garching)
An extensively documented, quantitative study of software evolution resulting in a deterioration of physical accuracy over the years is presented. The analysis concerns the energy deposited by electrons in various materials, as simulated by Geant4 versions released between 2007 and 2013. The evolution of the functional quality of the software is objectively quantified by means of a rigorous statistical analysis, which combines goodness-of-fit tests and methods of categorical data testing to validate the simulation against high-precision experimental measurements. Significantly lower compatibility with experiment is observed for the later Geant4 versions evaluated; the significance level of the test is 0.01. Various issues related to the complexity of appraising the evolution of software functional quality are considered, such as its dependence on the experimental environment in which the software operates and its sensitivity to detector characteristics. Methods and techniques to mitigate the risk of “negative improvements” are critically discussed: they concern various disciplines of the software development process, including not only testing and quality assurance, but also domain decomposition, software design and change management. Concrete prototype solutions are presented. This study is intended as a constructive contribution to identifying possible causes of the deterioration of software functionality, and means to address them effectively. It is especially relevant to the HEP software environment, where widely used tools and experiments’ software are expected to sustain long life cycles and are necessarily subject to evolution.
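
The categorical step of such an analysis can be illustrated with a minimal sketch. It assumes that each test case (one material and beam configuration) has already been classified as "pass" or "fail" by a goodness-of-fit test against the experimental data, and that two software versions are then compared on their pass/fail counts. The abstract does not state which categorical test is used; Fisher's exact test is chosen here purely for illustration, and the function names and the counts are hypothetical, not results of the study.

```python
from math import comb

ALPHA = 0.01  # significance level quoted in the abstract


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are software versions, columns are pass/fail counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x passes in the first row,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


def versions_differ(pass_old, fail_old, pass_new, fail_new, alpha=ALPHA):
    """Return True if the pass/fail proportions differ at level alpha."""
    return fisher_exact_two_sided(pass_old, fail_old, pass_new, fail_new) < alpha


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not taken from the study).
    p_value = fisher_exact_two_sided(40, 10, 25, 25)
    print(f"p-value = {p_value:.4g}, differ at 0.01: {p_value < ALPHA}")
```

A rejection of the null hypothesis at the 0.01 level would indicate that the two versions differ in their compatibility with experiment. The direction of the difference is then read from the pass fractions themselves.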