Speaker
Description
Use of declarative languages for HEP data analysis is an emerging, promising approach. One highly developed example is ADL (Analysis Description Language), an external domain-specific language that expresses the physics algorithm of an analysis in a standard and unambiguous way, independent of frameworks. The most advanced infrastructure for executing an analysis written in the formal ADL syntax is the CutLang (CL) runtime interpreter, which is based on traditional parsing tools. CL, which was previously presented at this conference, has been further developed in recent years to cope with most LHC analyses. The new additions include full-fledged histogramming and data-MC comparison facilities, alongside an interface to a number of well-known limit-setting tools.
The ADL/CL architecture has so far been designed and built with a general-purpose programming language, without formal computing expertise, and has grown into a complex monolithic structure. To facilitate maintenance and further development of CL, while making it reusable in other (non-scientific) domains, we designed a protocol called Dynamic Domain Specific eXtensible Language (DDSXL) that modularizes this monolithic structure. The DDSXL protocol provides a set of strict rules that allow each researcher to work in their own area of expertise and to understand the work done elsewhere without expertise in other areas, independent of the programming languages and frameworks used.
DDSXL integrates a domain ecosystem (such as CL) into the development environment through a fully abstract structure built on various OOP design patterns and on a set of rules governing communication over the network. The protocol also accommodates numerous programming languages and frameworks, so each developer can integrate it into their own module without needing expertise in the technologies used by other modules.
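As an illustration of modules exchanging language-neutral messages, the following minimal sketch assumes JSON as the wire format. The envelope fields (`module`, `action`, `payload`), the function names and the toy `tokenize` handler are all hypothetical: the DDSXL specification has not yet been published, so none of this reflects its actual rules. The network hop is replaced by an in-process call to keep the example self-contained.

```python
import json

# Hypothetical message envelope; the real DDSXL wire format is not published.
REQUIRED_FIELDS = {"module", "action", "payload"}


def encode_request(module: str, action: str, payload: dict) -> bytes:
    """Serialize a request into JSON bytes that any language can read."""
    message = {"module": module, "action": action, "payload": payload}
    return json.dumps(message).encode("utf-8")


def decode_request(raw: bytes) -> dict:
    """Parse a request and reject it if a required field is missing."""
    message = json.loads(raw.decode("utf-8"))
    missing = REQUIRED_FIELDS - set(message)
    if missing:
        raise ValueError(f"malformed message, missing fields: {sorted(missing)}")
    return message


def handle_tokenize(payload: dict) -> dict:
    """Toy 'parser' module: split one line of text on whitespace."""
    return {"tokens": payload["text"].split()}


# Registry mapping (module, action) to a handler. In a real deployment each
# handler could live in a separate process written in a different language.
HANDLERS = {("parser", "tokenize"): handle_tokenize}


def dispatch(raw: bytes) -> bytes:
    """Stand-in for the network hop: decode, route, and encode the reply."""
    message = decode_request(raw)
    handler = HANDLERS[(message["module"], message["action"])]
    return json.dumps(handler(message["payload"])).encode("utf-8")


request = encode_request("parser", "tokenize", {"text": "select pT(jet) > 30"})
print(json.loads(dispatch(request)))
```

Because the caller only sees bytes in and bytes out, it needs no knowledge of how, or in which language, the handling module is implemented.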
Here, we introduce the latest developments in ADL/CL, focusing on the working principles of the DDSXL protocol and its integration.
References
ADL/CutLang has been described in several publications, listed below. However, the recent work on the DDSXL protocol proposed for presentation above has not yet been published.
-- Project website: cern.ch/adl (includes all references)
Publications
-- B. Gokturk, A. M. Toon, A. Paul, B. Orgen, N. Ravel, J. Setpal, G. Unel, S. Sekmen, "CutLang V2: towards a unified Analysis Description Language", Frontiers in Big Data, 2021, doi:10.3389/fdata.2021.659986, arXiv:2101.09031
-- G. Unel, S. Sekmen and A.M. Toon, “CutLang: a cut-based HEP analysis description language and runtime interpreter,” J. Phys. Conf. Ser. 1525 (2020) no.1, 012025, doi:10.1088/1742-6596/1525/1/012025, arXiv:1909.10621.
-- S. Sekmen and G. Unel, “CutLang: A Particle Physics Analysis Description Language and Runtime Interpreter,” Comput. Phys. Commun. 233 (2018), 215-236, doi:10.1016/j.cpc.2018.06.023, arXiv:1801.05727.
Proceedings: 12 proceedings including the following 2 for ACAT:
-- ACAT 2021: "Declarative interfaces for HEP data analysis: FuncADL and ADL/CutLang", 29 Nov - 3 Dec 2021, Daejeon, South Korea
-- ACAT 2019: "CutLang analysis description language and runtime interpreter" (poster), 10-15 March 2019, Saas-Fee, Switzerland, G. Unel et al 2020 J. Phys.: Conf. Ser. 1525 012025, https://arxiv.org/abs/1909.10621
Significance
ADL/CutLang aims to tame the complexity of HEP analyses by using a domain-specific language adapted to collider physics. This presentation focuses on a newly proposed protocol that brings a fundamental change to the design of the interpreter infrastructure. The protocol would convert the interpreter's monolithic structure into a more generic setup in which hexagonal architecture design principles are applied. This means that independent blocks, such as parsers and interpretation engines, communicate over the network and can be dynamically replaced or extended. We therefore expect the resulting protocol to be able to handle not only HEP analyses but other data management tasks as well. An example entirely outside the HEP context would be the analysis of insurance data to detect fraud.
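The replaceable-block idea behind a hexagonal (ports and adapters) design can be sketched as follows. This is a minimal illustration under stated assumptions, not the DDSXL or CutLang implementation: the class names `ParserPort`, `WhitespaceParser`, `CommaParser` and `Engine` are hypothetical, and the adapters are plain in-process objects standing in for blocks that would communicate over the network.

```python
from typing import List, Protocol


class ParserPort(Protocol):
    """Port: the only thing the core knows about a parser block."""

    def parse(self, text: str) -> List[str]: ...


class WhitespaceParser:
    """Adapter for whitespace-separated input (hypothetical)."""

    def parse(self, text: str) -> List[str]:
        return text.split()


class CommaParser:
    """Adapter for comma-separated records (hypothetical)."""

    def parse(self, text: str) -> List[str]:
        return [field.strip() for field in text.split(",")]


class Engine:
    """Core logic: depends on the port, never on a concrete adapter."""

    def __init__(self, parser: ParserPort) -> None:
        self.parser = parser

    def count_fields(self, text: str) -> int:
        return len(self.parser.parse(text))


engine = Engine(WhitespaceParser())
print(engine.count_fields("select nJets >= 2"))       # prints 4

# Replace the parser block at run time; the engine code is unchanged.
engine.parser = CommaParser()
print(engine.count_fields("claim_id, amount, date"))  # prints 3
```

The engine is written once against the port, so moving from a physics-style input to, say, tabular insurance records only requires supplying a different adapter.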