unelg/CutLang

This is the repository for CutLang: A Particle Physics Analysis Description Language Runtime Interpreter. CutLang is a domain-specific language and interpreter for cut-based HEP data analysis. It allows users to write analyses in ADL (Analysis Description Language) files, which are then interpreted by the CutLang framework at run time. The interpreter is implemented in C++ and is built on top of ROOT, the CERN data analysis framework. CutLang offers several features to make data analysis more efficient and less error-prone, including object definitions, event selections, histogramming, and Monte Carlo weighting. It also supports multi-core/multi-CPU hardware and can save events at any stage of the analysis. The latest version, CutLang V3, uses a Lex/Yacc-based approach for ADL file processing and has several enhancements over the previous version, including improved handling of object combinatorics, the ability to include tables and weights, and support for more complex algorithms. By providing a standard, human-readable way to write and interpret analyses, CutLang and ADL aim to advance the field of HEP data analysis.

What is ADL?

ADL (Analysis Description Language) is a domain-specific language used to describe and implement analyses in high-energy particle physics experiments. ADL allows users to write HEP analyses in a clear and easily readable format. It is independent of any specific computing framework, which makes it easier to share and compare analyses between different users and experiments. It is used to define the criteria for selecting events of interest and to specify how the selected events should be processed (e.g. by calculating certain variables or applying specific cuts). ADL is typically used together with a Monte Carlo simulation that generates event samples and a data analysis framework that processes the events and produces results. You can read more about it here.
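To give a feel for what such a description looks like, here is a minimal sketch in Python that writes a small ADL file to disk. The ADL text follows the commonly documented block structure (an object block with take/select, and a region block with select), but the object and region names, the cut values, and the file name example.adl are hypothetical, and the exact function spellings accepted may depend on your CutLang version. Treat the tutorial ADL files as the reference.

```python
from pathlib import Path

# Illustrative ADL text. The names goodJets and preselection and all cut
# values are hypothetical; the exact syntax accepted by your CutLang version
# should be checked against the tutorial ADL files.
EXAMPLE_ADL = """\
object goodJets
  take Jet
  select pT(Jet) > 30
  select abs(eta(Jet)) < 2.4

region preselection
  select size(goodJets) >= 2
  select MET > 200
"""


def write_example_adl(path="example.adl"):
    """Write the illustrative ADL text to `path` and return it as a Path."""
    target = Path(path)
    target.write_text(EXAMPLE_ADL)
    return target


if __name__ == "__main__":
    print("Wrote", write_example_adl())
```

The object block defines a filtered collection of jets, and the region block lists the event-level cuts that are applied in order.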

Auto-Generated Graph of an ADL Analysis (using Graphviz)

Installing CutLang

CutLang is available on Linux, macOS, and Windows (partially).

from Source

Available on Linux, macOS, and Windows

▷ Setup

Dependencies:

  • ROOT6
  • command line compilation utilities (make, gcc, g++...)
  • flex
  • bison (without installing flex and bison, the make command gets interrupted by a fatal error).
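Because a missing flex or bison only shows up as a fatal error in the middle of make, it can help to check for the tools beforehand. The following is a minimal sketch, not part of CutLang: it assumes that ROOT6 can be detected through its root-config helper and that the other tools are on PATH under their usual names.

```python
import shutil

# Tools taken from the dependency list above. "root-config" is ROOT's
# configuration helper and is used here as a proxy for a ROOT6 installation
# (an assumption of this sketch).
REQUIRED_TOOLS = ["root-config", "make", "gcc", "g++", "flex", "bison"]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing build dependencies:", ", ".join(missing))
    else:
        print("All build dependencies were found on PATH.")
```

The `which` parameter exists only so that the lookup can be replaced in tests; in normal use the default is sufficient.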

Set up the package using:

  git clone https://github.com/unelg/CutLang.git
  cd CutLang
  source setup.sh
  # if you want, you can run
  # echo "source /path/to/CutLang/setup.sh" >> ~/.bashrc
  # to keep it active all the time
  CLA_compile # this will run the following in order
  # cd $CUTLANG_PATH/CLA
  # make -j

You can now run CutLang (see the Running CutLang section).

▷ Update

Update the package using:

  cd CutLang
  git pull
  CLA_recompile # this will run the following in order
  # cd $CUTLANG_PATH/CLA
  # make clean
  # make -j

▷ Remove

Remove the package using:

  rm -rf /path/to/CutLang

with Conda

Available on Linux and macOS

▷ Setup

Dependencies:

  • Conda

Create and activate the environment using:

 conda create -c conda-forge -c cutlang --name <my-environment> cutlang # download CutLang and create environment
 conda activate <my-environment> # activate environment

You can now run CutLang (see the Running CutLang section).

▷ Update

Update the environment using:

 conda create -c conda-forge -c cutlang --name <my-environment> cutlang # remove the existing environment and install the latest version
 # or just
 conda update -c conda-forge -c cutlang cutlang # run in environment with cutlang installed

 # or force update (temporary, do not use unless necessary)
 CLA_conda_update

▷ Remove

Remove the environment using:

  conda deactivate # run inside <my-environment>; deactivate takes no environment name
  conda env remove --name <my-environment>

with Docker

Available on Linux, macOS, and Windows

▷ Setup

Dependencies:

  • Docker

After installing Docker, download the image and run the container using:

  docker run -p 8888:8888 -p 5901:5901 -p 6080:6080 -d -v $PWD/:/src --name CutLang-root-vnc cutlang/cutlang-root-vnc:latest 
  # To re-run with a different mounted directory, first stop and remove the container:
  # docker stop CutLang-root-vnc && docker container rm CutLang-root-vnc
  # then run it again with a different path:
  # docker run -p 8888:8888 -p 5901:5901 -p 6080:6080 -d -v /path/you/want/:/src ...
  # For example:
  # docker run -p 8888:8888 -p 5901:5901 -p 6080:6080 -d -v ~/example_work_dir/:/src --name CutLang-root-vnc cutlang/cutlang-root-vnc:latest

For Windows:

 docker run -p 8888:8888 -p 5901:5901 -p 6080:6080 -d -v %cd%/:/src --name CutLang-root-vnc cutlang/cutlang-root-vnc:latest
 # To re-run with a different mounted directory, first stop and remove the container:
 # docker stop CutLang-root-vnc && docker container rm CutLang-root-vnc
 # then run it again with a different path:
 # docker run -p 8888:8888 -p 5901:5901 -p 6080:6080 -d -v /path/you/want/:/src ...
 # For example:
 # docker run -p 8888:8888 -p 5901:5901 -p 6080:6080 -d -v ~/example_work_dir/:/src --name CutLang-root-vnc cutlang/cutlang-root-vnc:latest

Enter the container by running docker exec -it CutLang-root-vnc bash

If you have installed the container successfully, you will see:

For examples see /CutLang/runs/
and for LHC analysis implementations, see
https://github.com/ADL4HEP/ADLLHCanalyses

Now, the container is ready to run CutLang. You can leave the container by typing exit on the command line.

▷ Update

In case an update is necessary, you can perform the update as follows:

docker pull cutlang/cutlang-root-vnc:latest
docker stop CutLang-root-vnc && docker container rm CutLang-root-vnc
docker run -p 8888:8888 -p 5901:5901 -p 6080:6080 -d -v $PWD/:/src --name CutLang-root-vnc cutlang/cutlang-root-vnc

▷ Remove

Remove the Docker container and image using:

docker stop CutLang-root-vnc
docker ps -a | grep "CutLang-root-vnc" | awk '{print $1}' | xargs docker rm
docker images -a | grep "cutlang-root-vnc" | awk '{print $3}' | xargs docker rmi

with Jupyter

Available on Linux, macOS, and Windows

⚠️ In order to run CutLang in Jupyter, you must first complete the setup from source or with Conda or Docker.

▷ Setup

  • You should have completed the CutLang setup and be able to run the CLA command without any problems.

▷ Starting

Start Jupyter with the "ROOT c++ with CutLang" kernel in your current directory:

  CLA_Jupyter lab
  # or
  CLA_Jupyter notebook
  # Jupyter will start; open the 127.0.0.1:8888/... link printed in the logs

▷ Jupyter CutLang Magic

CutLang can be used in Jupyter notebooks with ROOT.

  • You can learn how to use ROOT notebooks from the link.
  • You can also run CutLang in any cell with the cell magic below (for details, see the Tutorial section):
%%cutlang file=<root-file-name> filetype=<root-file-type> ...

Running CutLang

CutLang can be run from any directory using either the CLA shell script or the CLA.py script.

 CLA (or CLA.py) [inputrootfile] [inputeventformat] -i [adlfilename.adl] -e [numberofevents]
 # You can also start simultaneous processes, which can speed up the analysis considerably.
 CLA (or CLA.py) [inputrootfile] [inputeventformat] -i [adlfilename.adl] -e [numberofevents] -j 0
 # With -j 0, CutLang starts as many processes as there are processor cores. Use a non-zero value to set the number of processes yourself.
 # For example:
 # CLA (or CLA.py) [inputrootfile] [inputeventformat] -i [adlfilename.adl] -e [numberofevents] -j 8
 # starts 8 simultaneous processes.
  • Input event formats can be: DELPHES, CMSNANO, LHCO, FCC, ATLASVLL, ATLASOD, CMSOD, VLLBG3 and LVL0 (CutLang internal format)
  • The number of events is optional.

The output is saved in histoOut-[adlfilename].root. This ROOT file has a separate directory for each search region, which contains the relevant histograms and the ADL content defining the region. The cutflow histogram (and the bincounts histogram, if search bins are specified in the region) is created by default.
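When driving CutLang from a script, the usage line above can be assembled programmatically. The following is a minimal sketch, not part of CutLang: the helper names build_cla_command, expected_output_name, and effective_jobs are hypothetical, and the output name assumes that [adlfilename] is the ADL file name without its directory and .adl suffix.

```python
import os
from pathlib import Path


def build_cla_command(root_file, event_format, adl_file, events=None, jobs=None):
    """Build the CLA argument list following the usage line above."""
    cmd = ["CLA", str(root_file), event_format, "-i", str(adl_file)]
    if events is not None:  # the number of events is optional
        cmd += ["-e", str(events)]
    if jobs is not None:  # -j 0 requests one process per processor core
        cmd += ["-j", str(jobs)]
    return cmd


def expected_output_name(adl_file):
    """Assumed output name: histoOut-[adlfilename].root without the .adl suffix."""
    return f"histoOut-{Path(adl_file).stem}.root"


def effective_jobs(jobs):
    """Number of processes implied by the -j value (0 maps to the core count)."""
    return (os.cpu_count() or 1) if jobs == 0 else jobs


if __name__ == "__main__":
    cmd = build_cla_command("T2tt_700_50.root", "DELPHES", "ex05_functions.adl", jobs=0)
    print(" ".join(cmd))
    print("Expected output:", expected_output_name("ex05_functions.adl"))
    # To actually run the analysis, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.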

Getting Started with Examples:

First, download some simple event samples:

wget https://www.dropbox.com/s/zza28peyjy8qgg6/T2tt_700_50.root

The samples contain SUSY events in DELPHES format.

ADL syntax is self-descriptive. You can study and run several tutorial examples to learn the main syntax rules. List the examples with:

ls /CutLang/runs/tutorials/*.adl

Read the ADL files in the tutorials directory to understand the algorithm and syntax. Then run the ADL files with the commands given below. If histograms are produced, open the resulting ROOT file and inspect them.

CLA T2tt_700_50.root DELPHES -i /CutLang/runs/tutorials/ex05_functions.adl
CLA T2tt_700_50.root DELPHES -i /CutLang/runs/tutorials/ex06_bins.adl
CLA T2tt_700_50.root DELPHES -i /CutLang/runs/tutorials/ex12_counts.adl

More ADL files for various full LHC analyses (focusing on signal region selections) can be found in the ADLLHCanalyses repository: https://github.com/ADL4HEP/ADLLHCanalyses

Tutorial

Launch with Binder:

  • with Jupyter Lab: Binder

  • with Jupyter Notebook: Binder

Launch with Self Host:

▷ Setup

⚠️ The CutLang installation must be complete and the CLA command must run without any problems.

▷ Starting

Start Jupyter with the "ROOT c++ with CutLang" kernel in the $CUTLANG_PATH directory:

  CLA_tutorial lab
  # or
  CLA_tutorial notebook
  # Jupyter will start; open the tutorial through the 127.0.0.1:8888/... link printed in the logs
  # You can then browse the index and the other ipynb files in the binder folder

▷ Update

  CLA_tutorial_update
  # force update (temporary, do not use unless necessary)

FAQ

Where to find an example ntuple?

Ntuple files are kept in the CLA directory.

Where to find example ADL files?

Example ADL files are kept in the runs directory. You can also check out the repository at https://github.com/ADL4HEP/ADLLHCanalyses

Contributing

Setting Up The Development Environment

◆ from Source

Follow the instructions for installing CutLang from source.

◆ with Conda

Create and activate the environment using:

 git clone https://github.com/unelg/CutLang.git
 cd CutLang
 conda env create -f scripts/environment.yml # create environment with dependencies
 conda activate CutLang-dev # activate development environment
 source setup.sh
 # if you want, you can run
 # echo "source /path/to/CutLang/setup.sh" >> ~/.bashrc
 # to keep it active all the time

 # then run the following (first time only)
 CLA_recompile # this will run the following in order
 # cd $CUTLANG_PATH/CLA
 # make clean
 # make -j

◆ with Docker

Compile CutLang, and build and run the container using:

 git clone https://github.com/unelg/CutLang.git
 cd CutLang
 ./scripts/docker/util.sh dev
 # Note: do not commit the Dockerfile created for the development environment to git
 docker-compose up

Enter the container from another terminal window using:

 docker exec -it cutlang-dev bash
 # then run the following inside the container (first time only)
 CLA_recompile # this will run the following in order
 # cd $CUTLANG_PATH/CLA
 # make clean
 # make -j

Build and Deploy Environment

◆ Conda

Dependencies:

See https://anaconda.org

 git clone https://github.com/unelg/CutLang.git
 cd CutLang
 cd scripts/conda
 # To change the version, edit scripts/conda/meta.yaml
 conda-build -c conda-forge .
 # When the build finishes, the logs end with an "anaconda upload /file/path/to/upload" command.
 # Use that command to upload the package to the relevant conda channel.
 # And then it can be used with:
 # conda create -c conda-forge -c <your-username> --name <your-environment> cutlang

◆ Docker

See https://hub.docker.com

 git clone https://github.com/unelg/CutLang.git
 cd CutLang
 ./scripts/docker/util.sh prod
 # set the image name in docker-compose.yml to <your-username/cutlang>
 docker-compose build
 docker push <your-username>/<your-image-name>:<tagname>
 # example:
 # docker push cutlang/cutlang:latest

Note

When a newer version is released, please wait until all tests have passed for the respective commit and the associated packages have become available. A notification email will be sent.