DIANA Meeting - Swordfish and Arrow

Europe/Zurich
40/R-B10 (CERN)

Jim Pivarski (Princeton University), Vincent Alexander Croft (New York University (US))
Description

Swordfish

With swordfish you can quickly and accurately forecast experimental sensitivities without the overhead of time-intensive Monte Carlo simulations, mock data generation, and likelihood maximization.

With swordfish you can:

  • Calculate the expected upper limit or discovery reach of an instrument.
  • Derive expected confidence contours for parameter reconstruction.
  • Visualize confidence contours as well as the underlying information metric field.
  • Calculate the information flux, an effective signal-to-noise ratio that accounts for background systematics and component degeneracies.

A large range of experiments in particle physics and astronomy are statistically described by a Poisson point process. At its core, the swordfish module implements a rather general version of a Poisson point process and provides easy access to its information-geometric properties. Based on this information, a number of common and less common tasks can be performed.
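
To make the idea concrete, here is a minimal sketch of the kind of quantity involved, written in plain Python rather than with the swordfish API. It assumes a binned Poisson model with expected counts theta * s_i + b_i per bin, computes the Fisher information on the signal strength theta, and turns it into a rough expected upper limit using a Gaussian approximation. The function names, the toy spectra, and the factor 1.645 (one-sided 95% CL) are illustrative assumptions, not swordfish's own method.

```python
import math

def fisher_information(signal, background):
    """Fisher information on the signal strength theta at theta = 0,
    for independent Poisson bins with expected counts theta * s_i + b_i."""
    return sum(s * s / b for s, b in zip(signal, background))

def effective_information(signal, background):
    """Information on theta after profiling a free background normalization.

    Uses the 2x2 Fisher matrix for (theta, background scale) at theta = 0,
    scale = 1, and removes the part of the signal that is degenerate with
    the background: I_eff = I_ss - I_sb**2 / I_bb.
    """
    i_ss = fisher_information(signal, background)
    i_sb = sum(signal)       # sum_i s_i * b_i / b_i
    i_bb = sum(background)   # sum_i b_i * b_i / b_i
    return i_ss - i_sb ** 2 / i_bb

def approx_upper_limit(information, z=1.645):
    """Gaussian-approximation expected upper limit on theta (assumed z-score)."""
    return z / math.sqrt(information)

# Hypothetical toy spectra: three bins, flat background.
signal = [1.0, 4.0, 2.0]
background = [100.0, 100.0, 100.0]

info = fisher_information(signal, background)
info_eff = effective_information(signal, background)
limit = approx_upper_limit(info)
limit_eff = approx_upper_limit(info_eff)
```

In this toy setup, letting the background normalization float lowers the available information and therefore weakens the projected limit. A signal exactly proportional to the background would leave no effective information at all.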

https://github.com/cweniger/swordfish

Apache Arrow

Arrow is a high-performance cross-system data layer for columnar in-memory analytics.

Like a file format, it allows data to be transferred between data analysis platforms, but through zero-copy shared memory, rather than files on disk.

Arrow is backed by key developers of 13 major open source projects, including Calcite, Cassandra, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, and Storm, making it the de facto standard for columnar in-memory analytics. It also supports a wide variety of industry-standard programming languages, such as Java, C, C++, Python, Ruby, and JavaScript.

As a columnar format, Arrow is suited for native vectorized optimization of analytical data processing.
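
The two ideas above, columnar layout and zero-copy sharing, can be illustrated with a small sketch that uses only the Python standard library. This is not the Arrow API: `array` and `memoryview` stand in for Arrow's typed buffers, and the toy records and field names are assumptions made for the example.

```python
from array import array

# Row-oriented layout: one tuple per event (hypothetical toy data).
rows = [(1, 0.5), (2, 1.5), (3, 2.5)]

# Column-oriented layout: one contiguous, typed buffer per field.
ids = array("q", (r[0] for r in rows))        # 64-bit signed integers
energies = array("d", (r[1] for r in rows))   # 64-bit floats

# A memoryview exposes the same underlying buffer without copying it.
view = memoryview(energies)
tail = view[1:]          # slicing a memoryview is also zero-copy

# A write through the owning array is visible through the view,
# which shows that both refer to the same memory.
energies[2] = 9.0
total = sum(tail)        # scans one contiguous column of floats
```

Because each column is a single typed buffer, a consumer can scan or slice one field without touching the others, which is the property that makes columnar data convenient for vectorized processing.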

https://arrow.apache.org/

Recorded Meeting Video: https://www.youtube.com/watch?v=tfaG503cN3M