### Conveners

#### Exploring the Latent Structure of Data: Data Structure

- Jean-Roch Vlimant (California Institute of Technology (US))
- Anja Butter

#### Exploring the Latent Structure of Data: Latent Space Exploration

- Jean-Roch Vlimant (California Institute of Technology (US))
- Anja Butter

Unsupervised anomaly detection could be crucial in future analyses searching for rare phenomena in large datasets, such as those collected at the LHC. To this end, we introduce a physics-inspired variational autoencoder (VAE) architecture which performs competitively and robustly on the LHC Olympics Machine Learning Challenge datasets. We demonstrate how embedding some physical observables...

Symmetries are a fundamental property of functions applied to datasets. A key function for any dataset is the probability density, and the corresponding symmetries are often referred to as the symmetries of the dataset itself. We provide a rigorous statistical notion of symmetry for a dataset, which involves reference datasets that we call ...

Fundamental laws of physics introduce specific topological features in the phase space of n-body processes in collider events. We introduce a new analysis approach based on such global topological properties of the manifold over the distribution of events. One specific property of potential interest is the dimensionality of the phase space. It can, for example, be used for...
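As a rough illustration of what "dimensionality of the phase space" means here, the standard counting for an n-body final state with fixed particle masses can be written out. This is only a sketch of the textbook count, not the analysis method of the contribution; the function name `phase_space_dimension` is hypothetical.

```python
def phase_space_dimension(n_particles: int) -> int:
    """Degrees of freedom of an n-body final state with fixed masses.

    Each on-shell particle contributes 3 independent momentum components;
    fixing the total four-momentum removes 4 of them.
    """
    if n_particles < 2:
        raise ValueError("need at least two final-state particles")
    return 3 * n_particles - 4


# A two-body final state is described by two angles,
# a three-body final state by five variables.
for n in (2, 3, 4):
    print(n, phase_space_dimension(n))
```

Under this counting, events from processes with different final-state multiplicities populate manifolds of different dimension.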

We build a simple probabilistic model for collider events represented by a pattern of points in a space of high-level observables. The model is based on three assumptions for the point data: the measurements in individual events are discrete, exchangeable, and generated from a mixture of latent distributions, or 'themes'. The result is a mixed-membership model known as Latent Dirichlet...
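The three assumptions named above (discrete, exchangeable measurements drawn from a mixture of latent themes) correspond to the usual generative process of Latent Dirichlet Allocation. The following is a minimal sketch of that process using only the Python standard library; the function names, the two example themes, and the bin layout are illustrative assumptions, not taken from the contribution.

```python
import random


def sample_dirichlet(rng, alpha):
    """Draw from a Dirichlet distribution via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_event(rng, alpha, themes, n_measurements):
    """Generate one event from an LDA-style model.

    alpha:  Dirichlet concentration parameters, one per theme
    themes: one probability vector over discrete observable bins per theme
    """
    theta = sample_dirichlet(rng, alpha)  # event-level theme proportions
    n_bins = len(themes[0])
    assignments, bins = [], []
    for _ in range(n_measurements):
        # Latent theme for this measurement, then the observed bin.
        k = rng.choices(range(len(alpha)), weights=theta)[0]
        b = rng.choices(range(n_bins), weights=themes[k])[0]
        assignments.append(k)
        bins.append(b)
    return theta, assignments, bins


# Hypothetical example: two themes over three observable bins.
rng = random.Random(0)
themes = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
theta, assignments, bins = sample_event(rng, [1.0, 1.0], themes, 50)
```

Because each measurement is drawn independently given `theta`, the order of the entries in `bins` carries no information, which is the exchangeability assumption.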

Deep neural networks (DNNs) are essential tools in particle physics, with use cases ranging from particle reconstruction to event classification and anomaly detection. Whereas DNNs for event classification are primarily trained on quantities deduced from the kinematic properties of the particles in the final state (high-level observables), we present an alternative approach...

Autoencoders as tools behind anomaly searches at the LHC have the structural problem that they only work in one direction, extracting jets with higher complexity but not the other way around. To address this, we derive classifiers from the latent space of (variational) autoencoders, specifically in Gaussian mixture and Dirichlet latent spaces. In particular, the Dirichlet setup solves the...

Given the increasing data collection capabilities and limited computing resources of future collider experiments, interest in using generative neural networks for the fast simulation of collider events is growing. In our previous study, the Bounded Information Bottleneck Autoencoder (BIB-AE) architecture for generating photon showers in a high-granularity calorimeter showed a high accuracy...

We introduce persistent Betti numbers to characterize the topological structure of jets. These topological invariants measure the multiplicity and connectivity of jet branches at a given scale threshold, while their persistence records the evolution of each topological feature as this threshold varies. With this knowledge, in particular, we are able to reconstruct the branch phylogenetic tree of each jet....
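To make the idea of a Betti number at a scale threshold concrete, the sketch below counts connected components (the zeroth Betti number) of a point set when any two points closer than a threshold are linked. It is a generic union-find illustration on hypothetical two-dimensional coordinates, not the jet-specific construction of the contribution; `betti0` and the sample points are assumptions made for the example.

```python
import math


def betti0(points, threshold):
    """Number of connected components when points within `threshold` are linked."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j  # merge the two components
    return len({find(i) for i in range(len(points))})


# Two well-separated pairs of points (hypothetical coordinates).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
for threshold in (0.05, 0.2, 1.0):
    print(threshold, betti0(points, threshold))
```

Scanning the threshold from small to large and recording where components merge gives the persistence of each component; in this example the count drops from four to two to one.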