
18–24 Aug 2024
Cairns, Queensland, Australia
Australia/Brisbane timezone

Nonparametric $f$-Divergence Estimation and its Application to Eliminating Harmful Variables

22 Aug 2024, 17:10
20m
M1


Oral F: Nuclear and Astro-Particle Physics

Description

This research first explores advances in nearest-neighbor methods for $f$-divergence estimation and the mitigation of biases induced by high-dimensional data. Nonparametric methods, including nearest-neighbor and kernel techniques, are valued for their simplicity and scalability: they allow parallel computation without extensive model tuning. Despite these advantages, their performance degrades under the biases that arise in high-dimensional data. We begin by constructing a series of equations for estimating diverse $f$-divergences using nearest neighbors. We will discuss the shortcomings of previous plug-in methodologies for constructing $f$-divergence estimators with a fixed $k$ and introduce a principled approach for this purpose. We will also present our recent publications on mitigating the high-dimensional biases of these estimators, which stem primarily from the geometric properties of density functions.
Second, we will discuss how the resulting set of estimators can be used in machine learning problems such as feature selection and distribution shift. Distribution shift in particular underlies several issues that hinder the application of artificial intelligence to real-world problems, including limited generalization of algorithms trained on specific groups, fairness concerns and performance degradation for underrepresented groups, and discrepancies between real and simulated distributions. We tackle these problems by introducing a general framework of methods that control the flow of information and implement appropriate blocking within a graphical model.
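
The abstract does not give the form of its estimators. Purely as an illustration of what a fixed-$k$ nearest-neighbor divergence estimator looks like, the sketch below implements a widely used estimator of the Kullback-Leibler divergence (one member of the $f$-divergence family), $\hat{D} = \frac{d}{n}\sum_{i}\log\frac{\nu_k(x_i)}{\rho_k(x_i)} + \log\frac{m}{n-1}$, where $\rho_k(x_i)$ is the distance from $x_i$ to its $k$-th neighbor among the other samples of $P$ and $\nu_k(x_i)$ is its distance to the $k$-th neighbor among the samples of $Q$. It is not the method presented in the talk; the function names, the default $k=5$, the brute-force neighbor search, and the Gaussian toy data are all assumptions made for this example.

```python
import math
import random


def kth_nn_distance(point, samples, k, skip_index=None):
    """Euclidean distance from `point` to its k-th nearest neighbor in `samples`.

    `skip_index` excludes one sample, used when `point` itself is in `samples`.
    Brute-force search: fine for small toy data, not for large sample sizes.
    """
    dists = sorted(
        math.dist(point, s) for j, s in enumerate(samples) if j != skip_index
    )
    return dists[k - 1]


def knn_kl_divergence(x, y, k=5):
    """Fixed-k nearest-neighbor estimate of KL(P || Q) from x ~ P and y ~ Q.

    Illustrative only; assumes continuous data with no duplicate points.
    """
    n, m, d = len(x), len(y), len(x[0])
    log_ratio_sum = 0.0
    for i, xi in enumerate(x):
        rho = kth_nn_distance(xi, x, k, skip_index=i)  # neighbor within P's sample
        nu = kth_nn_distance(xi, y, k)                 # neighbor within Q's sample
        log_ratio_sum += math.log(nu / rho)
    return d * log_ratio_sum / n + math.log(m / (n - 1))


def sample_gaussian(n, mean, rng):
    """Draw n points from an isotropic unit-variance Gaussian (toy data)."""
    return [tuple(rng.gauss(mu, 1.0) for mu in mean) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    p = sample_gaussian(400, (0.0, 0.0), rng)
    q = sample_gaussian(400, (1.0, 0.0), rng)
    print("estimated KL(P || Q):", knn_kl_divergence(p, q, k=5))
```

For two unit-variance Gaussians whose means differ by one unit, the true KL divergence is 0.5, which gives a reference value for checking the estimate on this toy problem. In higher dimensions the same estimator degrades, which is the kind of bias the abstract says it aims to mitigate.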

Primary author

Prof. Yung-Kyun Noh (Hanyang University)
