15–18 Oct 2024
Purdue University
America/Indiana/Indianapolis timezone

An Efficient and Unified Debiasing Approach for Vision-Language Models across Modalities and Tasks

16 Oct 2024, 15:35
5m
Steward Center 306 (Third floor) (Purdue University)

128 Memorial Mall Dr, West Lafayette, IN 47907
Lightning 5 min talk + poster (Lightning talks)

Speaker

Hoin Jung (Purdue University)

Description

Recent advancements in Vision-Language Models (VLMs) have enabled complex multimodal tasks by processing text and image data simultaneously, significantly advancing the field of artificial intelligence. However, these models often exhibit biases that skew outputs toward societal stereotypes, necessitating debiasing strategies. Existing debiasing methods focus narrowly on specific modalities or tasks and require extensive retraining, which adds computational cost. To address these limitations, this paper introduces Selective Feature Imputation for Debiasing (SFID), a novel methodology that integrates feature pruning and low-confidence imputation (LCI) to effectively reduce biases in VLMs. SFID eliminates the need for retraining and achieves debiasing without increasing computational cost at inference, preserving efficiency throughout. Our experimental results demonstrate SFID's effectiveness across various VLM tasks, including zero-shot classification, text-to-image retrieval, image captioning, and text-to-image generation, significantly reducing gender biases without compromising performance. This approach enhances the fairness of VLM applications while maintaining computational efficiency across diverse scenarios.
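The abstract describes SFID as combining feature pruning with low-confidence imputation. The sketch below is one plausible reading of that idea, not the authors' implementation: dimensions of an embedding that best predict a protected attribute are pruned, then imputed with values taken from samples on which an attribute classifier is least confident. All variable names, thresholds, and the use of a logistic-regression probe are illustrative assumptions.

```python
# Hedged sketch of SFID-style debiasing on synthetic embeddings.
# Assumptions (not from the paper): a logistic-regression probe identifies
# bias-carrying dimensions; "low-confidence" means predicted probability
# near 0.5; imputation uses the mean of those low-confidence samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for VLM embeddings: 200 samples, 16 dimensions.
# Dimension 3 carries a spurious "gender" signal.
X = rng.normal(size=(200, 16))
gender = rng.integers(0, 2, size=200)
X[:, 3] += 2.0 * gender

# 1) Probe: find the dimensions most predictive of the protected attribute.
probe = LogisticRegression(max_iter=1000).fit(X, gender)
probs = probe.predict_proba(X)[:, 1]
bias_dims = np.argsort(np.abs(probe.coef_[0]))[-2:]  # prune the top-2 dims

# 2) Low-confidence imputation: samples where the probe is unsure
#    (probability near 0.5) serve as an attribute-neutral reference.
low_conf = np.abs(probs - 0.5) < 0.2
if not low_conf.any():           # fallback if no sample qualifies
    low_conf = np.ones(len(X), dtype=bool)
fill = X[low_conf][:, bias_dims].mean(axis=0)

# 3) Replace the pruned dimensions with the neutral reference values.
X_debiased = X.copy()
X_debiased[:, bias_dims] = fill

acc_before = probe.score(X, gender)
```

No retraining of the underlying model is involved, consistent with the abstract's claim: only a fixed set of embedding dimensions is overwritten at inference time, so the debiasing step adds negligible cost.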

Authors

Hoin Jung (Purdue University), Taeuk Jang, Xiaoqian Wang (Purdue University)

Presentation materials