Description
Recent advances in Vision-Language Models (VLMs) have enabled complex multimodal tasks by processing text and image data simultaneously. However, these models often exhibit biases that skew outputs toward societal stereotypes, necessitating debiasing strategies. Existing debiasing methods focus narrowly on specific modalities or tasks and require extensive retraining, which is computationally expensive. To address these limitations, this paper introduces Selective Feature Imputation for Debiasing (SFID), a novel methodology that integrates feature pruning and low-confidence imputation (LCI) to effectively reduce biases in VLMs. SFID eliminates the need for retraining and adds no computational cost at inference. Experimental results demonstrate SFID's effectiveness across various VLM tasks, including zero-shot classification, text-to-image retrieval, image captioning, and text-to-image generation, significantly reducing gender bias without compromising performance. This approach enhances the fairness of VLM applications across diverse scenarios while preserving efficiency.