BiasMap: Leveraging Cross-Attentions to Discover and Mitigate Hidden Social Biases in Text-to-Image Generation

Bias discovery is critical for black-box generative models, especially text-to-image (TTI) models. Existing works predominantly focus on output-level demographic distributions, which do not necessarily guarantee that concept representations are disentangled post-mitigation. We propose BiasMap, a model-agnostic framework for uncovering latent concept-level representational biases in stable diffusion models. BiasMap leverages cross-attention attribution maps to reveal structural entanglements between demographics (e.g., gender, race) and semantics (e.g., professions), probing representational bias as it arises during image generation. Using attribution maps of these concepts, we quantify the spatial demographics-semantics concept entanglement via Intersection over Union (IoU), offering a lens into bias that remains hidden in existing fairness discovery approaches. In addition, we utilize BiasMap for bias mitigation through energy-guided diffusion sampling that directly modifies the latent noise space and minimizes the expected SoftIoU during the denoising process. Our findings show that existing fairness interventions may reduce the output distributional gap but often fail to disentangle concept-level coupling, whereas our mitigation method reduces concept entanglement in image generation while complementing distributional bias mitigation.
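To make the entanglement measure concrete, the following is a minimal sketch of how the overlap between two attribution maps could be scored. It is not the authors' implementation: the function names `hard_iou` and `soft_iou`, the 0.5 threshold, the min-max normalization, and the product-based relaxation used for the soft variant are all assumptions, since the abstract does not specify how SoftIoU is defined.

```python
import numpy as np


def hard_iou(map_a, map_b, threshold=0.5):
    """IoU of two attribution maps after min-max scaling and thresholding.

    The threshold and the normalization are illustrative assumptions.
    """
    def binarize(m):
        m = np.asarray(m, dtype=float)
        span = m.max() - m.min()
        # A constant map carries no spatial signal, so treat it as empty.
        norm = (m - m.min()) / span if span > 0 else np.zeros_like(m)
        return norm >= threshold

    a, b = binarize(map_a), binarize(map_b)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def soft_iou(map_a, map_b, eps=1e-8):
    """Differentiable-style IoU relaxation for maps with values in [0, 1].

    Assumed form: intersection = sum(a * b), union = sum(a + b - a * b).
    """
    a = np.clip(np.asarray(map_a, dtype=float), 0.0, 1.0)
    b = np.clip(np.asarray(map_b, dtype=float), 0.0, 1.0)
    intersection = (a * b).sum()
    union = (a + b - a * b).sum()
    return float(intersection / (union + eps))


# Toy 4x4 maps standing in for a demographic token and a profession token.
demographic_map = np.array([[1, 1, 0, 0],
                            [1, 1, 0, 0],
                            [0, 0, 0, 0],
                            [0, 0, 0, 0]], dtype=float)
profession_map = np.array([[0, 1, 1, 0],
                           [0, 1, 1, 0],
                           [0, 0, 0, 0],
                           [0, 0, 0, 0]], dtype=float)

entanglement_hard = hard_iou(demographic_map, profession_map)
entanglement_soft = soft_iou(demographic_map, profession_map)
```

A higher score means the two concepts are attributed to more of the same image regions. The soft variant avoids thresholding, which is what would make it usable as an energy term inside a gradient-guided sampler; an actual guidance loop would need an autodiff framework rather than NumPy.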