v1v2v3 (latest)

EmoAgent: A Multi-Agent Framework for Diverse Affective Image Manipulation

14 March 2025

ArXiv (abs)PDF HTML Github

Main:11 Pages

20 Figures

Bibliography:2 Pages

Appendix:7 Pages

Abstract

Affective Image Manipulation (AIM) aims to alter visual elements within an image to evoke specific emotional responses from viewers. However, existing AIM approaches rely on rigid \emph{one-to-one} mappings between emotions and visual cues, making them ill-suited for the inherently subjective and diverse ways in which humans perceive and expressthis http URLaddress this, we introduce a novel task setting termed \emph{Diverse AIM (D-AIM)}, aiming to generate multiple visually distinct yet emotionally consistent image edits from a single source image and target emotion. We propose \emph{EmoAgent}, the first multi-agent framework tailored specifically for D-AIM. EmoAgent explicitly decomposes the manipulation process into three specialized phases executed by collaborative agents: a Planning Agent that generates diverse emotional editing strategies, an Editing Agent that precisely executes these strategies, and a Critic Agent that iteratively refines the results to ensure emotional accuracy. This collaborative design empowers EmoAgent to model \emph{one-to-many} emotion-to-visual mappings, enabling semantically diverse and emotionally faithfulthis http URLquantitative and qualitative evaluations demonstrate that EmoAgent substantially outperforms state-of-the-art approaches in both emotional fidelity and semantic diversity, effectively generating multiple distinct visual edits that convey the same target emotion.

View on arXiv

Comments on this paper