
InstructAttribute: Fine-grained Object Attributes editing with Instruction

1 May 2025
Xingxi Yin
Jingfeng Zhang
Zhi Li
Yicheng Li
Yin Zhang
Abstract

Text-to-image (T2I) diffusion models, renowned for their advanced generative abilities, are extensively utilized in image editing applications, demonstrating remarkable effectiveness. However, achieving precise control over fine-grained attributes still presents considerable challenges. Existing image editing techniques either fail to modify the attributes of an object or struggle to preserve its structure and maintain consistency in other areas of the image. To address these challenges, we propose Structure-Preserving and Attribute Amplification (SPAA), a training-free method that enables precise control over the color and material transformations of objects by editing the self-attention maps and cross-attention values. Furthermore, we constructed the Attribute Dataset, which encompasses nearly all colors and materials associated with various objects, by integrating multimodal large language models (MLLMs) into an automated pipeline for data filtering and instruction labeling. Trained on this dataset, our InstructAttribute is an instruction-based model designed to facilitate fine-grained editing of color and material attributes. Extensive experiments demonstrate that our method achieves superior performance in object-level color and material editing, outperforming existing instruction-based image editing approaches.
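To make the core idea concrete, here is a minimal NumPy sketch of the attention-editing scheme the abstract describes: the source image's self-attention maps are reused to preserve object structure, while the cross-attention values for the target attribute token are scaled up to amplify the edit. The function name, array shapes, and the `amplify` scalar are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def spaa_edit(self_attn_src, cross_attn_vals, attr_token_idx, amplify=1.5):
    """Illustrative SPAA-style edit.

    self_attn_src   -- self-attention map from the source generation pass,
                       reused verbatim to preserve object structure
    cross_attn_vals -- per-token cross-attention value vectors, shape
                       (num_tokens, dim)
    attr_token_idx  -- index of the attribute token (e.g. "red", "wooden")
    amplify         -- hypothetical scalar boosting the attribute's influence
    """
    edited_self = self_attn_src.copy()        # structure preservation
    edited_cross = cross_attn_vals.copy()
    edited_cross[..., attr_token_idx, :] *= amplify  # attribute amplification
    return edited_self, edited_cross
```

In a real diffusion pipeline these edits would be applied inside the U-Net's attention layers at each denoising step, rather than on standalone arrays as here.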

View on arXiv
@article{yin2025_2505.00751,
  title={InstructAttribute: Fine-grained Object Attributes editing with Instruction},
  author={Xingxi Yin and Jingfeng Zhang and Zhi Li and Yicheng Li and Yin Zhang},
  journal={arXiv preprint arXiv:2505.00751},
  year={2025}
}