Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation

28 March 2025

Abstract

This paper addresses the challenge of data scarcity in semantic segmentation by generating datasets with text-to-image (T2I) generation models, reducing image acquisition and labeling costs. Segmentation dataset generation faces two key challenges: 1) aligning generated samples with the target domain and 2) producing informative samples beyond the training data. Fine-tuning T2I models can help generate samples aligned with the target domain. However, fine-tuning often leads to overfitting and memorization of the training data, limiting the models' ability to generate diverse and well-aligned samples. To overcome these issues, we propose Concept-Aware LoRA (CA-LoRA), a novel fine-tuning approach that selectively identifies and updates only the weights associated with the concepts necessary for domain alignment (e.g., style or viewpoint), while preserving the pretrained knowledge of the T2I model so that it can still produce informative samples. We demonstrate its effectiveness in generating datasets for urban-scene segmentation, where it outperforms baseline and state-of-the-art methods in in-domain (few-shot and fully-supervised) settings as well as in domain generalization tasks, especially under challenging conditions such as adverse weather and varying illumination.
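The idea of updating only a subset of weights can be illustrated with a minimal sketch of selective low-rank adaptation. This is not the authors' implementation: the abstract does not say how CA-LoRA identifies concept-related weights, so the per-layer `scores`, the `select_layers` helper, the `LoRALayer` class, and the layer names below are all hypothetical placeholders. The sketch only shows the standard LoRA form, a frozen weight W plus a scaled low-rank update B·A, enabled for a chosen subset of layers.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALayer:
    """A frozen weight W (out x in) with a low-rank update B @ A (hypothetical class)."""

    def __init__(self, name, out_dim, in_dim, rank, alpha, rng):
        self.name = name
        self.scale = alpha / rank
        self.W = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(out_dim)]
        # Standard LoRA initialization: A is small and random, B is zero,
        # so the update B @ A is zero before any training step.
        self.A = [[rng.gauss(0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.trainable = False  # only selected layers would receive gradient updates

    def effective_weight(self):
        """Return W + scale * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.W, delta)]


def select_layers(layers, scores, top_k):
    """Mark the top_k highest-scoring layers as trainable.

    ASSUMPTION: `scores` stands in for whatever concept-relevance measure
    is used in practice; the abstract does not define one.
    """
    chosen = sorted(scores, key=scores.get, reverse=True)[:top_k]
    for layer in layers:
        layer.trainable = layer.name in chosen
    return chosen


rng = random.Random(0)
layers = [LoRALayer(f"attn_{i}", 4, 4, rank=2, alpha=4.0, rng=rng) for i in range(4)]
# Hypothetical relevance scores for a concept such as "style".
scores = {"attn_0": 0.1, "attn_1": 0.9, "attn_2": 0.3, "attn_3": 0.7}
chosen = select_layers(layers, scores, top_k=2)
print(chosen)
```

In a real training loop, only the `A` and `B` matrices of layers with `trainable` set would be optimized, leaving every other weight of the pretrained model untouched.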

@article{park2025_2503.22172,
  title={Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation},
  author={Minho Park and Sunghyun Park and Jungsoo Lee and Hyojin Park and Kyuwoong Hwang and Fatih Porikli and Jaegul Choo and Sungha Choi},
  journal={arXiv preprint arXiv:2503.22172},
  year={2025}
}
