ConsistencyDet: A Few-step Denoising Framework for Object Detection Using the Consistency Model

Abstract

Object detection, a quintessential task in perceptual computing, can be tackled using a generative methodology. In this study, we introduce a novel framework that formulates object detection as a denoising diffusion process operating on perturbed bounding boxes of annotated entities. This framework, termed \textbf{ConsistencyDet}, leverages a denoising concept known as the Consistency Model. The hallmark of this model is its self-consistency property, which enables it to map distorted information from any time step back to its pristine state, thereby realizing a \textbf{``few-step denoising''} mechanism. This property markedly improves the operational efficiency of the model relative to the conventional Diffusion Model. During training, ConsistencyDet initiates the diffusion sequence with noise-infused boxes derived from the ground-truth annotations and trains the model to perform the denoising task. At inference, the model employs a denoising sampling strategy that starts from bounding boxes randomly sampled from a normal distribution. Through iterative refinement, the model transforms a set of randomly generated boxes into final detections. Comprehensive evaluations on standard benchmarks, such as MS-COCO and LVIS, corroborate that ConsistencyDet surpasses other leading-edge detectors in performance metrics. Our code is available at this https URL.
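
The training-time perturbation and the few-step inference loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the additive Gaussian noise form, the noise schedule values, the four-coordinate box layout, and the names `add_noise`, `few_step_sample`, `toy_denoise`, and `TARGETS` are all assumptions made for illustration, and `toy_denoise` is a hypothetical stand-in for a trained network.

```python
import random

# Hypothetical "clean" boxes that the toy denoiser knows about (assumed layout: 4 coordinates).
TARGETS = ((0.2, 0.2, 0.4, 0.4), (0.6, 0.5, 0.9, 0.8))


def add_noise(boxes, sigma, rng):
    """Perturb every box coordinate with Gaussian noise of standard deviation sigma."""
    return [[c + sigma * rng.gauss(0.0, 1.0) for c in box] for box in boxes]


def toy_denoise(boxes, sigma):
    """Stand-in for a trained consistency function (assumption, not the paper's model).

    It maps each noisy box, at any noise level, straight to the nearest clean target.
    The sigma argument is unused here; a real model would condition on it.
    """
    def nearest(box):
        return list(min(TARGETS, key=lambda t: sum((a - b) ** 2 for a, b in zip(box, t))))
    return [nearest(box) for box in boxes]


def few_step_sample(denoise, num_boxes, sigmas, rng):
    """Few-step sampling over a decreasing noise schedule `sigmas`."""
    # Inference starts from boxes drawn from a normal distribution.
    boxes = [[sigmas[0] * rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(num_boxes)]
    estimate = denoise(boxes, sigmas[0])
    for sigma in sigmas[1:]:
        # Re-noise the current estimate to a smaller noise level, then denoise again.
        boxes = add_noise(estimate, sigma, rng)
        estimate = denoise(boxes, sigma)
    return estimate


rng = random.Random(0)
# Training-style perturbation of ground-truth boxes (sigma chosen arbitrarily).
noisy_gt = add_noise([list(t) for t in TARGETS], sigma=0.5, rng=rng)
# Three denoising steps; the schedule values are illustrative assumptions.
detections = few_step_sample(toy_denoise, num_boxes=5, sigmas=[80.0, 1.0, 0.05], rng=rng)
```

Because the denoiser maps any noise level directly to a clean estimate, the loop needs only as many network calls as there are entries in `sigmas`, which is the sense in which the sampling is "few-step" in this sketch.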

@article{jiang2025_2404.07773,
  title={ConsistencyDet: A Few-step Denoising Framework for Object Detection Using the Consistency Model},
  author={Lifan Jiang and Zhihui Wang and Changmiao Wang and Ming Li and Jiaxu Leng},
  journal={arXiv preprint arXiv:2404.07773},
  year={2025}
}