LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images

20 March 2025
Leyang Wang
Joice Lin
Abstract

The success of modern machine learning, particularly in facial translation networks, is highly dependent on the availability of high-quality, paired, large-scale datasets. However, acquiring sufficient data is often challenging and costly. Inspired by the recent success of diffusion models in high-quality image synthesis and advancements in Large Language Models (LLMs), we propose a novel framework called LLM-assisted Paired Image Generation (LaPIG). This framework enables the construction of comprehensive, high-quality paired visible and thermal images using captions generated by LLMs. Our method encompasses three parts: visible image synthesis with ArcFace embedding, thermal image translation using Latent Diffusion Models (LDMs), and caption generation with LLMs. Our approach not only generates multi-view paired visible and thermal images to increase data diversity but also produces high-quality paired data while maintaining their identity information. We evaluate our method on public datasets by comparing it with existing methods, demonstrating the superiority of LaPIG.
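The abstract says the generated pairs maintain identity information but does not say how this is measured. One common way to check identity preservation is to compare face embeddings (for example, from an ArcFace-style encoder) with cosine similarity. The sketch below illustrates only that comparison step. It is not the authors' evaluation code: the function names, the toy 4-dimensional vectors, and the 0.5 threshold are all assumptions chosen for illustration, and real embeddings would come from a face recognition model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_identity(emb_a: Sequence[float], emb_b: Sequence[float],
                  threshold: float = 0.5) -> bool:
    """Treat two embeddings as the same identity if similarity >= threshold.

    The default threshold is a hypothetical value, not one taken from the paper.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy vectors standing in for face embeddings (hypothetical data).
reference = [0.2, 0.9, -0.1, 0.4]      # embedding of the source face
generated = [0.25, 0.85, -0.05, 0.35]  # embedding of a generated image
unrelated = [-0.7, 0.1, 0.6, -0.2]     # embedding of a different face

print(same_identity(reference, generated))
print(same_identity(reference, unrelated))
```

In practice the threshold would be tuned on a validation set for the specific embedding model in use.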

@article{wang2025_2503.16376,
  title={LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images},
  author={Leyang Wang and Joice Lin},
  journal={arXiv preprint arXiv:2503.16376},
  year={2025}
}