ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration

11 April 2025
Yongsheng Yu
Haitian Zheng
Zhifei Zhang
Jianming Zhang
Yuqian Zhou
Connelly Barnes
Yuchen Liu
Wei Xiong
Zhe Lin
Jiebo Luo
Abstract

Recent progress in generative models has significantly improved image restoration, particularly through powerful diffusion models that offer remarkable recovery of semantic details and local fidelity. However, deploying these models at ultra-high resolutions faces a critical trade-off between quality and efficiency due to the computational demands of long-range attention mechanisms. To address this, we introduce ZipIR, a novel framework that enhances efficiency, scalability, and long-range modeling for high-resolution image restoration. ZipIR employs a highly compressed latent representation that compresses images by 32x, effectively reducing the number of spatial tokens and enabling the use of high-capacity models such as the Diffusion Transformer (DiT). Toward this goal, we propose a Latent Pyramid VAE (LP-VAE) design that structures the latent space into sub-bands to ease diffusion training. Trained on full images up to 2K resolution, ZipIR surpasses existing diffusion-based methods, offering unmatched speed and quality in restoring high-resolution images from severely degraded inputs.
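
To make the token-count argument concrete, here is a minimal arithmetic sketch. It is not from the paper. It assumes that "32x" means downsampling each spatial side by 32, that "2K" means a 2048 x 2048 image, and that each latent position becomes one transformer token. The 8x baseline is the factor commonly used by latent diffusion VAEs and serves only as a comparison point. The function name `latent_tokens` and its `patch` parameter are hypothetical.

```python
def latent_tokens(height: int, width: int, downsample: int, patch: int = 1) -> int:
    """Count spatial tokens after VAE downsampling and patchification.

    `downsample` is the per-side compression factor (assumed meaning of "32x").
    `patch` is a hypothetical patch size; 1 means one token per latent pixel.
    """
    stride = downsample * patch
    return (height // stride) * (width // stride)


# Assumed 2K input: 2048 x 2048 pixels.
tokens_8x = latent_tokens(2048, 2048, downsample=8)    # common latent diffusion factor
tokens_32x = latent_tokens(2048, 2048, downsample=32)  # factor quoted in the abstract

# Fewer tokens by this ratio.
token_ratio = tokens_8x // tokens_32x

# Full self-attention compares every token pair, so its cost grows with
# the square of the token count.
attention_ratio = token_ratio ** 2

print(tokens_8x, tokens_32x, token_ratio, attention_ratio)
```

Under these assumptions the 32x latent has 16 times fewer tokens than an 8x latent, and the pairwise attention cost drops by roughly the square of that factor, which is the efficiency argument the abstract makes for pairing a highly compressed latent with a DiT.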

@article{yu2025_2504.08591,
  title={ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration},
  author={Yongsheng Yu and Haitian Zheng and Zhifei Zhang and Jianming Zhang and Yuqian Zhou and Connelly Barnes and Yuchen Liu and Wei Xiong and Zhe Lin and Jiebo Luo},
  journal={arXiv preprint arXiv:2504.08591},
  year={2025}
}