ResearchTrend.AI
High-Resolution Image Synthesis with Latent Diffusion Models

20 December 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Björn Ommer
    3DV

Papers citing "High-Resolution Image Synthesis with Latent Diffusion Models"

50 / 7,833 papers shown
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Teng Hu
Zhentao Yu
Zhengguang Zhou
Sen Liang
Yuan Zhou
Qin Lin
Qinglin Lu
DiffM
VGen
50
0
0
07 May 2025
CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting
Huawei Sun
Bora Kunter Sahin
Georg Stettinger
Maximilian Bernhard
Matthias Schubert
Robert Wille
36
0
0
06 May 2025
Real-Time Person Image Synthesis Using a Flow Matching Model
Jiwoo Jeong
Kirok Kim
Wooju Kim
Nam-Joon Kim
3DH
51
0
0
06 May 2025
Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability
L. Wang
Senmao Li
Fei Yang
Jianye Wang
Ziheng Zhang
Y. Liu
Y. Wang
Jian Yang
DiffM
52
0
0
06 May 2025
Phenotype-Guided Generative Model for High-Fidelity Cardiac MRI Synthesis: Advancing Pretraining and Clinical Applications
Z. Li
Yujian Hu
Zhengyao Ding
Yiheng Mao
H. Li
Fan Yi
Hongkun Zhang
Zhengxing Huang
MedIm
25
0
0
06 May 2025
Is AI currently capable of identifying wild oysters? A comparison of human annotators against the AI model, ODYSSEE
Brendan Campbell
Alan Williams
Kleio Baxevani
Alyssa Campbell
Rushabh Dhoke
...
Arjun Suresh
Alhim Vera
Arthur Trembanis
Herbert G. Tanner
Edward Hale
36
0
0
06 May 2025
SynSHRP2: A Synthetic Multimodal Benchmark for Driving Safety-critical Events Derived from Real-world Driving Data
Liang Shi
Boyu Jiang
Zhenyuan Yuan
Miguel A. Perez
Feng Guo
16
0
0
06 May 2025
PiCo: Enhancing Text-Image Alignment with Improved Noise Selection and Precise Mask Control in Diffusion Models
Chang Xie
Chenyi Zhuang
Pan Gao
VLM
22
0
0
06 May 2025
From Pixels to Polygons: A Survey of Deep Learning Approaches for Medical Image-to-Mesh Reconstruction
Fengming Lin
Arezoo Zakeri
Yidan Xue
Michael MacRaild
Haoran Dou
Zherui Zhou
Ziwei Zou
Ali Sarrami-Foroushani
Jinming Duan
Alejandro F Frangi
3DV
MedIm
30
0
0
06 May 2025
Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models
Kapil Wanaskar
Gaytri Jena
Magdalini Eirinaki
EGVM
22
0
0
06 May 2025
DiffVQA: Video Quality Assessment Using Diffusion Feature Extractor
Wei-Ting Chen
Yu-Jiet Vong
Yi-Tsung Lee
Sy-Yen Kuo
Qiang Gao
Sizhuo Ma
Jian Wang
61
0
0
06 May 2025
Show or Tell? A Benchmark To Evaluate Visual and Textual Prompts in Semantic Segmentation
Gabriele Rosi
Fabio Cermelli
VLM
29
0
0
06 May 2025
Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Autospeculation
Hengyuan Hu
Aniket Das
Dorsa Sadigh
Nima Anari
DiffM
19
0
0
06 May 2025
Enhancing Glass Defect Detection with Diffusion Models: Addressing Imbalanced Datasets in Manufacturing Quality Control
Sajjad Rezvani Boroujeni
Hossein Abedi
Tom Bush
20
0
0
06 May 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
D. Jiang
Mengmeng Wang
Liuzhuozheng Li
Lei Zhang
Haoyu Wang
Wei Wei
Guang Dai
Yanning Zhang
Jingdong Wang
DiffM
35
0
0
05 May 2025
T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models
Yunfeng Ge
Jiawei Li
Yiji Zhao
Haomin Wen
Zhao Li
M. Qiu
H. Li
Ming Jin
Shirui Pan
DiffM
34
0
0
05 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
X. Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
57
0
0
05 May 2025
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
Ming Li
Xin Gu
Fan Chen
X. Xing
Longyin Wen
C. L. P. Chen
Sijie Zhu
DiffM
68
1
0
05 May 2025
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
Biao Gong
Cheng Zou
Dandan Zheng
Hu Yu
Jingdong Chen
...
Qingpei Guo
Rui Liu
Weilong Chai
Xinyu Xiao
Ziyuan Huang
MLLM
74
1
0
05 May 2025
MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation
Mingcheng Li
Xiaolu Hou
Ziyang Liu
Dingkang Yang
Ziyun Qian
Jiawei Chen
Jinjie Wei
Y. Jiang
Qingyao Xu
L. Zhang
DiffM
44
0
0
05 May 2025
Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models
Kuofeng Gao
Yufei Zhu
Yiming Li
Jiawang Bai
Yong-Liang Yang
Z. Li
Shu-Tao Xia
34
0
0
05 May 2025
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing
Zinan Guo
Pengze Zhang
Yanze Wu
Chong Mou
Songtao Zhao
Qian He
17
0
0
05 May 2025
Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset
Jakub Wąsala
Bartłomiej Wrzalski
Kornelia Noculak
Yuliia Tarasenko
Oliwer Krupa
Jan Kocoń
Grzegorz Chodak
29
0
0
04 May 2025
Regression is all you need for medical image translation
Sebastian Rassmann
David Kügler
Christian Ewert
Martin Reuter
DiffM
MedIm
56
0
0
04 May 2025
Quantizing Diffusion Models from a Sampling-Aware Perspective
Qian Zeng
Jie Song
Yuanyu Wan
Huiqiong Wang
Mingli Song
DiffM
MQ
64
1
0
04 May 2025
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation
Volodymyr Havrylov
Haiwen Huang
Dan Zhang
Andreas Geiger
34
0
0
04 May 2025
Language translation, and change of accent for speech-to-speech task using diffusion model
Abhishek Mishra
Ritesh Sur Chowdhury
Vartul Bahuguna
Isha Pandey
Ganesh Ramakrishnan
DiffM
32
0
0
04 May 2025
Robust AI-Generated Face Detection with Imbalanced Data
Yamini Sri Krubha
Aryana Hou
Braden Vester
Web Walker
X. Wang
Li Lin
Shu Hu
27
0
0
04 May 2025
Improving Physical Object State Representation in Text-to-Image Generative Systems
Tianle Chen
Chaitanya Chakka
Deepti Ghadiyaram
25
0
0
04 May 2025
MVHumanNet++: A Large-scale Dataset of Multi-view Daily Dressing Human Captures with Richer Annotations for 3D Human Digitization
Chenghong Li
Hongjie Liao
Yihao Zhi
Xihe Yang
Zhengwentai Sun
Jiahao Chang
Shuguang Cui
Xiaoguang Han
3DH
45
0
0
03 May 2025
DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion
Haoteng Li
Zhao Yang
Zezhong Qian
Gongpeng Zhao
Yuqi Huang
Jun-chen Yu
Huazheng Zhou
Longjun Liu
46
1
0
03 May 2025
RAGAR: Retrieval Augment Personalized Image Generation Guided by Recommendation
Run Ling
W. Wang
Yuting Liu
G. Guo
Linying Jiang
Xingwei Wang
DiffM
43
0
0
03 May 2025
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu
Sili Huang
Z. Yang
Shengchao Hu
Li Shen
H. Chen
Lichao Sun
Yi-Ju Chang
Dacheng Tao
OffRL
44
0
0
03 May 2025
FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis
Jiangtong Tan
Hu Yu
Jie Huang
Jie Xiao
Feng Zhao
57
1
0
02 May 2025
Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Haoyue Bai
Yiyou Sun
Wei Cheng
Haifeng Chen
AAML
46
0
0
02 May 2025
Fast Flow-based Visuomotor Policies via Conditional Optimal Transport Couplings
Andreas Sochopoulos
Nikolay Malkin
Nikolaos Tsagkas
João Moura
Michael Gienger
S. Vijayakumar
34
1
0
02 May 2025
GENMO: A GENeralist Model for Human MOtion
Jiefeng Li
Jinkun Cao
Haotian Zhang
Davis Rempe
Jan Kautz
Umar Iqbal
Ye Yuan
DiffM
VGen
42
1
0
02 May 2025
Any-to-Any Vision-Language Model for Multimodal X-ray Imaging and Radiological Report Generation
Daniele Molino
Francesco Di Feola
Linlin Shen
Paolo Soda
V. Guarrasi
MedIm
LM&MA
57
0
0
02 May 2025
Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim
Jaeah Lee
Jaesik Park
DiffM
KELM
53
0
0
02 May 2025
Provable Efficiency of Guidance in Diffusion Models for General Data Distribution
Gen Li
Yuchen Jiao
44
0
0
02 May 2025
VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models
Mohammadreza Teymoorianfard
Shiqing Ma
Amir Houmansadr
WIGM
53
0
0
02 May 2025
Safety-Critical Traffic Simulation with Guided Latent Diffusion Model
Mingxing Peng
Ruoyu Yao
Xusen Guo
Yuting Xie
Xianda Chen
Jun Ma
16
0
0
01 May 2025
Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction
Simon Giebenhain
Tobias Kirschstein
Martin Rünz
Lourdes Agapito
Matthias Nießner
CVBM
3DH
52
0
0
01 May 2025
Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution
Luigi Sigillo
Christian Bianchi
A. Uncini
Danilo Comminiello
46
0
0
01 May 2025
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution
Aditya Arora
Z. Tu
Y. Wang
Ruizheng Bai
Jian Wang
Sizhuo Ma
DiffM
58
0
0
01 May 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
D. Jiang
Ziyu Guo
Renrui Zhang
Zhuofan Zong
Hao Li
Le Zhuo
Shilin Yan
Pheng-Ann Heng
H. Li
LRM
57
0
0
01 May 2025
A Time-Series Data Augmentation Model through Diffusion and Transformer Integration
Yuren Zhang
Zhongnan Pu
Lei Jing
20
0
0
01 May 2025
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Antoni Bigata
Rodrigo Mira
Stella Bounareli
Michał Stypułkowski
Konstantinos Vougioukas
Stavros Petridis
Maja Pantic
49
0
0
01 May 2025
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models
Wufei Ma
Luoxin Ye
Nessa McWeeney
Celso M de Melo
A. Yuille
Jieneng Chen
LRM
57
1
0
01 May 2025