ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.06125
  4. Cited By
Hierarchical Text-Conditional Image Generation with CLIP Latents

Hierarchical Text-Conditional Image Generation with CLIP Latents

13 April 2022
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
    VLM
    DiffM
ArXivPDFHTML

Papers citing "Hierarchical Text-Conditional Image Generation with CLIP Latents"

50 / 4,735 papers shown
Title
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
Xiang Hu
Pingping Zhang
Yuhao Wang
Bin Yan
Huchuan Lu
21
0
0
13 Apr 2025
Towards Explainable Partial-AIGC Image Quality Assessment
Towards Explainable Partial-AIGC Image Quality Assessment
Jiaying Qian
Ziheng Jia
Zicheng Zhang
Zeyu Zhang
Guangtao Zhai
Xiongkuo Min
35
0
0
12 Apr 2025
UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance
UniFlowRestore: A General Video Restoration Framework via Flow Matching and Prompt Guidance
S.
Yu Zhang
Chen Wu
Dianjie Lu
Dianjie Lu
Guijuan Zhan
Yang Weng
Zhuoran Zheng
DiffM
VGen
26
0
0
12 Apr 2025
Generating Fine Details of Entity Interactions
Generating Fine Details of Entity Interactions
Xinyi Gu
Jiayuan Mao
32
0
0
11 Apr 2025
AGENT: An Aerial Vehicle Generation and Design Tool Using Large Language Models
AGENT: An Aerial Vehicle Generation and Design Tool Using Large Language Models
Colin Samplawski
Adam Cobb
Susmit Jha
LLMAG
AI4CE
50
0
0
11 Apr 2025
Diffusion Models for Robotic Manipulation: A Survey
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
49
1
0
11 Apr 2025
PixelFlow: Pixel-Space Generative Models with Flow
PixelFlow: Pixel-Space Generative Models with Flow
Shoufa Chen
Chongjian Ge
Shilong Zhang
Peize Sun
Ping Luo
VLM
DRL
33
0
0
10 Apr 2025
POEM: Precise Object-level Editing via MLLM control
POEM: Precise Object-level Editing via MLLM control
Marco Schouten
Mehmet Onurcan Kaya
Serge Belongie
Dim P. Papadopoulos
DiffM
68
0
0
10 Apr 2025
Teaching Humans Subtle Differences with DIFFusion
Teaching Humans Subtle Differences with DIFFusion
Mia Chiquier
Orr Avrech
Yossi Gandelsman
Berthy T. Feng
Katherine L. Bouman
Carl Vondrick
DiffM
43
0
0
10 Apr 2025
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment
Jiayang Sun
H. Wang
Jie Cao
Huaibo Huang
R. He
DiffM
68
0
0
10 Apr 2025
A Unified Agentic Framework for Evaluating Conditional Image Generation
A Unified Agentic Framework for Evaluating Conditional Image Generation
Jifang Wang
Xue Yang
Longyue Wang
Zhenran Xu
Y. Wang
Yaowei Wang
Weihua Luo
Kaifu Zhang
Baotian Hu
Min Zhang
EGVM
DiffM
72
0
0
09 Apr 2025
A Meaningful Perturbation Metric for Evaluating Explainability Methods
A Meaningful Perturbation Metric for Evaluating Explainability Methods
Danielle Cohen
Hila Chefer
Lior Wolf
AAML
20
0
0
09 Apr 2025
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Compass Control: Multi Object Orientation Control for Text-to-Image Generation
Rishubh Parihar
Vaibhav Agrawal
Sachidanand VS
R. V. Babu
DiffM
28
0
0
09 Apr 2025
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
Nian Wu
Nivetha Jayakumar
Jiarui Xing
Miaomiao Zhang
23
0
0
09 Apr 2025
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
Qi Mao
L. Chen
Yuchao Gu
Mike Zheng Shou
Ming-Hsuan Yang
DiffM
36
0
0
08 Apr 2025
CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model
CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model
Jinming Lu
Minghao She
Wendong Mao
Zhongfeng Wang
MQ
31
0
0
08 Apr 2025
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data
Samarth Mishra
Kate Saenko
Venkatesh Saligrama
CoGe
LRM
34
0
0
07 Apr 2025
Reinforced Multi-teacher Knowledge Distillation for Efficient General Image Forgery Detection and Localization
Reinforced Multi-teacher Knowledge Distillation for Efficient General Image Forgery Detection and Localization
Zeqin Yu
Jiangqun Ni
Jian Zhang
Haoyi Deng
Yuzhen Lin
21
0
0
07 Apr 2025
Dimension-Free Convergence of Diffusion Models for Approximate Gaussian Mixtures
Dimension-Free Convergence of Diffusion Models for Approximate Gaussian Mixtures
Gen Li
Changxiao Cai
Yuting Wei
DiffM
20
1
0
07 Apr 2025
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
Justus Westerhoff
Erblina Purellku
Jakob Hackstein
Jonas Loos
Leo Pinetzki
Lorenz Hufe
AAML
28
0
0
07 Apr 2025
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
Cheng Chen
Jiacheng Wei
Tianrun Chen
Chi Zhang
Xiaofeng Yang
...
Bingchen Yang
Chuan-Sheng Foo
Guosheng Lin
Qixing Huang
Fayao Liu
44
0
0
07 Apr 2025
PartStickers: Generating Parts of Objects for Rapid Prototyping
PartStickers: Generating Parts of Objects for Rapid Prototyping
Mo Zhou
Josh Myers-Dean
Danna Gurari
21
0
0
07 Apr 2025
DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation
DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation
Sohyun Lee
N. Kim
Juwon Kang
Seong Joon Oh
Suha Kwak
83
0
0
07 Apr 2025
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
Yang Jiao
Haibo Qiu
Zequn Jie
S. Chen
Jingjing Chen
Lin Ma
Yu Jiang
26
2
0
06 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Jiahao Zhang
ALM
VGen
59
2
0
05 Apr 2025
Multi-identity Human Image Animation with Structural Video Diffusion
Multi-identity Human Image Animation with Structural Video Diffusion
Zhenzhi Wang
Y. Li
Yanhong Zeng
Yuwei Guo
D. Lin
Tianfan Xue
Bo Dai
VGen
22
0
0
05 Apr 2025
Structured Knowledge Accumulation: The Principle of Entropic Least Action in Forward-Only Neural Learning
Structured Knowledge Accumulation: The Principle of Entropic Least Action in Forward-Only Neural Learning
Bouarfa Mahi Quantiota
25
0
0
04 Apr 2025
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation
Jiwoo Chung
Sangeek Hyun
Hyunjun Kim
Eunseo Koh
MinKyu Lee
Jae-Pil Heo
33
0
0
03 Apr 2025
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan
Junyan Ye
Weijia Li
Zilong Huang
Shenghai Yuan
Xiangyang He
Kaiqing Lin
Jun-Jian He
Conghui He
Li Yuan
MLLM
EGVM
88
8
0
03 Apr 2025
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model
Shengjun Zhang
Jinzhao Li
Xin Fei
Hao Liu
Yueqi Duan
DiffM
3DGS
VGen
66
0
0
03 Apr 2025
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
Xianwei Zhuang
Yuxin Xie
Yufan Deng
Dongchao Yang
Liming Liang
Jinghan Ru
Yuguo Yin
Yuexian Zou
61
1
0
03 Apr 2025
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
Yash Kulthe
Andrew Gilbert
John Collomosse
33
0
0
03 Apr 2025
MD-ProjTex: Texturing 3D Shapes with Multi-Diffusion Projection
MD-ProjTex: Texturing 3D Shapes with Multi-Diffusion Projection
Ahmet Burak Yildirim
Mustafa Utku Aydogdu
Duygu Ceylan
Aysegül Dündar
DiffM
42
1
0
03 Apr 2025
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang
Xiangye Jin
Jiaxu Miao
Yu Wu
29
0
0
02 Apr 2025
FreSca: Unveiling the Scaling Space in Diffusion Models
FreSca: Unveiling the Scaling Space in Diffusion Models
Chao Huang
Susan Liang
Yunlong Tang
Li Ma
Yapeng Tian
Chenliang Xu
DiffM
48
0
0
02 Apr 2025
FlowMotion: Target-Predictive Conditional Flow Matching for Jitter-Reduced Text-Driven Human Motion Generation
FlowMotion: Target-Predictive Conditional Flow Matching for Jitter-Reduced Text-Driven Human Motion Generation
Manolo Canales Cuba
Vinícius do Carmo Melício
João Paulo Gois
3DH
48
0
0
02 Apr 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
Shaojin Wu
Mengqi Huang
Wenxu Wu
Yufeng Cheng
Fei Ding
Qian He
DiffM
50
4
0
02 Apr 2025
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
Dohyun Kim
S. Park
Geonhee Han
Seung Wook Kim
Paul Hongsuck Seo
DiffM
45
0
0
02 Apr 2025
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation
Aleksander Plocharski
Jan Swidzinski
Przemyslaw Musialski
DiffM
30
0
0
02 Apr 2025
Multi-party Collaborative Attention Control for Image Customization
Multi-party Collaborative Attention Control for Image Customization
Han Yang
Chuanguang Yang
Qiuli Wang
Zhulin An
Weilun Feng
Libo Huang
Y. Xu
DiffM
20
0
0
02 Apr 2025
Spingarn's Method and Progressive Decoupling Beyond Elicitable Monotonicity
Spingarn's Method and Progressive Decoupling Beyond Elicitable Monotonicity
B. Evens
P. Latafat
Panagiotis Patrinos
46
0
0
01 Apr 2025
Beyond Static Scenes: Camera-controllable Background Generation for Human Motion
Beyond Static Scenes: Camera-controllable Background Generation for Human Motion
Mingshuai Yao
Mengting Chen
Qinye Zhou
Y. Zhang
Ming-Yu Liu
...
Chen Ju
Shuai Xiao
Qingwen Liu
Jinsong Lan
Wangmeng Zuo
DiffM
VGen
28
1
0
01 Apr 2025
IntrinsiX: High-Quality PBR Generation using Image Priors
IntrinsiX: High-Quality PBR Generation using Image Priors
Peter Kocsis
Lukas Höllein
Matthias Nießner
33
0
0
01 Apr 2025
Prompting Forgetting: Unlearning in GANs via Textual Guidance
Prompting Forgetting: Unlearning in GANs via Textual Guidance
Piyush Nagasubramaniam
Neeraj Karamchandani
Chen Wu
Sencun Zhu
DiffM
AILaw
MU
51
0
0
01 Apr 2025
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Yufei Wang
Lanqing Guo
Z. Li
Jiaxing Huang
Pichao Wang
Bihan Wen
J. Wang
DiffM
50
1
0
31 Mar 2025
Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes
Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes
Daichi Otsuka
Shinichi Mae
Ryosuke Yamada
Hirokatsu Kataoka
3DPC
35
0
0
31 Mar 2025
MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach
MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach
Xin Zhang
Siting Huang
Xiangyang Luo
Yifan Xie
Weijiang Yu
Heng Chang
Fei Ma
Fei Richard Yu
DiffM
31
0
0
31 Mar 2025
Can Diffusion Models Disentangle? A Theoretical Perspective
Can Diffusion Models Disentangle? A Theoretical Perspective
Liming Wang
Muhammad Jehanzeb Mirza
Yishu Gong
Yuan Gong
Jiaqi Zhang
Brian Tracey
Katerina Placek
Marco Vilela
James Glass
DiffM
CoGe
82
0
0
31 Mar 2025
Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space
Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space
Yi Liu
Wengen Li
Jihong Guan
S. Kevin Zhou
Yichao Zhang
DiffM
44
1
0
31 Mar 2025
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion
Rana Muhammad Shahroz Khan
Dongwen Tang
Pingzhi Li
Kai Wang
Tianlong Chen
AI4CE
46
0
0
31 Mar 2025
Previous
123456...939495
Next