ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.08748
  4. Cited By
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with
  Fine-Grained Chinese Understanding

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

14 May 2024
Zhimin Li
Jianwei Zhang
Qin Lin
Jiangfeng Xiong
Yanxin Long
Xinchi Deng
Yingfang Zhang
Xingchao Liu
Minbin Huang
Zedong Xiao
Dayou Chen
Jiajun He
Jiahao Li
Wenyue Li
Chen Zhang
Rongwei Quan
Jianxiang Lu
Jiabin Huang
Xiaoyan Yuan
Xiao-Ting Zheng
Yixuan Li
Jihong Zhang
Chao Zhang
Mengxi Chen
Jie Liu
Zheng Fang
Weiyan Wang
J. Xue
Yang-Dan Tao
Jianchen Zhu
Kai Liu
Si-Da Lin
Yifu Sun
Yun Li
Dongdong Wang
Ming-Dao Chen
Zhichao Hu
Xiao Xiao
Yan Chen
Yuhong Liu
Wei Liu
Dingyong Wang
Yong Yang
Jie Jiang
Qinglin Lu
    ViT
ArXivPDFHTML

Papers citing "Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding"

24 / 24 papers shown
Title
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition
Zhiyuan Chen
Keyi Li
Yifan Jia
Le Ye
Yufei Ma
DiffM
25
0
0
09 May 2025
WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution
WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution
Pietro Bongini
S. Mandelli
Andrea Montibeller
Mirko Casu
Orazio Pontorno
...
Paolo Bestagini
Irene Amerini
F. D. De Natale
S. Battiato
Mauro Barni
VLM
76
0
0
28 Apr 2025
RepText: Rendering Visual Text via Replicating
RepText: Rendering Visual Text via Replicating
H. Wang
Y. Xu
Y. Li
J. Li
Chaowei Zhang
J. Wang
Kejia Yang
Z. Chen
VLM
66
0
0
28 Apr 2025
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo
Liangbing Zhao
Sayak Paul
Yue Liao
Renrui Zhang
Yi Xin
Peng Gao
Mohamed Elhoseiny
H. Li
VLM
63
0
0
22 Apr 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Size Wu
W. Zhang
Lumin Xu
Sheng Jin
Zhonghua Wu
Qingyi Tao
Wentao Liu
Wei Li
Chen Change Loy
VGen
59
2
0
27 Mar 2025
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
Jun Zhou
J. Li
Zunnan Xu
Hanhui Li
Yiji Cheng
Fa-Ting Hong
Qin Lin
Qinglin Lu
Xiaodan Liang
DiffM
62
1
0
25 Mar 2025
Unleashing Vecset Diffusion Model for Fast Shape Generation
Unleashing Vecset Diffusion Model for Fast Shape Generation
Zeqiang Lai
Yunfei Zhao
Zibo Zhao
Haolin Liu
Fuyun Wang
...
Jinwei Huang
Yuhong Liu
Jie Jiang
Chunchao Guo
Xiangyu Yue
DiffM
76
0
0
20 Mar 2025
Personalize Anything for Free with Diffusion Transformer
Personalize Anything for Free with Diffusion Transformer
Haoran Feng
Zehuan Huang
Lin Li
Hairong Lv
Lu Sheng
DiffM
72
1
0
16 Mar 2025
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
Zhen Yang
Guibao Shen
Liang Hou
Mushui Liu
Luozhou Wang
Xin Tao
Pengfei Wan
Di Zhang
Ying-cong Chen
DiffM
74
0
0
04 Mar 2025
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
Guoqing Ma
Haoyang Huang
K. Yan
L. Chen
Nan Duan
...
Y. Wang
Yuanwei Lu
Yu-Cheng Chen
Yu-Juan Luo
Y. Luo
DiffM
VGen
145
14
0
14 Feb 2025
Matrix3D: Large Photogrammetry Model All-in-One
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
103
4
0
11 Feb 2025
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Zibo Zhao
Zeqiang Lai
Qingxiang Lin
Yunfei Zhao
Haolin Liu
...
Jingwei Huang
Chunchao Guo
Jie Jiang
Jingwei Huang
Chunchao Guo
101
19
0
21 Jan 2025
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling
Chaojie Mao
J. Zhang
Yulin Pan
Zeyinzi Jiang
Zhen Han
Yu Liu
Jingren Zhou
DiffM
34
15
0
05 Jan 2025
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking
  Face Generation, Customization, and Restoration
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
Lu Liu
Huiyu Duan
Qiang Hu
Liu Yang
Chunlei Cai
Tianxiao Ye
Huayu Liu
Xiaoyun Zhang
Guangtao Zhai
EGVM
92
1
0
17 Dec 2024
AI-generated Image Detection: Passive or Watermark?
AI-generated Image Detection: Passive or Watermark?
Moyang Guo
Yuepeng Hu
Zhengyuan Jiang
Zeyu Li
Amir Sadovnik
Arka Daw
Neil Zhenqiang Gong
67
1
0
20 Nov 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion
  Transformers
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
...
Zhekai Zhang
Muyang Li
Ligeng Zhu
Y. Lu
Song Han
VLM
26
48
0
14 Oct 2024
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Dongyang Liu
Shitian Zhao
Le Zhuo
Weifeng Lin
Yu Qiao
Xinyue Li
Qi Qin
Yu Qiao
Hongsheng Li
Peng Gao
MLLM
60
48
0
05 Aug 2024
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
Zheng Chong
Xiao Dong
Haoxiang Li
Shiyue Zhang
Wenqing Zhang
Xujie Zhang
Hanqing Zhao
D. Jiang
Xiaodan Liang
DiffM
48
17
0
21 Jul 2024
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation
DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation
Minbin Huang
Yanxin Long
Xinchi Deng
Ruihang Chu
Jiangfeng Xiong
Xiaodan Liang
Hong Cheng
Qinglin Lu
Wei Liu
MLLM
EGVM
59
8
0
13 Mar 2024
One-step Diffusion with Distribution Matching Distillation
One-step Diffusion with Distribution Matching Distillation
Tianwei Yin
Michael Gharbi
Richard Zhang
Eli Shechtman
Frédo Durand
William T. Freeman
Taesung Park
DiffM
124
215
0
30 Nov 2023
Adversarial Diffusion Distillation
Adversarial Diffusion Distillation
Axel Sauer
Dominik Lorenz
A. Blattmann
Robin Rombach
138
326
0
28 Nov 2023
UFOGen: You Forward Once Large Scale Text-to-Image Generation via
  Diffusion GANs
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Yanwu Xu
Yang Zhao
Zhisheng Xiao
Tingbo Hou
129
105
0
14 Nov 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
515
0
02 Jan 2023
1