2405.12221
Images that Sound: Composing Images and Sounds on a Single Canvas
20 May 2024
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
Papers citing "Images that Sound: Composing Images and Sounds on a Single Canvas"
17 / 17 papers shown
Title
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
Pascal Chang
Sergio Sancho
Jingwei Tang
Markus Gross
Vinicius Azevedo
23
0
0
11 Apr 2025
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
Prin Phunyaphibarn
Phillip Y. Lee
Jaihoon Kim
Minhyuk Sung
DiffM
78
0
0
26 Mar 2025
Making Images from Images: Interleaving Denoising and Transformation
S. Baluja
David Marwood
Ashwin Baluja
DiffM
68
0
0
24 Nov 2024
VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos
Yan-Bo Lin
Yu Tian
L. Yang
Gedas Bertasius
Heng Wang
VGen
26
7
0
11 Sep 2024
Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
Core Francisco Park
Maya Okawa
Andrew Lee
Ekdeep Singh Lubana
Hidenori Tanaka
50
6
0
27 Jun 2024
Factorized Diffusion: Perceptual Illusions by Noise Decomposition
Daniel Geng
Inbum Park
Andrew Owens
DiffM
36
13
0
17 Apr 2024
T-VSL: Text-Guided Visual Sound Source Localization in Mixtures
Tanvir Mahmud
Yapeng Tian
Diana Marculescu
42
7
0
02 Apr 2024
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans
CJ Carr
Josiah Taylor
Scott H. Hawley
Jordi Pons
DiffM
74
98
0
07 Feb 2024
Lumiere: A Space-Time Diffusion Model for Video Generation
Omer Bar-Tal
Hila Chefer
Omer Tov
Charles Herrmann
Roni Paiss
...
T. Michaeli
Oliver Wang
Deqing Sun
Tali Dekel
Inbar Mosseri
VGen
101
214
0
23 Jan 2024
THInImg: Cross-modal Steganography for Presenting Talking Heads in Images
Lin Zhao
Hongxuan Li
Xuefei Ning
Xinru Jiang
25
1
0
28 Nov 2023
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Soujanya Poria
135
137
0
24 Apr 2023
Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal
Pedro Morgado
Unnat Jain
Abhinav Gupta
55
22
0
27 Sep 2022
Denoising Diffusion Restoration Models
Bahjat Kawar
Michael Elad
Stefano Ermon
Jiaming Song
DiffM
204
770
0
27 Jan 2022
Palette: Image-to-Image Diffusion Models
Chitwan Saharia
William Chan
Huiwen Chang
Chris A. Lee
Jonathan Ho
Tim Salimans
David J. Fleet
Mohammad Norouzi
DiffM
VLM
322
1,570
0
10 Nov 2021
Diffusion Probabilistic Models for 3D Point Cloud Generation
Shitong Luo
Wei Hu
3DPC
164
711
0
02 Mar 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
185
196
0
08 Jan 2021
VisualEchoes: Spatial Image Representation Learning through Echolocation
Ruohan Gao
Changan Chen
Ziad Al-Halah
Carl Schissler
Kristen Grauman
MDE
SSL
156
83
0
04 May 2020