ResearchTrend.AI
arXiv:1512.08512
Visually Indicated Sounds

28 December 2015
Andrew Owens
Phillip Isola
Josh H. McDermott
Antonio Torralba
Edward H. Adelson
William T. Freeman

Papers citing "Visually Indicated Sounds"

50 / 206 papers shown
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
Kim Sung-Bin
Arda Senocak
H. Ha
Andrew Owens
Tae-Hyun Oh
DiffM
VGen
46
37
0
30 Mar 2023
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos
Kun Su
Kaizhi Qian
Eli Shlizerman
Antonio Torralba
Chuang Gan
VGen
AI4CE
35
20
0
29 Mar 2023
Sounding Video Generator: A Unified Framework for Text-guided Sounding Video Generation
Jiawei Liu
Weining Wang
Sihan Chen
Xinxin Zhu
Qingbin Liu
DiffM
VGen
28
13
0
29 Mar 2023
Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
Ziyang Chen
Shengyi Qian
Andrew Owens
33
12
0
20 Mar 2023
The Audio-Visual BatVision Dataset for Research on Sight and Sound
Amandine Brunetto
Sascha Hornauer
Stella X. Yu
Fabien Moutarde
46
3
0
13 Mar 2023
Side Auth: Synthesizing Virtual Sensors for Authentication
Yan Long
Kevin Fu
AAML
20
0
0
27 Jan 2023
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Sagnik Majumder
Hao Jiang
Pierre Moulon
E. Henderson
P. Calamia
Kristen Grauman
V. Ithapu
EgoV
40
7
0
04 Jan 2023
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
34
27
0
07 Dec 2022
Muscles in Action
Mia Chiquier
Carl Vondrick
11
1
0
05 Dec 2022
Mix and Localize: Localizing Sound Sources in Mixtures
Xixi Hu
Ziyang Chen
Andrew Owens
35
51
0
28 Nov 2022
Touch and Go: Learning from Human-Collected Vision and Touch
Fengyu Yang
Chenyang Ma
Jiacheng Zhang
Jing Zhu
Wenzhen Yuan
Andrew Owens
25
53
0
22 Nov 2022
LISA: Localized Image Stylization with Audio via Implicit Neural Representation
Seung Hyun Lee
Chanyoung Kim
Wonmin Byeon
Sang Ho Yoon
Jinkyu Kim
Sangpil Kim
35
3
0
21 Nov 2022
VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement
Chenye Cui
Yi Ren
Jinglin Liu
Rongjie Huang
Zhou Zhao
VGen
40
14
0
19 Nov 2022
Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation
Moitreya Chatterjee
Narendra Ahuja
A. Cherian
38
12
0
29 Oct 2022
Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation
Zeyun Zhong
David Schneider
Michael Voit
Rainer Stiefelhagen
Jürgen Beyerer
74
44
0
23 Oct 2022
Visual onoma-to-wave: environmental sound synthesis from visual onomatopoeias and sound-source images
Hien Ohnaka
Shinnosuke Takamichi
Keisuke Imoto
Yuki Okamoto
Kazuki Fujii
Hiroshi Saruwatari
DiffM
26
8
0
17 Oct 2022
That Sounds Right: Auditory Self-Supervision for Dynamic Robot Manipulation
Abitha Thankaraj
Lerrel Pinto
35
14
0
03 Oct 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
A Proposal for Foley Sound Synthesis Challenge
Keunwoo Choi
Sangshin Oh
Minsung Kang
Brian McFee
26
11
0
21 Jul 2022
Finding Fallen Objects Via Asynchronous Audio-Visual Integration
Chuang Gan
Yi Gu
Siyuan Zhou
Jeremy Schwartz
S. Alter
James Traer
Dan Gutfreund
J. Tenenbaum
Josh H. McDermott
Antonio Torralba
57
19
0
07 Jul 2022
It's Time for Artistic Correspondence in Music and Video
Dídac Surís
Carl Vondrick
Bryan C. Russell
Justin Salamon
23
37
0
14 Jun 2022
Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning
Maximilian Du
Olivia Y. Lee
Suraj Nair
Chelsea Finn
OffRL
62
33
0
30 May 2022
MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes
Anton Ratnarajah
Zhenyu Tang
R. Aralikatti
Tianyi Zhou
AI4CE
32
36
0
18 May 2022
Weakly-Supervised Action Detection Guided by Audio Narration
Keren Ye
Adriana Kovashka
38
0
0
12 May 2022
Learning Visual Styles from Audio-Visual Associations
Tingle Li
Yichen Liu
Andrew Owens
Hang Zhao
DiffM
23
21
0
10 May 2022
GWA: A Large High-Quality Acoustic Dataset for Audio Processing
Zhenyu Tang
R. Aralikatti
Anton Ratnarajah
Tianyi Zhou
40
31
0
04 Apr 2022
Audio-Visual Fusion Layers for Event Type Aware Video Recognition
Arda Senocak
Junsik Kim
Tae-Hyun Oh
H. Ryu
Dingzeyu Li
In So Kweon
27
1
0
12 Feb 2022
Sound and Visual Representation Learning with Multiple Pretraining Tasks
A. Vasudevan
Dengxin Dai
Luc Van Gool
SSL
38
6
0
04 Jan 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
36
48
0
27 Dec 2021
Soundify: Matching Sound Effects to Video
David Chuan-En Lin
Anastasis Germanidis
Cristobal Valenzuela
Yining Shi
Nikolas Martelaro
30
16
0
17 Dec 2021
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
Rishabh Garg
Ruohan Gao
Kristen Grauman
15
28
0
21 Nov 2021
Structure from Silence: Learning Scene Structure from Ambient Sound
Ziyang Chen
Xixi Hu
Andrew Owens
31
26
0
10 Nov 2021
A Comparison of Deep Learning Models for the Prediction of Hand Hygiene Videos
Rashmi Bakshi
3DH
45
1
0
03 Nov 2021
Taming Visually Guided Sound Generation
Vladimir E. Iashin
Esa Rahtu
VLM
32
122
0
17 Oct 2021
Visual Scene Graphs for Audio Source Separation
Moitreya Chatterjee
Jonathan Le Roux
Narendra Ahuja
A. Cherian
26
36
0
24 Sep 2021
Binaural Audio Generation via Multi-task Learning
Sijia Li
Shiguang Liu
Tianyi Zhou
21
12
0
02 Sep 2021
Multi-Modulation Network for Audio-Visual Event Localization
Hao Wang
Zhengjun Zha
Liang Li
Xuejin Chen
Jiebo Luo
30
2
0
26 Aug 2021
Sharing Cognition: Human Gesture and Natural Language Grounding Based Planning and Navigation for Indoor Robots
Gourav Kumar
Soumyadip Maity
R. Roychoudhury
Brojeshwar Bhowmick
LM&Ro
14
1
0
14 Aug 2021
Hand Pose Classification Based on Neural Networks
Rashmi Bakshi
3DH
16
2
0
10 Aug 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos
Sanchita Ghose
John J. Prevost
GAN
27
26
0
20 Jul 2021
Visual-Tactile Cross-Modal Data Generation using Residue-Fusion GAN with Feature-Matching and Perceptual Losses
Shaoyu Cai
Kening Zhu
Yuki Ban
Takuji Narumi
28
37
0
12 Jul 2021
The Boombox: Visual Reconstruction from Acoustic Vibrations
Boyuan Chen
Mia Chiquier
Hod Lipson
Carl Vondrick
41
10
0
17 May 2021
Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation
Yan-Bo Lin
Y. Wang
56
21
0
03 May 2021
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning
Christoph Feichtenhofer
Haoqi Fan
Bo Xiong
Ross B. Girshick
Kaiming He
SSL
AI4TS
39
257
0
29 Apr 2021
End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks
Rodrigo Mira
Konstantinos Vougioukas
Pingchuan Ma
Stavros Petridis
Björn W. Schuller
Maja Pantic
35
43
0
27 Apr 2021
Identifying Actions for Sound Event Classification
Benjamin Elizalde
Radu Revutchi
Samarjit Das
Bhiksha Raj
Ian Lane
Laurie M. Heller
19
5
0
26 Apr 2021
Self-supervised object detection from audio-visual correspondence
Triantafyllos Afouras
Yuki M. Asano
Francois Fagan
Andrea Vedaldi
Florian Metze
SSL
31
46
0
13 Apr 2021
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu
Hang Zhou
Ziwei Liu
Bo Dai
Xiaogang Wang
Dahua Lin
DiffM
13
55
0
13 Apr 2021
Collaborative Learning to Generate Audio-Video Jointly
V. Kurmi
Vipul Bajaj
Badri N. Patro
K. Venkatesh
Vinay P. Namboodiri
Preethi Jyothi
VGen
22
11
0
01 Apr 2021
Audio Description from Image by Modal Translation Network
Hailong Ning
Xiangtao Zheng
Yuan Yuan
Xiaoqiang Lu
DiffM
8
17
0
18 Mar 2021