1912.11684
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation
IEEE International Conference on Robotics and Automation (ICRA), 2019
25 December 2019
Chuang Gan
Yiwei Zhang
Jiajun Wu
Boqing Gong
J. Tenenbaum
Papers citing
"Look, Listen, and Act: Towards Audio-Visual Embodied Navigation"
50 / 79 papers shown
Embodied Navigation with Auxiliary Task of Action Description Prediction
Haru Kondoh
Asako Kanezaki
182
2
0
21 Oct 2025
Audio-Guided Visual Perception for Audio-Visual Navigation
Yi Wang
Yinfeng Yu
Fuchun Sun
Liejun Wang
Wendong Zheng
146
0
0
13 Oct 2025
Iterative Residual Cross-Attention Mechanism: An Integrated Approach for Audio-Visual Navigation Tasks
Hailong Zhang
Yinfeng Yu
Liejun Wang
Fuchun Sun
Wendong Zheng
100
0
0
30 Sep 2025
Dynamic Multi-Target Fusion for Efficient Audio-Visual Navigation
Yinfeng Yu
Hailong Zhang
Meiling Zhu
102
4
0
23 Sep 2025
Advancing Audio-Visual Navigation Through Multi-Agent Collaboration in 3D Environments
Hailong Zhang
Yinfeng Yu
Liejun Wang
Fuchun Sun
Wendong Zheng
127
0
0
21 Sep 2025
The Sound of Simulation: Learning Multimodal Sim-to-Real Robot Policies with Generative Audio
Renhao Wang
Haoran Geng
Tingle Li
Feishi Wang
Gopala Anumanchipalli
Trevor Darrell
Boyi Li
Pieter Abbeel
Jitendra Malik
Alexei A. Efros
VGen
279
3
0
03 Jul 2025
Differentiable Room Acoustic Rendering with Multi-View Vision Priors
Derong Jin
Ruohan Gao
368
3
0
30 Apr 2025
Multimodal Perception for Goal-oriented Navigation: A Survey
I-Tak Ieong
Hao Tang
LM&Ro
LRM
417
1
0
22 Apr 2025
Hearing Anywhere in Any Environment
Computer Vision and Pattern Recognition (CVPR), 2025
Xiulong Liu
Anurag Kumar
P. Calamia
Sebastia V. Amengual
Calvin Murdock
Ishwarya Ananthabhotla
Philip Robinson
Eli Shlizerman
V. Ithapu
Ruohan Gao
371
9
0
14 Apr 2025
AI-Gadget Kit: Integrating Swarm User Interfaces with LLM-driven Agents for Rich Tabletop Game Applications
Yijie Guo
Zhenhan Huang
Ruhan Wang
Zhihao Yao
Tianyu Yu
Zhiling Xu
Xinyu Zhao
Xueqing Li
Haipeng Mi
187
4
0
24 Jul 2024
NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
Amandine Brunetto
Sascha Hornauer
Fabien Moutarde
623
11
0
28 May 2024
Audio-Visual Segmentation via Unlabeled Frame Exploitation
Jinxiang Liu
Yikun Liu
Fei Zhang
Chen Ju
Ya Zhang
Yanfeng Wang
384
29
0
17 Mar 2024
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Neural Information Processing Systems (NeurIPS), 2023
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
186
8
0
01 Nov 2023
Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation
Neural Information Processing Systems (NeurIPS), 2023
Hongchen Wang
Andy Guan Hong Chen
Xiaoqi Li
Mingdong Wu
Hao Dong
546
28
0
15 Sep 2023
Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation
IEEE International Conference on Computer Vision (ICCV), 2023
Jinyu Chen
Wenguan Wang
Siying Liu
Jiaming Song
Yi Yang
330
18
0
20 Aug 2023
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
IEEE International Conference on Robotics and Automation (ICRA), 2023
Ruohan Gao
Hao Li
Gokul Dharan
Zhuzhu Wang
Chengshu Li
Fei Xia
Silvio Savarese
Li Fei-Fei
Jiajun Wu
367
15
0
01 Jun 2023
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Kun Su
Kaizhi Qian
Eli Shlizerman
Antonio Torralba
Chuang Gan
VGen
AI4CE
361
29
0
29 Mar 2023
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Hao Jiang
Pierre Moulon
E. Henderson
P. Calamia
Kristen Grauman
V. Ithapu
EgoV
373
12
0
04 Jan 2023
On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective
Ying Wen
Bo Liu
M. Zhou
Shufang Hou
Zhe Cao
Chenyang Le
Jingxiao Chen
Zheng Tian
Weinan Zhang
Jun Wang
AI4CE
269
13
0
24 Dec 2022
Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation
Gyan Tatiya
Jonathan M Francis
Luca Bondi
Ingrid Navarro
Eric Nyberg
Jivko Sinapov
Jean Oh
183
10
0
21 Dec 2022
A General Purpose Supervisory Signal for Embodied Agents
Kunal Pratap Singh
Jordi Salvador
Luca Weihs
Aniruddha Kembhavi
SSL
287
4
0
01 Dec 2022
Ask4Help: Learning to Leverage an Expert for Embodied Tasks
Neural Information Processing Systems (NeurIPS), 2022
Kunal Pratap Singh
Luca Weihs
Alvaro Herrasti
Jonghyun Choi
Aniruddha Kembhavi
Roozbeh Mottaghi
241
29
0
18 Nov 2022
HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes
Neural Information Processing Systems (NeurIPS), 2022
Zan Wang
Yixin Chen
Tengyu Liu
Yixin Zhu
Wei Liang
Siyuan Huang
265
178
0
18 Oct 2022
AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments
Neural Information Processing Systems (NeurIPS), 2022
Sudipta Paul
Amit K. Roy-Chowdhury
A. Cherian
248
36
0
14 Oct 2022
Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
Neural Information Processing Systems (NeurIPS), 2022
Peihao Chen
Dongyu Ji
Kun-Li Channing Lin
Runhao Zeng
Thomas H. Li
Zhuliang Yu
Chuang Gan
SSL
267
110
0
14 Oct 2022
Learning Active Camera for Multi-Object Navigation
Neural Information Processing Systems (NeurIPS), 2022
Peihao Chen
Dongyu Ji
Kun-Li Channing Lin
Weiwen Hu
Wenbing Huang
Thomas H. Li
Ming Tan
Chuang Gan
304
35
0
14 Oct 2022
Retrospectives on the Embodied AI Workshop
Matt Deitke
Dhruv Batra
Yonatan Bisk
Tommaso Campari
Angel X. Chang
...
Jesse Thomason
Alexander Toshev
Joanne Truong
Luca Weihs
Jiajun Wu
LM&Ro
413
53
0
13 Oct 2022
AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Tanvir Mahmud
Diana Marculescu
CLIP
265
45
0
11 Oct 2022
Pay Self-Attention to Audio-Visual Navigation
British Machine Vision Conference (BMVC), 2022
Yinfeng Yu
Lele Cao
Gang Hua
Xiaohong Liu
Liejun Wang
355
17
0
04 Oct 2022
Anticipating the Unseen Discrepancy for Vision and Language Navigation
Yujie Lu
Huiliang Zhang
Ping Nie
Weixi Feng
Wenda Xu
Xinze Wang
William Yang Wang
272
3
0
10 Sep 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
331
76
0
20 Aug 2022
Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Xufeng Zhao
C. Weber
Muhammad Burhan Hafez
S. Wermter
229
10
0
04 Aug 2022
Finding Fallen Objects Via Asynchronous Audio-Visual Integration
Computer Vision and Pattern Recognition (CVPR), 2022
Chuang Gan
Yi Gu
Siyuan Zhou
Jeremy Schwartz
S. Alter
James Traer
Dan Gutfreund
J. Tenenbaum
Josh H. McDermott
Antonio Torralba
352
20
0
07 Jul 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Neural Information Processing Systems (NeurIPS), 2022
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
394
123
0
16 Jun 2022
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Matt Deitke
Eli VanderBilt
Alvaro Herrasti
Luca Weihs
Jordi Salvador
...
Winson Han
Eric Kolve
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
399
428
0
14 Jun 2022
Few-Shot Audio-Visual Learning of Environment Acoustics
Neural Information Processing Systems (NeurIPS), 2022
Sagnik Majumder
Changan Chen
Ziad Al-Halah
Kristen Grauman
311
73
0
08 Jun 2022
Towards Generalisable Audio Representations for Audio-Visual Navigation
Shunqi Mao
Chaoyi Zhang
Heng Wang
Weidong (Tom) Cai
197
1
0
01 Jun 2022
Learning Neural Acoustic Fields
Neural Information Processing Systems (NeurIPS), 2022
Andrew F. Luo
Yilun Du
Michael J. Tarr
J. Tenenbaum
Antonio Torralba
Chuang Gan
AI4CE
428
124
0
04 Apr 2022
Sound Adversarial Audio-Visual Navigation
International Conference on Learning Representations (ICLR), 2022
Yinfeng Yu
Wenbing Huang
Gang Hua
Changan Chen
Yikai Wang
Xiaohong Liu
AAML
253
44
0
22 Feb 2022
Visual Acoustic Matching
Computer Vision and Pattern Recognition (CVPR), 2022
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
325
66
0
14 Feb 2022
Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation
Computer Vision and Pattern Recognition (CVPR), 2022
Ziad Al-Halah
Santhosh Kumar Ramakrishnan
Kristen Grauman
VLM
412
113
0
05 Feb 2022
Active Audio-Visual Separation of Dynamic Sound Sources
European Conference on Computer Vision (ECCV), 2022
Sagnik Majumder
Kristen Grauman
371
23
0
02 Feb 2022
PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Santhosh Kumar Ramakrishnan
Devendra Singh Chaplot
Ziad Al-Halah
Jitendra Malik
Kristen Grauman
475
232
0
25 Jan 2022
Symmetry-aware Neural Architecture for Embodied Visual Navigation
Shuang Liu
Takayuki Okatani
369
2
0
17 Dec 2021
Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds
IEEE Robotics and Automation Letters (RA-L), 2021
Abdelrahman Younes
Daniel Honerkamp
Tim Welschehold
Abhinav Valada
496
52
0
29 Nov 2021
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
British Machine Vision Conference (BMVC), 2021
Rishabh Garg
Ruohan Gao
Kristen Grauman
204
31
0
21 Nov 2021
Structure from Silence: Learning Scene Structure from Ambient Sound
Conference on Robot Learning (CoRL), 2021
Ziyang Chen
Xixi Hu
Andrew Owens
253
31
0
10 Nov 2021
Space-Time Memory Network for Sounding Object Localization in Videos
British Machine Vision Conference (BMVC), 2021
Sizhe Li
Yapeng Tian
Chenliang Xu
142
13
0
10 Nov 2021
Audio-Visual Grounding Referring Expression for Robotic Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2021
Yefei Wang
Kaili Wang
Yi Wang
Di Guo
Huaping Liu
F. Sun
180
18
0
22 Sep 2021
Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge
Xinzhu Liu
Di Guo
Huaping Liu
F. Sun
EgoV
236
34
0
20 Sep 2021