2007.03669
See, Hear, Explore: Curiosity via Audio-Visual Association
Neural Information Processing Systems (NeurIPS), 2020
7 July 2020
Victoria Dean
Shubham Tulsiani
Abhinav Gupta
Papers citing
"See, Hear, Explore: Curiosity via Audio-Visual Association"
45 / 45 papers shown
Embodied Navigation with Auxiliary Task of Action Description Prediction
Haru Kondoh
Asako Kanezaki
190
2
0
21 Oct 2025
Audio-Guided Visual Perception for Audio-Visual Navigation
Yi Wang
Yinfeng Yu
Fuchun Sun
Liejun Wang
Wendong Zheng
157
0
0
13 Oct 2025
Audio-Guided Dynamic Modality Fusion with Stereo-Aware Attention for Audio-Visual Navigation
Jia Li
Yinfeng Yu
Liejun Wang
Fuchun Sun
Wendong Zheng
291
5
0
21 Sep 2025
Prosody as a Teaching Signal for Agent Learning: Exploratory Studies and Algorithmic Implications
International Conference on Multimodal Interaction (ICMI), 2024
Matilda Knierim
Sahil Jain
Murat Han Aydoğan
Kenneth Mitra
K. Desai
Akanksha Saran
Kim Baraka
180
1
0
31 Oct 2024
Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Jared Mejia
Victoria Dean
Tess Hellebrekers
Abhinav Gupta
315
24
0
14 May 2024
Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model
Zhonghan Zhao
Ke Ma
Wenhao Chai
Xuan Wang
Kewei Chen
Dongxu Guo
Yanting Zhang
Hongwei Wang
Gaoang Wang
279
24
0
06 Apr 2024
Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation
Zhonghan Zhao
Kewei Chen
Dongxu Guo
Wenhao Chai
Tianbo Ye
Yanting Zhang
Gaoang Wang
364
28
0
13 Mar 2024
See and Think: Embodied Agent in Virtual Environment
European Conference on Computer Vision (ECCV), 2023
Zhonghan Zhao
Wenhao Chai
Xuan Wang
Li Boyi
Shengyu Hao
Shidong Cao
Tianbo Ye
Gaoang Wang
LM&Ro
LLMAG
446
60
0
26 Nov 2023
Measuring Acoustics with Collaborative Multiple Agents
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yinfeng Yu
Changan Chen
Lele Cao
Fangkai Yang
Gang Hua
397
15
0
09 Oct 2023
The Wizard of Curiosities: Enriching Dialogues with Fun Facts
SIGDIAL Conferences (SIGDIAL), 2023
Frederico Vicente
Rafael Ferreira
David Semedo
João Magalhães
212
3
0
20 Sep 2023
Hyperbolic Audio-visual Zero-shot Learning
IEEE International Conference on Computer Vision (ICCV), 2023
Jie Hong
Zeeshan Hayder
Junlin Han
Pengfei Fang
Mehrtash Harandi
L. Petersson
293
27
0
24 Aug 2023
Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation
IEEE International Conference on Computer Vision (ICCV), 2023
Jinyu Chen
Wenguan Wang
Siying Liu
Jiaming Song
Yi Yang
341
21
0
20 Aug 2023
Never Explore Repeatedly in Multi-Agent Reinforcement Learning
Chenghao Li
Tonghan Wang
Chongjie Zhang
Qianchuan Zhao
218
1
0
19 Aug 2023
RealImpact: A Dataset of Impact Sound Fields for Real Objects
Computer Vision and Pattern Recognition (CVPR), 2023
Samuel Clarke
Ruohan Gao
Mason Wang
M. Rau
Julia Xu
Jui-Hsien Wang
Doug L. James
Jiajun Wu
252
13
0
16 Jun 2023
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
IEEE International Conference on Robotics and Automation (ICRA), 2023
Ruohan Gao
Hao Li
Gokul Dharan
Zhuzhu Wang
Chengshu Li
Fei Xia
Silvio Savarese
Li Fei-Fei
Jiajun Wu
381
15
0
01 Jun 2023
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Hao Jiang
Pierre Moulon
E. Henderson
P. Calamia
Kristen Grauman
V. Ithapu
EgoV
384
12
0
04 Jan 2023
Learning Active Camera for Multi-Object Navigation
Neural Information Processing Systems (NeurIPS), 2022
Peihao Chen
Dongyu Ji
Kun-Li Channing Lin
Weiwen Hu
Wenbing Huang
Thomas H. Li
Ming Tan
Chuang Gan
305
36
0
14 Oct 2022
Pay Self-Attention to Audio-Visual Navigation
British Machine Vision Conference (BMVC), 2022
Yinfeng Yu
Lele Cao
Gang Hua
Xiaohong Liu
Liejun Wang
382
18
0
04 Oct 2022
That Sounds Right: Auditory Self-Supervision for Dynamic Robot Manipulation
Conference on Robot Learning (CoRL), 2022
Abitha Thankaraj
Lerrel Pinto
218
21
0
03 Oct 2022
Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Yilun Hao
Ruinan Wang
Zhangjie Cao
Zihan Wang
Yuchen Cui
Dorsa Sadigh
300
4
0
16 Sep 2022
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
ACM Computing Surveys (ACM CSUR), 2022
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
368
218
0
07 Sep 2022
Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2022
Zijian Gao
Kele Xu
Yuanzhao Zhai
Dawei Feng
Bo Ding
Xinjun Mao
Huaimin Wang
237
3
0
24 Aug 2022
Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Xufeng Zhao
C. Weber
Muhammad Burhan Hafez
S. Wermter
232
11
0
04 Aug 2022
Finding Fallen Objects Via Asynchronous Audio-Visual Integration
Computer Vision and Pattern Recognition (CVPR), 2022
Chuang Gan
Yi Gu
Siyuan Zhou
Jeremy Schwartz
S. Alter
James Traer
Dan Gutfreund
J. Tenenbaum
Josh H. McDermott
Antonio Torralba
360
20
0
07 Jul 2022
Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision
Xiangjie Sui
Esa Rahtu
Hang Zhao
MDE
403
8
0
03 Jul 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Neural Information Processing Systems (NeurIPS), 2022
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
404
124
0
16 Jun 2022
Few-Shot Audio-Visual Learning of Environment Acoustics
Neural Information Processing Systems (NeurIPS), 2022
Sagnik Majumder
Changan Chen
Ziad Al-Halah
Kristen Grauman
322
76
0
08 Jun 2022
Towards Generalisable Audio Representations for Audio-Visual Navigation
Shunqi Mao
Chaoyi Zhang
Heng Wang
Weidong (Tom) Cai
198
1
0
01 Jun 2022
Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning
Maximilian Du
Olivia Y. Lee
Suraj Nair
Chelsea Finn
OffRL
311
46
0
30 May 2022
Nuclear Norm Maximization Based Curiosity-Driven Learning
Chao Chen
Zijian Gao
Kele Xu
Sen Yang
Yiying Li
Bo Ding
Dawei Feng
Huaimin Wang
662
5
0
21 May 2022
Exploration in Deep Reinforcement Learning: A Survey
Information Fusion (Inf. Fusion), 2022
Pawel Ladosz
Lilian Weng
Minwoo Kim
H. Oh
OffRL
421
550
0
02 May 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness
Paul Pu Liang
508
7
0
14 Apr 2022
Sound Adversarial Audio-Visual Navigation
International Conference on Learning Representations (ICLR), 2022
Yinfeng Yu
Wenbing Huang
Gang Hua
Changan Chen
Yikai Wang
Xiaohong Liu
AAML
261
48
0
22 Feb 2022
Visual Acoustic Matching
Computer Vision and Pattern Recognition (CVPR), 2022
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
341
67
0
14 Feb 2022
Toward Practical Monocular Indoor Depth Estimation
Computer Vision and Pattern Recognition (CVPR), 2021
Cho-Ying Wu
Jialiang Wang
Michael Hall
Ulrich Neumann
Shuochen Su
3DV
MDE
317
91
0
04 Dec 2021
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
British Machine Vision Conference (BMVC), 2021
Rishabh Garg
Ruohan Gao
Kristen Grauman
213
33
0
21 Nov 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
343
239
0
15 Jul 2021
Deep Learning for Embodied Vision Navigation: A Survey
Fengda Zhu
Yi Zhu
Vincent CS Lee
Xiaodan Liang
Xiaojun Chang
EgoV
LM&Ro
603
0
0
07 Jul 2021
Learning Audio-Visual Dereverberation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Changan Chen
Wei-Ju Sun
David Harwath
Kristen Grauman
270
37
0
14 Jun 2021
Ask & Explore: Grounded Question Answering for Curiosity-Driven Exploration
Jivat Neet Kaur
Yiding Jiang
Paul Pu Liang
LRM
180
2
0
24 Apr 2021
Touch-based Curiosity for Sparse-Reward Tasks
Sai Rajeswar
Cyril Ibrahim
Nitin Surya
Florian Golemo
David Vazquez
Rameswar Panda
Pedro H. O. Pinheiro
201
6
0
01 Apr 2021
Audio-Visual Floorplan Reconstruction
IEEE International Conference on Computer Vision (ICCV), 2020
Senthil Purushwalkam
S. V. A. Garí
V. Ithapu
Carl Schissler
Philip Robinson
Abhinav Gupta
Kristen Grauman
VGen
3DV
393
45
0
31 Dec 2020
SEMI: Self-supervised Exploration via Multisensory Incongruity
IEEE International Conference on Robotics and Automation (ICRA), 2020
Jianren Wang
Ziwen Zhuang
Hang Zhao
SSL
213
1
0
26 Sep 2020
Noisy Agents: Self-supervised Exploration by Predicting Auditory Events
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
Chuang Gan
Xiaoyu Chen
Phillip Isola
Antonio Torralba
J. Tenenbaum
198
7
0
27 Jul 2020
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation
Chuang Gan
Jeremy Schwartz
S. Alter
Damian Mrowca
Martin Schrimpf
...
Antonio Torralba
J. DiCarlo
J. Tenenbaum
Josh H. McDermott
Daniel L. K. Yamins
VGen
530
324
0
09 Jul 2020