See, Hear, Explore: Curiosity via Audio-Visual Association

Neural Information Processing Systems (NeurIPS), 2020
7 July 2020
Victoria Dean
Shubham Tulsiani
Abhinav Gupta

Papers citing "See, Hear, Explore: Curiosity via Audio-Visual Association"

45 / 45 papers shown
Embodied Navigation with Auxiliary Task of Action Description Prediction
Haru Kondoh
Asako Kanezaki
190
2
0
21 Oct 2025
Audio-Guided Visual Perception for Audio-Visual Navigation
Yi Wang
Yinfeng Yu
Fuchun Sun
Liejun Wang
Wendong Zheng
157
0
0
13 Oct 2025
Audio-Guided Dynamic Modality Fusion with Stereo-Aware Attention for Audio-Visual Navigation
Jia Li
Yinfeng Yu
Liejun Wang
Fuchun Sun
Wendong Zheng
291
5
0
21 Sep 2025
Prosody as a Teaching Signal for Agent Learning: Exploratory Studies and Algorithmic Implications
International Conference on Multimodal Interaction (ICMI), 2024
Matilda Knierim
Sahil Jain
Murat Han Aydoğan
Kenneth Mitra
K. Desai
Akanksha Saran
Kim Baraka
180
1
0
31 Oct 2024
Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Jared Mejia
Victoria Dean
Tess Hellebrekers
Abhinav Gupta
315
24
0
14 May 2024
Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model
Zhonghan Zhao
Ke Ma
Wenhao Chai
Xuan Wang
Kewei Chen
Dongxu Guo
Yanting Zhang
Hongwei Wang
Gaoang Wang
279
24
0
06 Apr 2024
Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation
Zhonghan Zhao
Kewei Chen
Dongxu Guo
Wenhao Chai
Tianbo Ye
Yanting Zhang
Gaoang Wang
364
28
0
13 Mar 2024
See and Think: Embodied Agent in Virtual Environment
European Conference on Computer Vision (ECCV), 2023
Zhonghan Zhao
Wenhao Chai
Xuan Wang
Li Boyi
Shengyu Hao
Shidong Cao
Tianbo Ye
Gaoang Wang
LM&Ro, LLMAG
446
60
0
26 Nov 2023
Measuring Acoustics with Collaborative Multiple Agents
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yinfeng Yu
Changan Chen
Lele Cao
Fangkai Yang
Gang Hua
397
15
0
09 Oct 2023
The Wizard of Curiosities: Enriching Dialogues with Fun Facts
SIGDIAL Conferences (SIGDIAL), 2023
Frederico Vicente
Rafael Ferreira
David Semedo
João Magalhães
212
3
0
20 Sep 2023
Hyperbolic Audio-visual Zero-shot Learning
IEEE International Conference on Computer Vision (ICCV), 2023
Jie Hong
Zeeshan Hayder
Junlin Han
Pengfei Fang
Mehrtash Harandi
L. Petersson
293
27
0
24 Aug 2023
Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation
IEEE International Conference on Computer Vision (ICCV), 2023
Jinyu Chen
Wenguan Wang
Siying Liu
Jiaming Song
Yi Yang
341
21
0
20 Aug 2023
Never Explore Repeatedly in Multi-Agent Reinforcement Learning
Chenghao Li
Tonghan Wang
Chongjie Zhang
Qianchuan Zhao
218
1
0
19 Aug 2023
RealImpact: A Dataset of Impact Sound Fields for Real Objects
Computer Vision and Pattern Recognition (CVPR), 2023
Samuel Clarke
Ruohan Gao
Mason Wang
M. Rau
Julia Xu
Jui-Hsien Wang
Doug L. James
Jiajun Wu
252
13
0
16 Jun 2023
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
IEEE International Conference on Robotics and Automation (ICRA), 2023
Ruohan Gao
Hao Li
Gokul Dharan
Zhuzhu Wang
Chengshu Li
Fei Xia
Silvio Savarese
Li Fei-Fei
Jiajun Wu
381
15
0
01 Jun 2023
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Hao Jiang
Pierre Moulon
E. Henderson
P. Calamia
Kristen Grauman
V. Ithapu
EgoV
384
12
0
04 Jan 2023
Learning Active Camera for Multi-Object Navigation
Neural Information Processing Systems (NeurIPS), 2022
Peihao Chen
Dongyu Ji
Kun-Li Channing Lin
Weiwen Hu
Wenbing Huang
Thomas H. Li
Ming Tan
Chuang Gan
305
36
0
14 Oct 2022
Pay Self-Attention to Audio-Visual Navigation
British Machine Vision Conference (BMVC), 2022
Yinfeng Yu
Lele Cao
Gang Hua
Xiaohong Liu
Liejun Wang
382
18
0
04 Oct 2022
That Sounds Right: Auditory Self-Supervision for Dynamic Robot Manipulation
Conference on Robot Learning (CoRL), 2022
Abitha Thankaraj
Lerrel Pinto
218
21
0
03 Oct 2022
Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Yilun Hao
Ruinan Wang
Zhangjie Cao
Zihan Wang
Yuchen Cui
Dorsa Sadigh
300
4
0
16 Sep 2022
Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
ACM Computing Surveys (ACM CSUR), 2022
Paul Pu Liang
Amir Zadeh
Louis-Philippe Morency
368
218
0
07 Sep 2022
Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2022
Zijian Gao
Kele Xu
Yuanzhao Zhai
Dawei Feng
Bo Ding
Xinjun Mao
Huaimin Wang
237
3
0
24 Aug 2022
Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
Xufeng Zhao
C. Weber
Muhammad Burhan Hafez
S. Wermter
232
11
0
04 Aug 2022
Finding Fallen Objects Via Asynchronous Audio-Visual Integration
Computer Vision and Pattern Recognition (CVPR), 2022
Chuang Gan
Yi Gu
Siyuan Zhou
Jeremy Schwartz
S. Alter
James Traer
Dan Gutfreund
J. Tenenbaum
Josh H. McDermott
Antonio Torralba
360
20
0
07 Jul 2022
Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision
Xiangjie Sui
Esa Rahtu
Hang Zhao
MDE
403
8
0
03 Jul 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Neural Information Processing Systems (NeurIPS), 2022
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
404
124
0
16 Jun 2022
Few-Shot Audio-Visual Learning of Environment Acoustics
Neural Information Processing Systems (NeurIPS), 2022
Sagnik Majumder
Changan Chen
Ziad Al-Halah
Kristen Grauman
322
76
0
08 Jun 2022
Towards Generalisable Audio Representations for Audio-Visual Navigation
Shunqi Mao
Chaoyi Zhang
Heng Wang
Weidong (Tom) Cai
198
1
0
01 Jun 2022
Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning
Maximilian Du
Olivia Y. Lee
Suraj Nair
Chelsea Finn
OffRL
311
46
0
30 May 2022
Nuclear Norm Maximization Based Curiosity-Driven Learning
Chao Chen
Zijian Gao
Kele Xu
Sen Yang
Yiying Li
Bo Ding
Dawei Feng
Huaimin Wang
662
5
0
21 May 2022
Exploration in Deep Reinforcement Learning: A Survey
Information Fusion (Inf. Fusion), 2022
Pawel Ladosz
Lilian Weng
Minwoo Kim
H. Oh
OffRL
421
550
0
02 May 2022
Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness
Paul Pu Liang
508
7
0
14 Apr 2022
Sound Adversarial Audio-Visual Navigation
International Conference on Learning Representations (ICLR), 2022
Yinfeng Yu
Wenbing Huang
Gang Hua
Changan Chen
Yikai Wang
Xiaohong Liu
AAML
261
48
0
22 Feb 2022
Visual Acoustic Matching
Computer Vision and Pattern Recognition (CVPR), 2022
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
341
67
0
14 Feb 2022
Toward Practical Monocular Indoor Depth Estimation
Computer Vision and Pattern Recognition (CVPR), 2021
Cho-Ying Wu
Jialiang Wang
Michael Hall
Ulrich Neumann
Shuochen Su
3DV, MDE
317
91
0
04 Dec 2021
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
British Machine Vision Conference (BMVC), 2021
Rishabh Garg
Ruohan Gao
Kristen Grauman
213
33
0
21 Nov 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
343
239
0
15 Jul 2021
Deep Learning for Embodied Vision Navigation: A Survey
Fengda Zhu
Yi Zhu
Vincent CS Lee
Xiaodan Liang
Xiaojun Chang
EgoV, LM&Ro
603
0
0
07 Jul 2021
Learning Audio-Visual Dereverberation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Changan Chen
Wei-Ju Sun
David Harwath
Kristen Grauman
270
37
0
14 Jun 2021
Ask & Explore: Grounded Question Answering for Curiosity-Driven Exploration
Jivat Neet Kaur
Yiding Jiang
Paul Pu Liang
LRM
180
2
0
24 Apr 2021
Touch-based Curiosity for Sparse-Reward Tasks
Sai Rajeswar
Cyril Ibrahim
Nitin Surya
Florian Golemo
David Vazquez
Rameswar Panda
Pedro H. O. Pinheiro
201
6
0
01 Apr 2021
Audio-Visual Floorplan Reconstruction
IEEE International Conference on Computer Vision (ICCV), 2020
Senthil Purushwalkam
S. V. A. Garí
V. Ithapu
Carl Schissler
Philip Robinson
Abhinav Gupta
Kristen Grauman
VGen, 3DV
393
45
0
31 Dec 2020
SEMI: Self-supervised Exploration via Multisensory Incongruity
IEEE International Conference on Robotics and Automation (ICRA), 2020
Jianren Wang
Ziwen Zhuang
Hang Zhao
SSL
213
1
0
26 Sep 2020
Noisy Agents: Self-supervised Exploration by Predicting Auditory Events
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
Chuang Gan
Xiaoyu Chen
Phillip Isola
Antonio Torralba
J. Tenenbaum
198
7
0
27 Jul 2020
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation
Chuang Gan
Jeremy Schwartz
S. Alter
Damian Mrowca
Martin Schrimpf
...
Antonio Torralba
J. DiCarlo
J. Tenenbaum
Josh H. McDermott
Daniel L. K. Yamins
VGen
530
324
0
09 Jul 2020