Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization


Computer Vision and Pattern Recognition (CVPR), 2022
6 January 2022
Hao Jiang
Calvin Murdock
V. Ithapu
    EgoV

Papers citing "Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization"

28 / 28 papers shown
Proactive Hearing Assistants that Isolate Egocentric Conversations
Guilin Hu
Malek Itani
Tuochao Chen
Shyamnath Gollakota
184
2
0
14 Nov 2025
Attention-Driven Multimodal Alignment for Long-term Action Quality Assessment
Applied Soft Computing (ASC), 2025
Xin Wang
Peng-Jie Li
Yuan-Yuan Shen
188
0
0
29 Jul 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Computer Vision and Pattern Recognition (CVPR), 2025
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
304
3
0
08 Apr 2025
egoPPG: Heart Rate Estimation from Eye-Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks
Björn Braun
Rayan Armani
Manuel Meier
Max Moebus
Christian Holz
EgoV
397
6
0
28 Feb 2025
SocialMind: LLM-based Proactive AR Social Assistive System with Human-like Perception for In-situ Live Interactions
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2024
Bufang Yang
Yunqi Guo
Lilin Xu
Zhenyu Yan
Hongkai Chen
Guoliang Xing
Xiaofan Jiang
476
38
0
05 Dec 2024
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Yizhuo Yang
Shenghai Yuan
Muqing Cao
Jianfei Yang
Lihua Xie
610
15
0
11 Nov 2024
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Yufeng Yang
Desh Raj
Ju Lin
Niko Moritz
Junteng Jia
...
Egor Lakomkin
Yiteng Huang
Jacob Donley
Jay Mahadeokar
Ozlem Kalinli
197
6
0
17 Sep 2024
Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges
Victoria Mingote
Alfonso Ortega
A. Miguel
Eduardo Lleida
349
4
0
09 Sep 2024
Towards Social AI: A Survey on Understanding Social Interactions
Sangmin Lee
Minzhi Li
Bolin Lai
Wenqi Jia
Fiona Ryan
...
Ozgur Kara
Bikram Boote
Weiyan Shi
Diyi Yang
James M. Rehg
392
15
0
05 Sep 2024
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
European Conference on Computer Vision (ECCV), 2024
Heeseung Yun
Ruohan Gao
Ishwarya Ananthabhotla
Anurag Kumar
Jacob Donley
Chao Li
Gunhee Kim
V. Ithapu
Calvin Murdock
257
7
0
09 Aug 2024
Modeling and Driving Human Body Soundfields through Acoustic Primitives
Chao Huang
Dejan Marković
Chenliang Xu
Alexander Richard
386
14
0
18 Jul 2024
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen
Puyuan Peng
Ami Baid
Zihui Xue
Wei-Ning Hsu
David Harwath
Kristen Grauman
VGen
337
23
0
13 Jun 2024
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
Davide Berghi
Philip J. B. Jackson
285
1
0
01 Jun 2024
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos
Changan Chen
Kumar Ashutosh
Rohit Girdhar
David Harwath
Kristen Grauman
EgoV
SSL
300
12
0
08 Apr 2024
Multimodal Action Quality Assessment
Ling-an Zeng
Wei-Shi Zheng
604
38
0
31 Jan 2024
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization
Davide Berghi
Philip J. B. Jackson
257
6
0
21 Dec 2023
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
Wenqi Jia
Miao Liu
Hao Jiang
Ishwarya Ananthabhotla
James M. Rehg
V. Ithapu
Ruohan Gao
EgoV
305
17
0
20 Dec 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Neural Information Processing Systems (NeurIPS), 2023
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
191
9
0
01 Nov 2023
Measuring Acoustics with Collaborative Multiple Agents
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yinfeng Yu
Changan Chen
Lele Cao
Fangkai Yang
Gang Hua
397
15
0
09 Oct 2023
Audio Visual Speaker Localization from EgoCentric Views
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Wenwu Wang
EgoV
311
8
0
28 Sep 2023
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
I. Gurvich
Ido Leichter
Dharmendar Reddy Palle
Yossi Asher
Alon Vinnikov
Igor Abramovski
Vishak Gopal
Ross Cutler
Eyal Krupka
237
5
0
15 Sep 2023
An Outlook into the Future of Egocentric Vision
International Journal of Computer Vision (IJCV), 2023
Chiara Plizzari
Gabriele Goletto
Antonino Furnari
Siddhant Bansal
Francesco Ragusa
G. Farinella
Dima Damen
Tatiana Tommasi
EgoV
345
85
0
14 Aug 2023
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
SSL
EgoV
447
9
0
10 Jul 2023
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
European Conference on Computer Vision (ECCV), 2023
Bolin Lai
Fiona Ryan
Wenqi Jia
Miao Liu
James M. Rehg
EgoV
450
21
0
06 May 2023
Egocentric Auditory Attention Localization in Conversations
Computer Vision and Pattern Recognition (CVPR), 2023
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
339
25
0
28 Mar 2023
Novel-View Acoustic Synthesis
Computer Vision and Pattern Recognition (CVPR), 2023
Changan Chen
Alexander Richard
Roman Shapovalov
V. Ithapu
Natalia Neverova
Kristen Grauman
Andrea Vedaldi
322
49
0
20 Jan 2023
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Computer Vision and Pattern Recognition (CVPR), 2023
Sagnik Majumder
Hao Jiang
Pierre Moulon
E. Henderson
P. Calamia
Kristen Grauman
V. Ithapu
EgoV
384
12
0
04 Jan 2023
Few-Shot Audio-Visual Learning of Environment Acoustics
Neural Information Processing Systems (NeurIPS), 2022
Sagnik Majumder
Changan Chen
Ziad Al-Halah
Kristen Grauman
322
76
0
08 Jun 2022