Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization

6 January 2022
Hao Jiang
Calvin Murdock
V. Ithapu
    EgoV
arXiv: 2201.01928

Papers citing "Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization"

30 / 30 papers shown
Title
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
19
0
0
08 Apr 2025
egoPPG: Heart Rate Estimation from Eye-Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks
Björn Braun
Rayan Armani
Manuel Meier
Max Moebus
Christian Holz
EgoV
33
0
0
28 Feb 2025
SocialMind: LLM-based Proactive AR Social Assistive System with Human-like Perception for In-situ Live Interactions
Bufang Yang
Yunqi Guo
Lilin Xu
Zhenyu Yan
Hongkai Chen
Guoliang Xing
Xiaofan Jiang
67
8
0
05 Dec 2024
AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness
Yizhuo Yang
Shenghai Yuan
Muqing Cao
Jianfei Yang
Lihua Xie
49
7
0
11 Nov 2024
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Yufeng Yang
Desh Raj
Ju Lin
Niko Moritz
J. Jia
...
Egor Lakomkin
Yiteng Huang
Jacob Donley
Jay Mahadeokar
Ozlem Kalinli
13
2
0
17 Sep 2024
Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges
Victoria Mingote
Alfonso Ortega
A. Miguel
Eduardo Lleida
22
0
0
09 Sep 2024
Towards Social AI: A Survey on Understanding Social Interactions
Sangmin Lee
Minzhi Li
Bolin Lai
Wenqi Jia
Fiona Ryan
...
Ozgur Kara
Bikram Boote
Weiyan Shi
Diyi Yang
James M. Rehg
18
4
0
05 Sep 2024
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
Heeseung Yun
Ruohan Gao
Ishwarya Ananthabhotla
Anurag Kumar
Jacob Donley
Chao Li
Gunhee Kim
V. Ithapu
Calvin Murdock
29
1
0
09 Aug 2024
Modeling and Driving Human Body Soundfields through Acoustic Primitives
Chao Huang
Dejan Marković
Chenliang Xu
Alexander Richard
28
4
0
18 Jul 2024
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen
Puyuan Peng
Ami Baid
Zihui Xue
Wei-Ning Hsu
David F. Harwath
Kristen Grauman
VGen
37
7
0
13 Jun 2024
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
Davide Berghi
Philip J. B. Jackson
29
0
0
01 Jun 2024
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos
Changan Chen
Kumar Ashutosh
Rohit Girdhar
David F. Harwath
Kristen Grauman
EgoV
SSL
26
6
0
08 Apr 2024
Multimodal Action Quality Assessment
Ling-an Zeng
Wei-Shi Zheng
40
11
0
31 Jan 2024
Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization
Davide Berghi
Philip J. B. Jackson
33
5
0
21 Dec 2023
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
Wenqi Jia
Miao Liu
Hao Jiang
Ishwarya Ananthabhotla
James M. Rehg
V. Ithapu
Ruohan Gao
EgoV
18
5
0
20 Dec 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
15
2
0
01 Nov 2023
Measuring Acoustics with Collaborative Multiple Agents
Yinfeng Yu
Changan Chen
Lele Cao
Fangkai Yang
Fuchun Sun
10
1
0
09 Oct 2023
Audio Visual Speaker Localization from EgoCentric Views
Jinzheng Zhao
Yong-mei Xu
Xinyuan Qian
Wenwu Wang
EgoV
20
5
0
28 Sep 2023
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism
I. Gurvich
Ido Leichter
Dharmendar Reddy Palle
Yossi Asher
Alon Vinnikov
Igor Abramovski
Vishak Gopal
Ross Cutler
Eyal Krupka
13
4
0
15 Sep 2023
An Outlook into the Future of Egocentric Vision
Chiara Plizzari
Gabriele Goletto
Antonino Furnari
Siddhant Bansal
Francesco Ragusa
G. Farinella
Dima Damen
Tatiana Tommasi
EgoV
25
37
0
14 Aug 2023
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
SSL
EgoV
24
4
0
10 Jul 2023
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
Bolin Lai
Fiona Ryan
Wenqi Jia
Miao Liu
James M. Rehg
EgoV
16
8
0
06 May 2023
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan
Hao Jiang
Abhinav Shukla
James M. Rehg
V. Ithapu
EgoV
19
15
0
28 Mar 2023
Novel-View Acoustic Synthesis
Changan Chen
Alexander Richard
Roman Shapovalov
V. Ithapu
Natalia Neverova
Kristen Grauman
Andrea Vedaldi
13
32
0
20 Jan 2023
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Sagnik Majumder
Hao Jiang
Pierre Moulon
E. Henderson
P. Calamia
Kristen Grauman
V. Ithapu
EgoV
13
7
0
04 Jan 2023
Few-Shot Audio-Visual Learning of Environment Acoustics
Sagnik Majumder
Changan Chen
Ziad Al-Halah
Kristen Grauman
22
50
0
08 Jun 2022
The Right to Talk: An Audio-Visual Transformer Approach
Thanh-Dat Truong
C. Duong
T. D. Vu
H. Pham
Bhiksha Raj
Ngan Le
Khoa Luu
52
36
0
06 Aug 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
185
196
0
08 Jan 2021
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
192
204
0
23 Jan 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
214
2,224
0
14 Jun 2018