2203.17263
Cited By
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
31 March 2022
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
Github (107★)
Papers citing
"Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis"
20 / 20 papers shown
Title
AUREXA-SE: Audio-Visual Unified Representation Exchange Architecture with Cross-Attention and Squeezeformer for Speech Enhancement
M. Sajid
Deepanshu Gupta
Yash Modi
Sanskriti Jain
Harshith Jai Surya Ganji
A. Rahaman
Harshvardhan Choudhary
Nasir Saleem
Amir Hussain
M. Tanveer
72
0
0
06 Oct 2025
Real-Time System for Audio-Visual Target Speech Enhancement
T. Aleksandra Ma
Sile Yin
Li-Chia Yang
Shuo Zhang
88
0
0
25 Sep 2025
Pain in 3D: Generating Controllable Synthetic Faces for Automated Pain Assessment
Xin Lei Lin
Soroush Mehraban
Abhishek Moturu
Babak Taati
3DH
MedIm
211
0
0
20 Sep 2025
Real-Time Audio-Visual Speech Enhancement Using Pre-trained Visual Representations
Teng
Sile Yin
Li-Chia Yang
Shuo Zhang
124
1
0
29 Jul 2025
Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
Zixuan Li
Xueliang Zhang
Lei Miao
Zhipeng Yan
Ying Sun
Chong Zhu
147
0
0
28 May 2025
FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
Chaeyoung Jung
Suyeon Lee
Ji-Hoon Kim
Joon Son Chung
DiffM
230
18
0
13 Jun 2024
TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2024
Yueyuan Sui
Minghui Zhao
Junxi Xia
Xiaofan Jiang
S. Xia
Mamba
207
17
0
02 May 2024
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
Neural Information Processing Systems (NeurIPS), 2023
Xudong Xu
Dejan Marković
Jacob Sandakly
Todd Keebler
Steven Krenn
Alexander Richard
121
8
0
01 Nov 2023
Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Suyeon Lee
Chaeyoung Jung
Youngjoon Jang
Jaehun Kim
Joon Son Chung
203
14
0
30 Oct 2023
AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Ju-Chieh Chou
Chung-Ming Chien
Karen Livescu
DiffM
353
9
0
14 Sep 2023
Neural Vector Fields: Generalizing Distance Vector Fields by Codebooks and Zero-Curl Regularization
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Xianghui Yang
Guosheng Lin
Huan Wang
Luping Zhou
278
2
0
04 Sep 2023
RepCodec: A Speech Representation Codec for Speech Tokenization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhichao Huang
Chutong Meng
Tom Ko
205
41
0
31 Aug 2023
Audio-visual video-to-speech synthesis with synthesized input audio
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen
DiffM
247
1
0
31 Jul 2023
Audio-Visual Speech Enhancement With Selective Off-Screen Speech Extraction
European Signal Processing Conference (EUSIPCO), 2023
Tomoya Yoshinaga
Keitaro Tanaka
Shigeo Morishima
165
1
0
10 Jun 2023
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Ruixin Zheng
Yang Ai
Zhenhua Ling
215
14
0
24 May 2023
Neural Vector Fields: Implicit Representation by Explicit Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Xianghui Yang
Guosheng Lin
Huan Wang
Luping Zhou
AI4CE
205
25
0
08 Mar 2023
CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Computer Vision and Pattern Recognition (CVPR), 2023
Jinbo Xing
Menghan Xia
Yuechen Zhang
Xiaodong Cun
Jue Wang
T. Wong
280
194
0
06 Jan 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
Wei-Ning Hsu
Tal Remez
Bowen Shi
Jacob Donley
Yossi Adi
DiffM
208
13
0
21 Dec 2022
LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Rodrigo Mira
Buye Xu
Jacob Donley
Anurag Kumar
Stavros Petridis
V. Ithapu
Maja Pantic
180
16
0
20 Nov 2022
Context-sensitive neocortical neurons transform the effectiveness and efficiency of neural information processing
Ahsan Adeel
Mario Franco
Mohsin Raza
K. Ahmed
215
9
0
15 Jul 2022