ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2005.03191
  4. Cited By
ContextNet: Improving Convolutional Neural Networks for Automatic Speech
  Recognition with Global Context

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

7 May 2020
Wei Han
Zhengdong Zhang
Yu Zhang
Jiahui Yu
Chung-Cheng Chiu
James Qin
Anmol Gulati
Ruoming Pang
Yonghui Wu
ArXivPDFHTML

Papers citing "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context"

50 / 131 papers shown
Title
Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
Erfan Loweimi
Mengjie Qian
Kate Knill
Mark Gales
46
0
0
26 Apr 2025
Selective Masking Adversarial Attack on Automatic Speech Recognition Systems
Selective Masking Adversarial Attack on Automatic Speech Recognition Systems
Zheng Fang
Shenyi Zhang
Tao Wang
Bowen Li
Lingchen Zhao
Zhangyi Wang
AAML
28
0
0
06 Apr 2025
CR-CTC: Consistency regularization on CTC for improved speech recognition
CR-CTC: Consistency regularization on CTC for improved speech recognition
Zengwei Yao
Wei Kang
Xiaoyu Yang
Fangjun Kuang
Liyong Guo
Han Zhu
Zengrui Jin
Zhaoqing Li
Long Lin
Daniel Povey
59
0
0
17 Feb 2025
Deep CLAS: Deep Contextual Listen, Attend and Spell
Deep CLAS: Deep Contextual Listen, Attend and Spell
Shifu Xiong
Mengzhi Wang
Genshun Wan
Hang Chen
Jianqing Gao
Lirong Dai
31
0
0
26 Sep 2024
Searching for Effective Preprocessing Method and CNN-based Architecture
  with Efficient Channel Attention on Speech Emotion Recognition
Searching for Effective Preprocessing Method and CNN-based Architecture with Efficient Channel Attention on Speech Emotion Recognition
Byunggun Kim
Younghun Kwon
38
0
0
06 Sep 2024
An Analysis of Linear Complexity Attention Substitutes with BEST-RQ
An Analysis of Linear Complexity Attention Substitutes with BEST-RQ
Ryan Whetten
Titouan Parcollet
Adel Moumen
Marco Dinarelli
Yannick Esteve
41
0
0
04 Sep 2024
Cross-layer Attention Sharing for Large Language Models
Cross-layer Attention Sharing for Large Language Models
Yongyu Mu
Yuzhang Wu
Yuchun Fan
Chenglong Wang
Hengyu Li
Qiaozhi He
Murun Yang
Tong Xiao
Jingbo Zhu
42
5
0
04 Aug 2024
Linear-Complexity Self-Supervised Learning for Speech Processing
Linear-Complexity Self-Supervised Learning for Speech Processing
Shucong Zhang
Titouan Parcollet
Rogier van Dalen
Sourav Bhattacharya
46
1
0
18 Jul 2024
DASB -- Discrete Audio and Speech Benchmark
DASB -- Discrete Audio and Speech Benchmark
Pooneh Mousavi
Luca Della Libera
J. Duret
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
35
13
0
20 Jun 2024
Exploring Spoken Language Identification Strategies for Automatic
  Transcription of Multilingual Broadcast and Institutional Speech
Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech
Martina Valente
Fabio Brugnara
Giovanni Morrone
Enrico Zovato
Leonardo Badino
35
0
0
13 Jun 2024
Joint Beam Search Integrating CTC, Attention, and Transducer Decoders
Joint Beam Search Integrating CTC, Attention, and Transducer Decoders
Yui Sudo
Muhammad Shakeel
Yosuke Fukumoto
Brian Yan
Jiatong Shi
Yifan Peng
Shinji Watanabe
27
0
0
05 Jun 2024
Denoising LM: Pushing the Limits of Error Correction Models for Speech
  Recognition
Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition
Zijin Gu
Tatiana Likhomanenko
Richard He Bai
Erik McDermott
R. Collobert
Navdeep Jaitly
AuLLM
58
2
0
24 May 2024
EfficientASR: Speech Recognition Network Compression via Attention
  Redundancy and Chunk-Level FFN Optimization
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization
Jianzong Wang
Ziqi Liang
Xulong Zhang
Ning Cheng
Jing Xiao
38
0
0
30 Apr 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration
  Transducer
TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
Yu Xi
Hao Li
Baochen Yang
Haoyu Li
Hai-kun Xu
Kai Yu
35
1
0
20 Mar 2024
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity
Ziyang Ma
Guanrou Yang
Yifan Yang
Zhifu Gao
Jiaming Wang
...
Fan Yu
Qian Chen
Siqi Zheng
Shiliang Zhang
Xie Chen
AuLLM
52
41
0
13 Feb 2024
Zipformer: A faster and better encoder for automatic speech recognition
Zipformer: A faster and better encoder for automatic speech recognition
Zengwei Yao
Liyong Guo
Xiaoyu Yang
Wei Kang
Fangjun Kuang
Yifan Yang
Zengrui Jin
Long Lin
Daniel Povey
VLM
33
65
0
17 Oct 2023
Improving End-to-End Speech Processing by Efficient Text Data
  Utilization with Latent Synthesis
Improving End-to-End Speech Processing by Efficient Text Data Utilization with Latent Synthesis
Jianqiao Lu
Wenyong Huang
Nianzu Zheng
Xingshan Zeng
Y. Yeung
Xiao Chen
SyDa
32
1
0
09 Oct 2023
Human Transcription Quality Improvement
Human Transcription Quality Improvement
Jian Gao
Hanbo Sun
Cheng Cao
Zheng Du
43
2
0
24 Sep 2023
Investigating End-to-End ASR Architectures for Long Form Audio
  Transcription
Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Nithin Rao Koluguri
Samuel Kriman
Georgy Zelenfroind
Somshubra Majumdar
Dima Rekesh
Vahid Noroozi
Jagadeesh Balam
Boris Ginsburg
AuLLM
39
9
0
18 Sep 2023
End-to-End Speech Recognition and Disfluency Removal with Acoustic
  Language Model Pretraining
End-to-End Speech Recognition and Disfluency Removal with Acoustic Language Model Pretraining
Saksham Bassi
Giulio Duregon
Siddhartha Jalagam
David Roth
38
2
0
08 Sep 2023
Speech Self-Supervised Representations Benchmarking: a Case for Larger
  Probing Heads
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
S. Essid
Mirco Ravanelli
SSL
27
11
0
28 Aug 2023
Conformer-based Target-Speaker Automatic Speech Recognition for
  Single-Channel Audio
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio
Yang Zhang
Krishna C. Puvvada
Vitaly Lavrukhin
Boris Ginsburg
38
14
0
09 Aug 2023
TST: Time-Sparse Transducer for Automatic Speech Recognition
TST: Time-Sparse Transducer for Automatic Speech Recognition
Xiaohui Zhang
Mangui Liang
Zhengkun Tian
Jiangyan Yi
J. Tao
14
0
0
17 Jul 2023
Exploring the Integration of Large Language Models into Automatic Speech
  Recognition Systems: An Empirical Study
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study
Zeping Min
Jinbo Wang
AuLLM
35
13
0
13 Jul 2023
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for
  Speech Recognition and Understanding
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet
Rogier van Dalen
Shucong Zhang
S. Bhattacharya
28
6
0
12 Jul 2023
Improving RNN-Transducers with Acoustic LookAhead
Improving RNN-Transducers with Acoustic LookAhead
Vinit Unni
Ashish R. Mittal
P. Jyothi
Sunita Sarawagi
37
2
0
11 Jul 2023
Factors Affecting the Performance of Automated Speaker Verification in
  Alzheimer's Disease Clinical Trials
Factors Affecting the Performance of Automated Speaker Verification in Alzheimer's Disease Clinical Trials
Malikeh Ehghaghi
Marija Stanojevic
Ali Akram
Jekaterina Novikova
19
1
0
20 Jun 2023
Research on an improved Conformer end-to-end Speech Recognition Model
  with R-Drop Structure
Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure
Weidong Ji
Shijie Zan
Guohui Zhou
Xu Wang
SyDa
27
1
0
14 Jun 2023
Speech Self-Supervised Representation Benchmarking: Are We Doing it
  Right?
Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
S. Essid
Mirco Ravanelli
SSL
14
23
0
01 Jun 2023
Bridging the Granularity Gap for Acoustic Modeling
Bridging the Granularity Gap for Acoustic Modeling
Chen Xu
Yuhao Zhang
Chengbo Jiao
Xiaoqian Liu
Chi Hu
Xin Zeng
Tong Xiao
Anxiang Ma
Huizhen Wang
JingBo Zhu
29
6
0
27 May 2023
InterFormer: Interactive Local and Global Features Fusion for Automatic
  Speech Recognition
InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition
Zhibing Lai
Tianren Zhang
Qi Liu
Xinyuan Qian
Li-Fang Wei
Songlu Chen
Feng Chen
Xu-Cheng Yin
35
2
0
24 May 2023
Multi-Head State Space Model for Speech Recognition
Multi-Head State Space Model for Speech Recognition
Yassir Fathullah
Chunyang Wu
Yuan Shangguan
Junteng Jia
Wenhan Xiong
...
Chunxi Liu
Yangyang Shi
Ozlem Kalinli
M. Seltzer
Mark Gales
34
13
0
21 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech
  Recognition, Translation, and Understanding Tasks
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
29
17
0
18 May 2023
Enhancing multilingual speech recognition in air traffic control by
  sentence-level language identification
Enhancing multilingual speech recognition in air traffic control by sentence-level language identification
Peng Fan
Dongyue Guo
Jianwei Zhang
Bo Yang
Yi Lin
17
6
0
29 Apr 2023
A CTC Alignment-based Non-autoregressive Transformer for End-to-end
  Automatic Speech Recognition
A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition
Ruchao Fan
Wei Chu
Peng Chang
Abeer Alwan
18
10
0
15 Apr 2023
Efficient Sequence Transduction by Jointly Predicting Tokens and
  Durations
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Hainan Xu
Fei Jia
Somshubra Majumdar
Hengguan Huang
Shinji Watanabe
Boris Ginsburg
27
19
0
13 Apr 2023
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for
  Mandarin Speech Recognition
Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition
Kai Liu
Hailiang Xiong
Gangqiang Yang
Zhengfeng Du
Yewen Cao
D. Shah
18
0
0
23 Mar 2023
Enhancing Unsupervised Speech Recognition with Diffusion GANs
Enhancing Unsupervised Speech Recognition with Diffusion GANs
Xianchao Wu
DiffM
13
2
0
23 Mar 2023
Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech
  Recognition Models
Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models
Steven M. Hernandez
Ding Zhao
Shaojin Ding
A. Bruguier
Rohit Prabhavalkar
Tara N. Sainath
Yanzhang He
Ian McGraw
26
7
0
15 Mar 2023
End-to-End Speech Recognition: A Survey
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
26
153
0
03 Mar 2023
Improving Medical Speech-to-Text Accuracy with Vision-Language
  Pre-training Model
Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model
Jaeyoung Huh
Sangjoon Park
Jeonghyeon Lee
Jong Chul Ye
LM&MA
25
9
0
27 Feb 2023
Non-pooling Network for medical image segmentation
Non-pooling Network for medical image segmentation
Weihu Song
Heng Yu
SSeg
31
1
0
21 Feb 2023
Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End
  Speech Recognition
Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition
Leyuan Qu
C. Weber
S. Wermter
19
5
0
20 Feb 2023
Residual Information in Deep Speaker Embedding Architectures
Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
34
5
0
06 Feb 2023
Audio-Visual Efficient Conformer for Robust Speech Recognition
Audio-Visual Efficient Conformer for Robust Speech Recognition
Maxime Burchi
Radu Timofte
VLM
13
33
0
04 Jan 2023
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict
  decoders
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders
Yui Sudo
Muhammad Shakeel
Brian Yan
Jiatong Shi
Shinji Watanabe
30
10
0
21 Dec 2022
Improved Speech Pre-Training with Supervision-Enhanced Acoustic Unit
Improved Speech Pre-Training with Supervision-Enhanced Acoustic Unit
Pengcheng Li
Genshun Wan
Fenglin Ding
Hang Chen
Jianqing Gao
Jia-Yu Pan
Cong Liu
SSL
30
1
0
07 Dec 2022
Learning the joint distribution of two sequences using little or no
  paired data
Learning the joint distribution of two sequences using little or no paired data
Soroosh Mariooryad
Matt Shannon
Siyuan Ma
Tom Bagby
David Kao
Daisy Stanton
Eric Battenberg
RJ Skerry-Ryan
30
2
0
06 Dec 2022
EURO: ESPnet Unsupervised ASR Open-source Toolkit
EURO: ESPnet Unsupervised ASR Open-source Toolkit
Dongji Gao
Jiatong Shi
Shun-Po Chuang
Leibny Paola García-Perera
Hung-yi Lee
Shinji Watanabe
Sanjeev Khudanpur
27
8
0
30 Nov 2022
123
Next