1902.08295
Cited By
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
21 February 2019
Jonathan Shen
Patrick Nguyen
Yonghui Wu
Zhiwen Chen
Mengzhao Chen
Ye Jia
Anjuli Kannan
Tara N. Sainath
Yuan Cao
Chung-Cheng Chiu
Yanzhang He
J. Chorowski
Smit Hinsu
Stella Laurenzo
James Qin
Orhan Firat
Wolfgang Macherey
Suyog Gupta
Ankur Bapna
Shuyuan Zhang
Ruoming Pang
Ron J. Weiss
Rohit Prabhavalkar
Qiao Liang
Benoit Jacob
Bowen Liang
HyoukJoong Lee
Ciprian Chelba
Sébastien Jean
Yue Liu
Melvin Johnson
Rohan Anil
Rajat Tibrewal
Xiaobing Liu
Akiko Eriguchi
Navdeep Jaitly
Naveen Ari
Colin Cherry
Parisa Haghani
Otavio Good
Youlong Cheng
R. Álvarez
Isaac Caswell
Wei-Ning Hsu
Zongheng Yang
Kuan Wang
Ekaterina Gonina
Katrin Tomanek
Ben Vanik
Zelin Wu
Llion Jones
M. Schuster
Yanping Huang
Dehao Chen
Kazuki Irie
George F. Foster
J. Richardson
Klaus Macherey
A. Bruguier
Heiga Zen
Colin Raffel
Shankar Kumar
Kanishka Rao
David Rybach
M. Murray
Vijayaditya Peddinti
M. Krikun
M. Bacchiani
T. Jablin
R. Suderman
Ian Williams
Benjamin Lee
Deepti Bhatia
Justin Carlson
Semih Yavuz
Yu Zhang
Ian McGraw
M. Galkin
Qi Ge
Golan Pundak
Chad Whipkey
Todd Wang
Uri Alon
Dmitry Lepikhin
Ye Tian
S. Sabour
William Chan
Shubham Toshniwal
Baohua Liao
M. Nirschl
Pat Rondon
VLM
Papers citing
"Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling"
50 / 162 papers shown
SequenceLayers: Sequence Processing and Streaming Neural Networks Made Easy
RJ Skerry-Ryan
Julián Salazar
Soroosh Mariooryad
David Kao
Daisy Stanton
...
Matt Shannon
Ron J. Weiss
Robin Scheibler
Jonas Rothfuss
Tom Bagby
AI4TS
85
0
0
31 Jul 2025
LLM-Synth4KWS: Scalable Automatic Generation and Synthesis of Confusable Data for Custom Keyword Spotting
Pai Zhu
Quan Wang
Dhruuv Agarwal
Kurt Partridge
111
1
0
29 May 2025
GE2E-KWS: Generalized End-to-End Training and Evaluation for Zero-shot Keyword Spotting
Spoken Language Technology Workshop (SLT), 2024
Pai Zhu
Jacob Bartel
Dhruuv Agarwal
Kurt Partridge
Hyun-jin Park
Quan Wang
186
3
0
22 Oct 2024
Synth4Kws: Synthesized Speech for User Defined Keyword Spotting in Low Resource Environments
Pai Zhu
Dhruuv Agarwal
Jacob Bartel
Kurt Partridge
Hyun Jin Park
Quan Wang
194
3
0
23 Jul 2024
SimulTron: On-Device Simultaneous Speech to Speech Translation
A. Agranovich
Eliya Nachmani
Oleg Rybakov
Yifan Ding
Ye Jia
Nadav Bar
Heiga Zen
Michelle Tadmor Ramanovich
173
0
0
04 Jun 2024
Deferred NAM: Low-latency Top-K Context Injection via Deferred Context Encoding for Non-Streaming ASR
Zelin Wu
Gan Song
Christopher Li
Pat Rondon
Zhong Meng
...
D. Caseiro
Golan Pundak
Tsendsuren Munkhdalai
Angad Chandorkar
Rohit Prabhavalkar
302
5
0
15 Apr 2024
Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models
Rohit Prabhavalkar
Zhong Meng
Weiran Wang
Adam Stooke
Xingyu Cai
Yanzhang He
Arun Narayanan
Dongseong Hwang
Tara N. Sainath
Pedro J. Moreno
196
11
0
27 Feb 2024
Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Junwen Bai
Yue Liu
Qiujia Li
Tara N. Sainath
Trevor Strohman
337
7
0
17 Jan 2024
Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers
Guru Prakash Arumugam
Shuo-yiin Chang
Tara N. Sainath
Rohit Prabhavalkar
Quan Wang
Shaan Bijwadia
209
4
0
18 Dec 2023
Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments
Shanqing Cai
Subhashini Venugopalan
Katie Seaver
Xiang Xiao
Katrin Tomanek
...
Daniel E Vance
Blair Casey
Steve M. Gleason
Philip Q. Nelson
Michael P. Brenner
244
10
0
03 Dec 2023
Deep Audio Analyzer: a Framework to Industrialize the Research on Audio Forensics
Valerio Francesco Puglisi
O. Giudice
Sebastiano Battiato
193
1
0
29 Oct 2023
Privacy-preserving and Privacy-attacking Approaches for Speech and Audio -- A Survey
Yuchen Liu
Apu Kapadia
Donald Williamson
AAML
241
1
0
26 Sep 2023
MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods
International Conference on Learning Representations (ICLR), 2023
M. Finkelstein
Subhajit Naskar
Mehdi Mirzazadeh
Apurva Shah
Markus Freitag
405
36
0
19 Sep 2023
Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition
Interspeech (Interspeech), 2023
Xianzhao Chen
Yist Y. Lin
Kang Wang
Yi He
Zejun Ma
114
4
0
09 Jun 2023
Edit Distance based RL for RNNT decoding
DongSeon Hwang
Changwan Ryu
K. Sim
165
0
0
31 May 2023
Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR
Interspeech (Interspeech), 2023
Wenjie Huang
Hao Zhang
Shankar Kumar
Shuo-yiin Chang
Tara N. Sainath
212
3
0
28 May 2023
Modular Domain Adaptation for Conformer-Based Streaming ASR
Interspeech (Interspeech), 2023
Qiujia Li
Yue Liu
DongSeon Hwang
Tara N. Sainath
P. M. Mengibar
190
13
0
22 May 2023
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference
Neural Information Processing Systems (NeurIPS), 2023
Tao Lei
Junwen Bai
Siddhartha Brahma
Joshua Ainslie
Kenton Lee
...
Vincent Zhao
Yuexin Wu
Yue Liu
Yu Zhang
Ming-Wei Chang
BDL
AI4CE
217
80
0
11 Apr 2023
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Brian Yan
Jiatong Shi
Yun Tang
Hirofumi Inaguma
Yifan Peng
...
Zhaoheng Ni
Moto Hira
Soumi Maiti
J. Pino
Shinji Watanabe
222
21
0
10 Apr 2023
A Deliberation-based Joint Acoustic and Text Decoder
Interspeech (Interspeech), 2021
S. Mavandadi
Tara N. Sainath
Ke Hu
Zelin Wu
133
7
0
23 Mar 2023
End-to-End Speech Recognition: A Survey
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schluter
Shinji Watanabe
VLM
282
243
0
03 Mar 2023
Defending against Adversarial Audio via Diffusion Model
International Conference on Learning Representations (ICLR), 2023
Shutong Wu
Zhenghao Hu
Ming-Yu Liu
Weili Nie
Chaowei Xiao
DiffM
214
32
0
02 Mar 2023
Locale Encoding For Scalable Multilingual Keyword Spotting Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Pai Zhu
Hyun Jin Park
Alex Park
Angelo Scorza Scarpati
Ignacio López Moreno
171
7
0
25 Feb 2023
PyGlove: Efficiently Exchanging ML Ideas as Code
Daiyi Peng
Xuanyi Dong
Esteban Real
Yifeng Lu
Quoc V. Le
113
0
0
03 Feb 2023
Efficient Domain Adaptation for Speech Foundation Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yue Liu
DongSeon Hwang
Zhouyuan Huo
Junwen Bai
Guru Prakash
...
K. Sim
Yu Zhang
Wei Han
Trevor Strohman
F. Beaufays
AI4CE
260
30
0
03 Feb 2023
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Chao-Han Huck Yang
Yue Liu
Yu Zhang
Nanxin Chen
Rohit Prabhavalkar
Tara N. Sainath
Trevor Strohman
178
32
0
19 Jan 2023
Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models
Journal of Biomedical Informatics (JBI), 2022
T. M. Buonocore
Claudio Crema
A. Redolfi
Riccardo Bellazzi
Enea Parimbelli
LM&MA
141
29
0
20 Dec 2022
Exploiting Category Names for Few-Shot Classification with Vision-Language Models
Taihong Xiao
Zirui Wang
Liangliang Cao
Jiahui Yu
Shengyang Dai
Ming-Hsuan Yang
VLM
MLLM
251
5
0
29 Nov 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
310
75
0
17 Nov 2022
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems
Spoken Language Technology Workshop (SLT), 2022
Shaan Bijwadia
Shuo-yiin Chang
Yue Liu
Tara N. Sainath
Chaoyang Zhang
Yanzhang He
125
9
0
01 Nov 2022
Textless Direct Speech-to-Speech Translation with Discrete Speech Representation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Xinjian Li
Ye Jia
Chung-Cheng Chiu
269
33
0
31 Oct 2022
Streaming Parrotron for on-device speech-to-speech conversion
Interspeech (Interspeech), 2022
Oleg Rybakov
Fadi Biadsy
Xia Zhang
Liyang Jiang
Phoenix Meadowlark
Shivani Agrawal
257
4
0
25 Oct 2022
ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition
Sanchit Gandhi
Patrick von Platen
Alexander M. Rush
141
27
0
24 Oct 2022
Scaling Up Deliberation for Multilingual ASR
Spoken Language Technology Workshop (SLT), 2022
Ke Hu
Yue Liu
Tara N. Sainath
LRM
304
10
0
11 Oct 2022
A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation
Interspeech (Interspeech), 2022
Tom O'Malley
A. Narayanan
Quan Wang
155
5
0
14 Sep 2022
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition
Interspeech (Interspeech), 2022
Kartik Audhkhasi
Yinghui Huang
Bhuvana Ramabhadran
Pedro J. Moreno
128
5
0
13 Sep 2022
A Language Agnostic Multilingual Streaming On-Device ASR System
Interspeech (Interspeech), 2022
Yue Liu
Tara N. Sainath
Ruoming Pang
Shuo-yiin Chang
Qiumin Xu
...
Qiao Liang
Heguang Liu
Yanzhang He
Parisa Haghani
Sameer Bidichandani
AuLLM
172
13
0
29 Aug 2022
RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks
IEEE Signal Processing Letters (SPL), 2022
Mohammad Esmaeilpour
Nourhene Chaalia
P. Cardinal
AAML
192
2
0
14 Jul 2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu
Yuanzhong Xu
Jing Yu Koh
Thang Luong
Gunjan Baid
...
Zarana Parekh
Xin Li
Han Zhang
Jason Baldridge
Yonghui Wu
EGVM
641
1,359
0
22 Jun 2022
When does dough become a bagel? Analyzing the remaining mistakes on ImageNet
Neural Information Processing Systems (NeurIPS), 2022
Vijay Vasudevan
Benjamin Caine
Raphael Gontijo-Lopes
Sara Fridovich-Keil
Rebecca Roelofs
VLM
UQCV
197
69
0
09 May 2022
Building Machine Translation Systems for the Next Thousand Languages
Ankur Bapna
Isaac Caswell
Julia Kreutzer
Orhan Firat
D. Esch
...
Apurva Shah
Yanping Huang
Zhiwen Chen
Yonghui Wu
Macduff Hughes
299
108
0
09 May 2022
Online Model Compression for Federated Learning with Large Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Tien-Ju Yang
Yonghui Xiao
Giovanni Motta
F. Beaufays
Rajiv Mathews
Mingqing Chen
FedML
MQ
179
11
0
06 May 2022
A Conformer-based Waveform-domain Neural Acoustic Echo Canceller Optimized for ASR Accuracy
Interspeech (Interspeech), 2022
S. Panchapagesan
A. Narayanan
T. Shabestary
Shuai Shao
N. Howard
Alex Park
James Walker
A. Gruenstein
155
9
0
06 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
661
1,596
0
04 May 2022
The Implicit Length Bias of Label Smoothing on Beam Search Decoding
Bowen Liang
Pidong Wang
Yuan Cao
208
1
0
02 May 2022
Mask scalar prediction for improving robust automatic speech recognition
A. Narayanan
James Walker
S. Panchapagesan
N. Howard
Yuma Koizumi
184
4
0
26 Apr 2022
E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR
Interspeech (Interspeech), 2022
Wenjie Huang
Shuo-yiin Chang
David Rybach
Rohit Prabhavalkar
Tara N. Sainath
Cyril Allauzen
Cal Peyser
Zhiyun Lu
VLM
187
27
0
22 Apr 2022
Scaling Up Models and Data with t5x and seqio
Journal of machine learning research (JMLR), 2022
Adam Roberts
Hyung Won Chung
Anselm Levskaya
Gaurav Mishra
James Bradbury
...
Brennan Saeta
Ryan Sepassi
A. Spiridonov
Joshua Newlan
Andrea Gesmundo
ALM
289
213
0
31 Mar 2022
4-bit Conformer with Native Quantization Aware Training for Speech Recognition
Interspeech (Interspeech), 2022
Shaojin Ding
Phoenix Meadowlark
Yanzhang He
Lukasz Lew
Shivani Agrawal
Oleg Rybakov
MQ
371
44
0
29 Mar 2022
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation
Interspeech (Interspeech), 2022
Ye Jia
Yifan Ding
Ankur Bapna
Colin Cherry
Yu Zhang
Alexis Conneau
Nobuyuki Morioka
222
24
0
24 Mar 2022