1804.04235
Cited By
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
11 April 2018
Noam M. Shazeer
Mitchell Stern
ODL
Papers citing
"Adafactor: Adaptive Learning Rates with Sublinear Memory Cost"
50 / 799 papers shown
The Marginal Value of Momentum for Small Learning Rate SGD
International Conference on Learning Representations (ICLR), 2023
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
234
10
0
27 Jul 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
280
83
0
27 Jul 2023
Towards Generalist Biomedical AI
Tao Tu
Shekoofeh Azizi
Danny Driess
M. Schaekermann
Mohamed Amin
...
Yossi Matias
K. Singhal
Peter R. Florence
Alan Karthikesalingam
Vivek Natarajan
LM&MA
MedIm
AI4MH
279
410
0
26 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Neural Information Processing Systems (NeurIPS), 2023
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
429
58
0
12 Jul 2023
RoPDA: Robust Prompt-based Data Augmentation for Low-Resource Named Entity Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2023
Sihan Song
Jian Zhao
Jian Zhao
195
6
0
11 Jul 2023
Event Extraction as Question Generation and Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Di Lu
Shihao Ran
Joel R. Tetreault
A. Jaimes
205
51
0
10 Jul 2023
Scaling In-Context Demonstrations with Structured Attention
Tianle Cai
Kaixuan Huang
Jason D. Lee
Mengdi Wang
LRM
166
9
0
05 Jul 2023
CAME: Confidence-guided Adaptive Memory Efficient Optimization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yang Luo
Xiaozhe Ren
Zangwei Zheng
Zhuo Jiang
Xin Jiang
Yang You
ODL
345
35
0
05 Jul 2023
Could Small Language Models Serve as Recommenders? Towards Data-centric Cold-start Recommendations
The Web Conference (WWW), 2023
Xuansheng Wu
Huachi Zhou
Yucheng Shi
Wenlin Yao
Xiao Shi Huang
Ninghao Liu
LRM
293
29
0
29 Jun 2023
YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus
Neural Information Processing Systems (NeurIPS), 2023
David C. Uthus
Garrett Tanzer
Manfred Georg
SLR
277
71
0
27 Jun 2023
Is Pre-training Truly Better Than Meta-Learning?
Alycia Lee
P. Yu
Saumya Goyal
Yu-Xiong Wang
Oluwasanmi Koyejo
276
8
0
24 Jun 2023
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
International Conference on Learning Representations (ICLR), 2023
Rishabh Agarwal
Nino Vieillard
Yongchao Zhou
Piotr Stańczyk
Sabela Ramos
Matthieu Geist
Olivier Bachem
319
183
0
23 Jun 2023
A Reference-less Quality Metric for Automatic Speech Recognition via Contrastive-Learning of a Multi-Language Model with Self-Supervision
K. Yuksel
Thiago Castro Ferreira
Ahmet Gunduz
Mohamed Al-Badrashiny
Golara Javadi
129
7
0
21 Jun 2023
NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning
Interspeech (Interspeech), 2023
K. Yuksel
Thiago Castro Ferreira
Golara Javadi
Mohamed El-Badrashiny
Ahmet Gunduz
152
5
0
21 Jun 2023
GLIMMER: generalized late-interaction memory reranker
Michiel de Jong
Yury Zemlyanskiy
Nicholas FitzGerald
Sumit Sanghai
William W. Cohen
Joshua Ainslie
RALM
232
9
0
17 Jun 2023
Conformal Language Modeling
International Conference on Learning Representations (ICLR), 2023
Victor Quach
Adam Fisch
Tal Schuster
Adam Yala
J. Sohn
Tommi Jaakkola
Regina Barzilay
574
97
0
16 Jun 2023
Scaling Open-Vocabulary Object Detection
Neural Information Processing Systems (NeurIPS), 2023
Matthias Minderer
A. Gritsenko
N. Houlsby
VLM
ObjD
423
315
0
16 Jun 2023
Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
Xianbiao Qi
Jianan Wang
Lei Zhang
202
0
0
15 Jun 2023
Interleaving Pre-Trained Language Models and Large Language Models for Zero-Shot NL2SQL Generation
Zihui Gu
Ju Fan
Nan Tang
Songyue Zhang
Yuxin Zhang
Zui Chen
Lei Cao
Guoliang Li
Sam Madden
Xiaoyong Du
217
32
0
15 Jun 2023
AutoML in the Age of Large Language Models: Current Challenges, Future Opportunities and Risks
Alexander Tornede
Difan Deng
Theresa Eimer
Joseph Giovanelli
Aditya Mohan
...
Sarah Segel
Daphne Theodorakopoulos
Tanja Tornede
Henning Wachsmuth
Marius Lindauer
325
36
0
13 Jun 2023
AraMUS: Pushing the Limits of Data and Model Scale for Arabic Natural Language Processing
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Asaad Alghamdi
Xinyu Duan
Wei Jiang
Zhenhai Wang
Yimeng Wu
...
Yifei Zheng
Mehdi Rezagholizadeh
Baoxing Huai
Peilun Cheng
Abbas Ghaddar
VLM
140
10
0
11 Jun 2023
PoET: A generative model of protein families as sequences-of-sequences
Neural Information Processing Systems (NeurIPS), 2023
Timothy F. Truong
Tristan Bepler
SLR
211
69
0
09 Jun 2023
Leaping through tree space: continuous phylogenetic inference for rooted and unrooted trees
Genome Biology and Evolution (GBE), 2023
Matthew J. Penn
Neil Scheidwasser
Joseph Penn
C. Donnelly
D. Duchêne
Samir Bhatt
302
6
0
09 Jun 2023
Unbalanced Optimal Transport for Unbalanced Word Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yuki Arase
Han Bao
Sho Yokoi
OT
140
6
0
07 Jun 2023
Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Chujie Zheng
Pei Ke
Zheng Zhang
Shiyu Huang
BDL
240
45
0
06 Jun 2023
LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Dongfu Jiang
Xiang Ren
Bill Yuchen Lin
ELM
445
487
0
05 Jun 2023
SamToNe: Improving Contrastive Loss for Dual Encoder Retrieval Models with Same Tower Negatives
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Fedor Moiseev
Gustavo Hernández Ábrego
Peter Dornbach
I. Zitouni
Enrique Alfonseca
Zhe Dong
192
9
0
05 Jun 2023
Harnessing large-language models to generate private synthetic text
Alexey Kurakin
Natalia Ponomareva
Umar Syed
Liam MacDermed
Seth Neel
SILM
SyDa
290
54
0
02 Jun 2023
THiFLY Research at SemEval-2023 Task 7: A Multi-granularity System for CTR-based Textual Entailment and Evidence Retrieval
International Workshop on Semantic Evaluation (SemEval), 2023
Yuxuan Zhou
Ziyun Jin
Meiwei Li
Chenyi Guo
Xien Liu
Xinxin You
Ji Wu
137
12
0
02 Jun 2023
From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces
Neural Information Processing Systems (NeurIPS), 2023
Peter Shaw
Mandar Joshi
James Cohan
Jonathan Berant
Panupong Pasupat
Hexiang Hu
Urvashi Khandelwal
Kenton Lee
Kristina Toutanova
LLMAG
LM&Ro
269
75
0
31 May 2023
Toward Understanding Why Adam Converges Faster Than SGD for Transformers
Yan Pan
Yuanzhi Li
293
52
0
31 May 2023
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Paul Roit
Johan Ferret
Lior Shani
Roee Aharoni
Geoffrey Cideron
...
Olivier Bachem
G. Elidan
Avinatan Hassidim
Olivier Pietquin
Idan Szpektor
HILM
289
100
0
31 May 2023
Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training
European Conference on Artificial Intelligence (ECAI), 2023
Yijia Zhang
Yibo Han
Shijie Cao
Guohao Dai
Youshan Miao
Ting Cao
Fan Yang
Ningyi Xu
118
5
0
31 May 2023
Correcting Semantic Parses with Natural Language through Dynamic Schema Encoding
Parker Glenn
Parag Dakle
Preethi Raghavan
197
3
0
31 May 2023
Comparing and combining some popular NER approaches on Biomedical tasks
Workshop on Biomedical Natural Language Processing (BioNLP), 2023
Harsh Verma
S. Bergler
Narjes Tahaei
196
7
0
30 May 2023
Brainformers: Trading Simplicity for Efficiency
International Conference on Machine Learning (ICML), 2023
Yanqi Zhou
Nan Du
Yanping Huang
Daiyi Peng
Chang Lan
...
Zhifeng Chen
Quoc V. Le
Claire Cui
James Laudon
J. Dean
MoE
247
36
0
29 May 2023
Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Tianshu Zhang
Wei-Han Lee
Yan Koyfman
Yu-Chuan Su
Huan Sun
FedML
117
8
0
26 May 2023
Diable: Efficient Dialogue State Tracking as Operations on Tables
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Pietro Lesci
Yoshinari Fujinuma
Momchil Hardalov
Chao Shang
Yassine Benajiba
Lluís Marquez
LMTD
304
8
0
26 May 2023
Three Towers: Flexible Contrastive Learning with Pretrained Image Models
Neural Information Processing Systems (NeurIPS), 2023
Jannik Kossen
Mark Collier
Basil Mustafa
Tianlin Li
Xiaohua Zhai
Lucas Beyer
Andreas Steiner
Jesse Berent
Rodolphe Jenatton
Efi Kokiopoulou
VLM
212
18
0
26 May 2023
Learning to Imagine: Visually-Augmented Natural Language Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Tianyi Tang
Yushuo Chen
Yifan Du
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
DiffM
427
10
0
26 May 2023
Domain Aligned Prefix Averaging for Domain Generalization in Abstractive Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Pranav Ajit Nair
Sukomal Pal
Pradeepika Verma
MoMe
239
2
0
26 May 2023
Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Dongqi Pu
Yifa Wang
Vera Demberg
224
28
0
26 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Neural Information Processing Systems (NeurIPS), 2023
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
493
100
0
25 May 2023
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
162
0
0
25 May 2023
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
AAAI Conference on Artificial Intelligence (AAAI), 2023
Lei Shu
Liangchen Luo
Jayakumar Hoskere
Yun Zhu
Canoee Liu
Simon Tong
Jindong Chen
Lei Meng
KELM
LRM
273
75
0
25 May 2023
Lexinvariant Language Models
Neural Information Processing Systems (NeurIPS), 2023
Qian Huang
E. Zelikman
Sarah Chen
Yuhuai Wu
Gregory Valiant
Abigail Z. Jacobs
176
1
0
24 May 2023
The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Debayan Banerjee
Pranav Ajit Nair
Ricardo Usbeck
Chris Biemann
200
3
0
24 May 2023
A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Alessandro Stolfo
Yonatan Belinkov
Mrinmaya Sachan
MILM
KELM
LRM
280
67
0
24 May 2023
Active Learning for Natural Language Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yotam Perlitz
Ariel Gera
Michal Shmueli-Scheuer
D. Sheinwald
Noam Slonim
L. Ein-Dor
334
4
0
24 May 2023
Text encoders bottleneck compositionality in contrastive vision-language models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Amita Kamath
Jack Hessel
Kai-Wei Chang
CoGe
CLIP
VLM
273
30
0
24 May 2023