1606.08415
Gaussian Error Linear Units (GELUs)
27 June 2016
Dan Hendrycks
Kevin Gimpel
Papers citing
"Gaussian Error Linear Units (GELUs)"
30 / 780 papers shown
Title
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
41
754
0
24 Jun 2020
Categorical Normalizing Flows via Continuous Transformations
Phillip Lippe
E. Gavves
BDL
15
43
0
17 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
24
432
0
11 Jun 2020
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Anne Lauscher
Olga Majewska
Leonardo F. R. Ribeiro
Iryna Gurevych
Nikolai Rozanov
Goran Glavaš
KELM
31
79
0
24 May 2020
Normalized Attention Without Probability Cage
Oliver Richter
Roger Wattenhofer
14
21
0
19 May 2020
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
Sam Coope
Tyler Farghly
D. Gerz
Ivan Vulić
Matthew Henderson
19
62
0
18 May 2020
UDapter: Language Adaptation for Truly Universal Dependency Parsing
A. Üstün
Arianna Bisazza
G. Bouma
Gertjan van Noord
27
113
0
29 Apr 2020
Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond
Fanghui Liu
Xiaolin Huang
Yudong Chen
Johan A. K. Suykens
BDL
36
172
0
23 Apr 2020
Evolving Normalization-Activation Layers
Hanxiao Liu
Andrew Brock
Karen Simonyan
Quoc V. Le
12
79
0
06 Apr 2020
Keyphrase Extraction with Span-based Feature Representations
Funan Mu
Zhenting Yu
Lifeng Wang
Yequan Wang
Qingyu Yin
Yibo Sun
Liqun Liu
Teng Ma
Jing Tang
Xing Zhou
24
17
0
13 Feb 2020
Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors
Jingliang Duan
Yang Guan
Shengbo Eben Li
Yangang Ren
B. Cheng
OffRL
15
173
0
09 Jan 2020
Machine Learning from a Continuous Viewpoint
Weinan E
Chao Ma
Lei Wu
23
102
0
30 Dec 2019
TreeGen: A Tree-Based Transformer Architecture for Code Generation
Zeyu Sun
Qihao Zhu
Yingfei Xiong
Yican Sun
Lili Mou
Lu Zhang
17
173
0
22 Nov 2019
Symmetrical Gaussian Error Linear Units (SGELUs)
Chao Yu
Zhiguo Su
4
10
0
10 Nov 2019
ConveRT: Efficient and Accurate Conversational Representations from Transformers
Matthew Henderson
I. Casanueva
Nikola Mrkšić
Pei-hao Su
Tsung-Hsien
Ivan Vulić
13
196
0
09 Nov 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
41
10,583
0
29 Oct 2019
Generative Pre-Training for Speech with Autoregressive Predictive Coding
Yu-An Chung
James R. Glass
SSL
15
173
0
23 Oct 2019
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Yongqiang Wang
Abdel-rahman Mohamed
Duc Le
Chunxi Liu
Alex Xiao
...
Xiaohui Zhang
Frank Zhang
Christian Fuegen
Geoffrey Zweig
M. Seltzer
14
248
0
22 Oct 2019
Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base
Tao Shen
Xiubo Geng
Tao Qin
Daya Guo
Duyu Tang
Nan Duan
Guodong Long
Daxin Jiang
25
81
0
11 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
64
6,370
0
26 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
249
205
0
25 Sep 2019
Cross-Lingual Natural Language Generation via Pre-Training
Zewen Chi
Li Dong
Furu Wei
Wenhui Wang
Xian-Ling Mao
Heyan Huang
19
136
0
23 Sep 2019
Multi-Task Self-Supervised Learning for Disfluency Detection
Shaolei Wang
Wanxiang Che
Qi Liu
Pengda Qin
Ting Liu
William Yang Wang
SSL
14
56
0
15 Aug 2019
A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
Elman Mansimov
Alex Jinpeng Wang
Sean Welleck
Kyunghyun Cho
AIMat
20
46
0
29 May 2019
Language Modeling with Deep Transformers
Kazuki Irie
Albert Zeyer
Ralf Schlüter
Hermann Ney
KELM
27
172
0
10 May 2019
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong
Nan Yang
Wenhui Wang
Furu Wei
Xiaodong Liu
Yu-Chiang Frank Wang
Jianfeng Gao
M. Zhou
H. Hon
ELM
AI4CE
77
1,550
0
08 May 2019
Generating Long Sequences with Sparse Transformers
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
11
1,847
0
23 Apr 2019
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
Shijie Wu
Mark Dredze
VLM
SSeg
22
671
0
19 Apr 2019
Neural Empirical Bayes
Saeed Saremi
Aapo Hyvärinen
10
65
0
06 Mar 2019
NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation
Anastasis Kratsios
Cody B. Hyndman
OOD
22
17
0
31 Aug 2018