1908.08593
Revealing the Dark Secrets of BERT
21 August 2019
Olga Kovaleva
Alexey Romanov
Anna Rogers
Anna Rumshisky
Papers citing
"Revealing the Dark Secrets of BERT"
50 / 185 papers shown
Title
Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses
Aina Garí Soler
Marianna Apidianaki
MILM
284
70
0
29 Apr 2021
Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov
O. Serikov
Ekaterina Artemova
82
9
0
26 Apr 2021
Probing for Bridging Inference in Transformer Language Models
Onkar Pandit
Yufang Hou
94
15
0
19 Apr 2021
Knowledge Neurons in Pretrained Transformers
Damai Dai
Li Dong
Y. Hao
Zhifang Sui
Baobao Chang
Furu Wei
KELM
MU
155
466
0
18 Apr 2021
Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders
Fangyu Liu
Ivan Vulić
Anna Korhonen
Nigel Collier
VLM
OffRL
124
121
0
16 Apr 2021
Probing Across Time: What Does RoBERTa Know and When?
Leo Z. Liu
Yizhong Wang
Jungo Kasai
Hannaneh Hajishirzi
Noah A. Smith
KELM
114
88
0
16 Apr 2021
Effect of Post-processing on Contextualized Word Representations
Hassan Sajjad
Firoj Alam
Fahim Dalvi
Nadir Durrani
61
9
0
15 Apr 2021
DirectProbe: Studying Representations without Classifiers
Yichu Zhou
Vivek Srikumar
97
29
0
13 Apr 2021
Transformers: "The End of History" for NLP?
Anton Chernyavskiy
Dmitry Ilvovsky
Preslav Nakov
117
30
0
09 Apr 2021
Attention Head Masking for Inference Time Content Selection in Abstractive Summarization
Shuyang Cao
Lu Wang
CVBM
52
12
0
06 Apr 2021
SparseBERT: Rethinking the Importance Analysis in Self-attention
Han Shi
Jiahui Gao
Xiaozhe Ren
Hang Xu
Xiaodan Liang
Zhenguo Li
James T. Kwok
97
54
0
25 Feb 2021
On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations
Laura Pérez-Mayos
Roberto Carlini
Miguel Ballesteros
Leo Wanner
62
7
0
27 Jan 2021
Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT
Isabel Papadimitriou
Ethan A. Chi
Richard Futrell
Kyle Mahowald
85
44
0
26 Jan 2021
Regulatory Compliance through Doc2Doc Information Retrieval: A case study in EU/UK legislation where text similarity has limitations
Ilias Chalkidis
Manos Fergadiotis
Nikolaos Manginas
Eva Katakalou
Prodromos Malakasiotis
AILaw
52
27
0
26 Jan 2021
The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT
Madhura Pande
Aakriti Budhraja
Preksha Nema
Pratyush Kumar
Mitesh M. Khapra
66
19
0
22 Jan 2021
Classifying Scientific Publications with BERT -- Is Self-Attention a Feature Selection Method?
Andrés García-Silva
José Manuél Gómez-Pérez
43
11
0
20 Jan 2021
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks
Zhengyan Zhang
Guangxuan Xiao
Yongwei Li
Tian Lv
Fanchao Qi
Zhiyuan Liu
Yasheng Wang
Xin Jiang
Maosong Sun
AAML
153
74
0
18 Jan 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
103
17
0
12 Jan 2021
Inserting Information Bottlenecks for Attribution in Transformers
Zhiying Jiang
Raphael Tang
Ji Xin
Jimmy J. Lin
55
6
0
27 Dec 2020
Gender Bias in Multilingual Neural Machine Translation: The Architecture Matters
Marta R. Costa-jussá
Carlos Escolano
Christine Basta
Javier Ferrando
Roser Batlle-Roca
Ksenia Kharitonova
74
18
0
24 Dec 2020
Pre-Training a Language Model Without Human Language
Cheng-Han Chiang
Hung-yi Lee
71
13
0
22 Dec 2020
Self-Supervised Learning for Visual Summary Identification in Scientific Publications
Shintaro Yamamoto
Anne Lauscher
Simone Paolo Ponzetto
Goran Glavaš
Shigeo Morishima
SSL
34
3
0
21 Dec 2020
Infusing Finetuning with Semantic Dependencies
Zhaofeng Wu
Hao Peng
Noah A. Smith
71
37
0
10 Dec 2020
Know What You Don't Need: Single-Shot Meta-Pruning for Attention Heads
Zhengyan Zhang
Fanchao Qi
Zhiyuan Liu
Qun Liu
Maosong Sun
VLM
91
31
0
07 Nov 2020
Influence Patterns for Explaining Information Flow in BERT
Kaiji Lu
Zifan Wang
Piotr (Peter) Mardziel
Anupam Datta
GNN
103
16
0
02 Nov 2020
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark
Tatiana Shavrina
Alena Fenogenova
Anton A. Emelyanov
Denis Shevelev
Ekaterina Artemova
Valentin Malykh
Vladislav Mikhailov
Maria Tikhonova
Andrey Chertok
Andrey Evlampiev
VLM
ELM
89
82
0
29 Oct 2020
Fine-grained Information Status Classification Using Discourse Context-Aware BERT
Yufang Hou
48
6
0
26 Oct 2020
Towards Fully Bilingual Deep Language Modeling
Li-Hsin Chang
S. Pyysalo
Jenna Kanerva
Filip Ginter
67
3
0
22 Oct 2020
Optimal Subarchitecture Extraction For BERT
Adrian de Wynter
Daniel J. Perry
MQ
96
18
0
20 Oct 2020
Mischief: A Simple Black-Box Attack Against Transformer Architectures
Adrian de Wynter
AAML
74
1
0
16 Oct 2020
Detecting ESG topics using domain-specific language models and data augmentation approaches
Timothy Nugent
N. Stelea
Jochen L. Leidner
67
13
0
16 Oct 2020
Understanding Neural Abstractive Summarization Models via Uncertainty
Jiacheng Xu
Shrey Desai
Greg Durrett
UQLM
81
47
0
15 Oct 2020
Does Chinese BERT Encode Word Structure?
Yile Wang
Leyang Cui
Yue Zhang
74
6
0
15 Oct 2020
Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI
Alon Jacovi
Ana Marasović
Tim Miller
Yoav Goldberg
328
450
0
15 Oct 2020
Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin
Rodrigo Nogueira
Andrew Yates
VLM
393
628
0
13 Oct 2020
Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations
Nikolaos Manginas
Ilias Chalkidis
Prodromos Malakasiotis
44
4
0
12 Oct 2020
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
Z. Li
Hai Zhao
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
62
15
0
11 Oct 2020
PyMT5: multi-mode translation of natural language and Python code with transformers
Colin B. Clement
Dawn Drain
Jonathan Timcheck
Alexey Svyatkovskiy
Neel Sundaresan
84
157
0
07 Oct 2020
BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking Scalar Adjectives with Contextualised Representations
Aina Garí Soler
Marianna Apidianaki
39
19
0
06 Oct 2020
Pretrained Language Model Embryology: The Birth of ALBERT
Cheng-Han Chiang
Sung-Feng Huang
Hung-yi Lee
69
42
0
06 Oct 2020
Linguistic Profiling of a Neural Language Model
Alessio Miaschi
D. Brunato
F. Dell’Orletta
Giulia Venturi
101
49
0
05 Oct 2020
Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior
Zi Lin
Jeremiah Zhe Liu
Ziao Yang
Nan Hua
Dan Roth
94
47
0
05 Oct 2020
Rethinking Attention with Performers
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
...
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
196
1,605
0
30 Sep 2020
Bridging Information-Seeking Human Gaze and Machine Reading Comprehension
J. Malmaud
R. Levy
Yevgeni Berzak
78
33
0
30 Sep 2020
Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference
Bang An
Jie Lyu
Zhenyi Wang
Chunyuan Li
Changwei Hu
Fei Tan
Ruiyi Zhang
Yifan Hu
Changyou Chen
AAML
97
28
0
20 Sep 2020
Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models
Joseph F DeRose
Jiayao Wang
M. Berger
65
84
0
03 Sep 2020
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li
Hanyin Wang
Yuan Luo
70
66
0
03 Sep 2020
On Commonsense Cues in BERT for Solving Commonsense Tasks
Leyang Cui
Sijie Cheng
Yu Wu
Yue Zhang
SSL
CML
LRM
57
15
0
10 Aug 2020
ConvBERT: Improving BERT with Span-based Dynamic Convolution
Zihang Jiang
Weihao Yu
Daquan Zhou
Yunpeng Chen
Jiashi Feng
Shuicheng Yan
135
162
0
06 Aug 2020
Deep Learning Brasil -- NLP at SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets
Manoel Veríssimo dos Santos Neto
Ayrton Amaral
Nádia Félix F. da Silva
A. S. Soares
23
4
0
28 Jul 2020