Open Sesame: Getting Inside BERT's Linguistic Knowledge

4 June 2019 · arXiv:1906.01698
Yongjie Lin, Y. Tan, Robert Frank
Papers citing "Open Sesame: Getting Inside BERT's Linguistic Knowledge"

Showing 16 of 166 citing papers.

jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Yada Pruksachatkun, Philip Yeres, Haokun Liu, Jason Phang, Phu Mon Htut, Alex Wang, Ian Tenney, Samuel R. Bowman
04 Mar 2020

A Primer in BERTology: What we know about how BERT works
Transactions of the Association for Computational Linguistics (TACL), 2020
Anna Rogers, Olga Kovaleva, Anna Rumshisky
27 Feb 2020

Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Findings of the Association for Computational Linguistics: EMNLP, 2020
Alessandro Raganato, Yves Scherrer, Jörg Tiedemann
24 Feb 2020

Feature Importance Estimation with Self-Attention Networks
European Conference on Artificial Intelligence (ECAI), 2020
Blaž Škrlj, Jannis Brugger, Nada Lavrač, Matej Petković
11 Feb 2020

oLMpics -- On what Language Model Pre-training Captures
Transactions of the Association for Computational Linguistics (TACL), 2019
Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
31 Dec 2019

Unsupervised Transfer Learning via BERT Neuron Selection
M. Valipour, E. Lee, Jaime R. Jamacaro, C. Bessega
10 Dec 2019

Do Attention Heads in BERT Track Syntactic Dependencies?
Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
27 Nov 2019

BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2019
R. Thomas McCoy, Junghyun Min, Tal Linzen
07 Nov 2019

HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi, Hamid Palangi, M. Lam, P. Smolensky, Jianfeng Gao
25 Oct 2019

Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving
Imanol Schlag, P. Smolensky, Roland Fernandez, Nebojsa Jojic, Jürgen Schmidhuber, Jianfeng Gao
15 Oct 2019

Is Multilingual BERT Fluent in Language Generation?
Samuel Rönnqvist, Jenna Kanerva, T. Salakoski, Filip Ginter
09 Oct 2019

Does BERT agree? Evaluating knowledge of structure dependence through agreement relations
Geoff Bacon, T. Regier
26 Aug 2019

Compositionality decomposed: how do neural networks generalise?
Journal of Artificial Intelligence Research (JAIR), 2019
Dieuwke Hupkes, Verna Dankers, Mathijs Mul, Elia Bruni
22 Aug 2019

On Identifiability in Transformers
International Conference on Learning Representations (ICLR), 2019
Gino Brunner, Yang Liu, Damian Pascual, Oliver Richter, Massimiliano Ciaramita, Roger Wattenhofer
12 Aug 2019

What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models
Transactions of the Association for Computational Linguistics (TACL), 2019
Allyson Ettinger
31 Jul 2019

Theoretical Limitations of Self-Attention in Neural Sequence Models
Transactions of the Association for Computational Linguistics (TACL), 2019
Michael Hahn
16 Jun 2019