Open Sesame: Getting Inside BERT's Linguistic Knowledge

4 June 2019 · arXiv:1906.01698
Yongjie Lin, Y. Tan, Robert Frank
Papers citing "Open Sesame: Getting Inside BERT's Linguistic Knowledge"

Showing 16 of 166 citing papers.

jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Yada Pruksachatkun, Philip Yeres, Haokun Liu, Jason Phang, Phu Mon Htut, Alex Wang, Ian Tenney, Samuel R. Bowman
04 Mar 2020

A Primer in BERTology: What we know about how BERT works
Transactions of the Association for Computational Linguistics (TACL), 2020
Anna Rogers, Olga Kovaleva, Anna Rumshisky
27 Feb 2020

Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Findings of the Association for Computational Linguistics: EMNLP, 2020
Alessandro Raganato, Yves Scherrer, Jörg Tiedemann
24 Feb 2020

Feature Importance Estimation with Self-Attention Networks
European Conference on Artificial Intelligence (ECAI), 2020
Blaž Škrlj, Jannis Brugger, Nada Lavrač, Matej Petković
11 Feb 2020

oLMpics -- On what Language Model Pre-training Captures
Transactions of the Association for Computational Linguistics (TACL), 2019
Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
31 Dec 2019

Unsupervised Transfer Learning via BERT Neuron Selection
M. Valipour, E. Lee, Jaime R. Jamacaro, C. Bessega
10 Dec 2019

Do Attention Heads in BERT Track Syntactic Dependencies?
Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
27 Nov 2019

BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2019
R. Thomas McCoy, Junghyun Min, Tal Linzen
07 Nov 2019

HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi, Hamid Palangi, M. Lam, P. Smolensky, Jianfeng Gao
25 Oct 2019

Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving
Imanol Schlag, P. Smolensky, Roland Fernandez, Nebojsa Jojic, Jürgen Schmidhuber, Jianfeng Gao
15 Oct 2019

Is Multilingual BERT Fluent in Language Generation?
Samuel Rönnqvist, Jenna Kanerva, T. Salakoski, Filip Ginter
09 Oct 2019

Does BERT agree? Evaluating knowledge of structure dependence through agreement relations
Geoff Bacon, T. Regier
26 Aug 2019

Compositionality decomposed: how do neural networks generalise?
Journal of Artificial Intelligence Research (JAIR), 2019
Dieuwke Hupkes, Verna Dankers, Mathijs Mul, Elia Bruni
22 Aug 2019

On Identifiability in Transformers
International Conference on Learning Representations (ICLR), 2019
Gino Brunner, Yang Liu, Damian Pascual, Oliver Richter, Massimiliano Ciaramita, Roger Wattenhofer
12 Aug 2019

What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models
Transactions of the Association for Computational Linguistics (TACL), 2019
Allyson Ettinger
31 Jul 2019

Theoretical Limitations of Self-Attention in Neural Sequence Models
Transactions of the Association for Computational Linguistics (TACL), 2019
Michael Hahn
16 Jun 2019