
Open Sesame: Getting Inside BERT's Linguistic Knowledge (arXiv:1906.01698)
4 June 2019
Yongjie Lin, Y. Tan, Robert Frank

Papers citing "Open Sesame: Getting Inside BERT's Linguistic Knowledge"

Showing 50 of 166 citing papers.
Improving Semantic Matching through Dependency-Enhanced Pre-trained Model with Adaptive Fusion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jian Song, Di Liang, Rumei Li, Yun Li, Sirui Wang, Minlong Peng, Wei Wu, Yongxin Yu
16 Oct 2022

Probing of Quantitative Values in Abstractive Summarization Models
Nathan M. White
03 Oct 2022

Downstream Datasets Make Surprisingly Good Pretraining Corpora
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Kundan Krishna, Saurabh Garg, Jeffrey P. Bigham, Zachary Chase Lipton
28 Sep 2022

Negation, Coordination, and Quantifiers in Contextualized Language Models
International Conference on Computational Linguistics (COLING), 2022
A. Kalouli, Rita Sevastjanova, C. Beck, Maribel Romero
16 Sep 2022

Visual Comparison of Language Model Adaptation
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2022
Rita Sevastjanova, E. Cakmak, Haiqin Yang, Robert Bamler, Mennatallah El-Assady
17 Aug 2022

What does Transformer learn about source code?
Kechi Zhang, Ge Li, Zhi Jin
18 Jul 2022

Forming Trees with Treeformers
Recent Advances in Natural Language Processing (RANLP), 2022
Nilay Patel, Jeffrey Flanigan
14 Jul 2022

Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Xiang Zhou, Shiyue Zhang, Joey Tianyi Zhou
30 Jun 2022

STNDT: Modeling Neural Population Activity with a Spatiotemporal Transformer
Neural Information Processing Systems (NeurIPS), 2022
Trung Le, Eli Shlizerman
09 Jun 2022

A computational psycholinguistic evaluation of the syntactic abilities of Galician BERT models at the interface of dependency resolution and training time
Iria de-Dios-Flores, Marcos Garcia
06 Jun 2022

Word-order typology in Multilingual BERT: A case study in subordinate-clause detection
Dmitry Nikolaev, Sebastian Padó
24 May 2022

Acceptability Judgements via Examining the Topology of Attention Maps
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
D. Cherniavskii, Eduard Tulchinskii, Vladislav Mikhailov, Irina Proskurina, Laida Kushnareva, Ekaterina Artemova, S. Barannikov, Irina Piontkovskaya, D. Piontkovski, Evgeny Burnaev
19 May 2022

Is the Computation of Abstract Sameness Relations Human-Like in Neural Language Models?
Lukas Thoma, Benjamin Roth
12 May 2022

Extracting Latent Steering Vectors from Pretrained Language Models
Findings, 2022
Nishant Subramani, Nivedita Suresh, Matthew E. Peters
10 May 2022

Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng, Po-Chun Hsu, Hung-yi Lee
08 May 2022

UniTE: Unified Translation Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Boyi Deng, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Yang Li, Lidia S. Chao
28 Apr 2022

Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks
Haoran Yang, Piji Li, Wai Lam
10 Apr 2022

Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary, N. Chatterjee, S. K. Saha
31 Mar 2022

VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Estelle Aflalo, Meng Du, Shao-Yen Tseng, Yongfei Liu, Chenfei Wu, Nan Duan, Vasudev Lal
30 Mar 2022

GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Changye Li, D. Knopman, Weizhe Xu, T. Cohen, Serguei V. S. Pakhomov
25 Mar 2022

Word Order Does Matter (And Shuffled Language Models Know It)
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Vinit Ravishankar, Mostafa Abdou, Artur Kulmizev, Anders Søgaard
21 Mar 2022

Visualizing and Understanding Patch Interactions in Vision Transformer
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Jie Ma, Yalong Bai, Bineng Zhong, Wei Zhang, Ting Yao, Tao Mei
11 Mar 2022

Discontinuous Constituency and BERT: A Case Study of Dutch
Findings, 2022
Konstantinos Kogkalidis, G. Wijnholds
02 Mar 2022

Trees in transformers: a theoretical analysis of the Transformer's ability to represent trees
Qi He, João Sedoc, J. Rodu
16 Dec 2021

Vector Space Semantics for Lambek Calculus with Soft Subexponentials
Lachlan McPheat, Hadi Wazni, M. Sadrzadeh
22 Nov 2021

Interpreting Deep Learning Models in Natural Language Processing: A Review
Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard H. Hovy, Jiwei Li
20 Oct 2021

The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color
Cory Paik, Stéphane Aroca-Ouellette, Alessandro Roncone, Katharina Kann
15 Oct 2021

Solving Aspect Category Sentiment Analysis as a Text Generation Task
Jian Liu, Zhiyang Teng, Leyang Cui, Hanmeng Liu, Yue Zhang
14 Oct 2021

Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations
Ekaterina Taktasheva, Vladislav Mikhailov, Ekaterina Artemova
28 Sep 2021

Transformers Generalize Linearly
Jackson Petty, Robert Frank
24 Sep 2021

Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
Findings, 2021
Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou, Jian-Guang Lou
22 Sep 2021

Incorporating Residual and Normalization Layers into Analysis of Masked Language Models
Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui
15 Sep 2021

The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders
Han He, Jinho Choi
14 Sep 2021

GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks
Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi
13 Sep 2021

COMBO: State-of-the-Art Morphosyntactic Analysis
Mateusz Klimaszewski, Alina Wróblewska
11 Sep 2021

Does Pretraining for Summarization Require Knowledge Transfer?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Kundan Krishna, Jeffrey P. Bigham, Zachary Chase Lipton
10 Sep 2021

How much pretraining data do language models need to learn syntax?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner
07 Sep 2021

Legal Search in Case Law and Statute Law
International Conference on Legal Knowledge and Information Systems (JURIX), 2021
Julien Rossi, Evangelos Kanoulas
23 Aug 2021

VerbCL: A Dataset of Verbatim Quotes for Highlight Extraction in Case Law
International Conference on Information and Knowledge Management (CIKM), 2021
Julien Rossi, Svitlana Vakulenko, Evangelos Kanoulas
23 Aug 2021

Local Structure Matters Most: Perturbation Study in NLU
Findings, 2021
Louis Clouâtre, Prasanna Parthasarathi, Payel Das, Sarath Chandar
29 Jul 2021

Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases
Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun, Lingyong Yan, M. Liao, Tong Xue, Jin Xu
17 Jun 2021

Pre-Trained Models: Past, Present and Future
AI Open (AO), 2021
Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, ..., Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu
14 Jun 2021

Transient Chaos in BERT
Katsuma Inoue, Soh Ohara, Yasuo Kuniyoshi, Kohei Nakajima
06 Jun 2021

The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Ulme Wennberg, G. Henter
03 Jun 2021

Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Yichen Jiang, Asli Celikyilmaz, P. Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Joey Tianyi Zhou, Jianfeng Gao
02 Jun 2021

Using Integrated Gradients and Constituency Parse Trees to explain Linguistic Acceptability learnt by BERT
ICON, 2021
Anmol Nayak, Hariprasad Timmapathini
01 Jun 2021

TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
25 May 2021

Self-Attention Networks Can Process Bounded Hierarchical Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Shunyu Yao, Binghui Peng, Christos H. Papadimitriou, Karthik Narasimhan
24 May 2021

Accounting for Agreement Phenomena in Sentence Comprehension with Transformer Language Models: Effects of Similarity-based Interference on Surprisal and Attention
Workshop on Cognitive Modeling and Computational Linguistics (CMCL), 2021
S. Ryu, Richard L. Lewis
26 Apr 2021

Factual Probing Is [MASK]: Learning vs. Learning to Recall
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Zexuan Zhong, Dan Friedman, Danqi Chen
12 Apr 2021