Open Sesame: Getting Inside BERT's Linguistic Knowledge

4 June 2019
Yongjie Lin, Y. Tan, Robert Frank
ArXiv (abs) · PDF · HTML

Papers citing "Open Sesame: Getting Inside BERT's Linguistic Knowledge"

50 / 166 papers shown
Are We Paying Attention to Her? Investigating Gender Disambiguation and Attention in Machine Translation
Chiara Manna, Afra Alishahi, Frédéric Blain, Eva Vanmassenhove
13 May 2025

Linguistic Interpretability of Transformer-based Language Models: a systematic review
Miguel López-Otal, Jorge Gracia, Jordi Bernad, Carlos Bobed, Lucía Pitarch-Ballesteros, Emma Anglés-Herrero
Tags: VLM
09 Apr 2025

Construction Identification and Disambiguation Using BERT: A Case Study of NPN
Wesley Scivetti, Nathan Schneider
24 Mar 2025

Learning Task Representations from In-Context Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Baturay Saglam, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi
08 Feb 2025

FinchGPT: a Transformer based language model for birdsong analysis
Kosei Kobayashi, Kosuke Matsuzaki, Masaya Taniguchi, Keisuke Sakaguchi, Kentaro Inui, Kentaro Abe
01 Feb 2025

Reverse Probing: Evaluating Knowledge Transfer via Finetuned Task Embeddings for Coreference Resolution
Workshop on Representation Learning for NLP (RepL4NLP), 2025
Tatiana Anikina, Arne Binder, David Harbecke, Stalin Varanasi, Leonhard Hennig, Simon Ostermann, Sebastian Möller, Josef van Genabith
31 Jan 2025

Beyond Human-Like Processing: Large Language Models Perform Equivalently on Forward and Backward Scientific Text
Xiaoliang Luo, Michael Ramscar, Bradley C. Love
17 Nov 2024

Tokenization and Morphology in Multilingual Language Models: A Comparative Analysis of mT5 and ByT5
Thao Anh Dang, Limor Raviv, Lukas Galke
15 Oct 2024

Investigating large language models for their competence in extracting grammatically sound sentences from transcribed noisy utterances
Conference on Computational Natural Language Learning (CoNLL), 2024
Alina Wróblewska
07 Oct 2024

Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability
International Conference on Computational Linguistics (COLING), 2024
Xufeng Duan, Xinyu Zhou, Bei Xiao, Zhenguang G. Cai
Tags: MILM
24 Sep 2024

Cultural Value Differences of LLMs: Prompt, Language, and Model Size
Qishuai Zhong, Yike Yun, Aixin Sun
17 Jun 2024

Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
Neural Information Processing Systems (NeurIPS), 2024
Chengzhengxu Li, Xiaoming Liu, Zhaohan Zhang, Yichen Wang, Chen Liu, Y. Lan, Chao Shen
15 Jun 2024

What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
Liyi Zhang, Michael Y. Li, Thomas Griffiths
06 Jun 2024

Learning Syntax Without Planting Trees: Understanding Hierarchical Generalization in Transformers
Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov
25 Apr 2024

What do Transformers Know about Government?
Jue Hou, Anisia Katinskaia, Lari Kotilainen, Sathianpong Trangcasanchai, Anh Vu, R. Yangarber
22 Apr 2024

Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence
Hsiu-Wei Yang, Abhinav Agrawal, Pavlos Fragkogiannis, Shubham Nitin Mulay
27 Mar 2024

What has LeBenchmark Learnt about French Syntax?
Zdravko Dugonjić, Adrien Pupier, Benjamin Lecouteux, Maximin Coavoux
04 Mar 2024

Topic Aware Probing: From Sentence Length Prediction to Idiom Identification how reliant are Neural Language Models on Topic?
Vasudevan Nedumpozhimana, John D. Kelleher
04 Mar 2024

Unveiling Linguistic Regions in Large Language Models
Zhihao Zhang, Jun Zhao, Tao Gui, Xuanjing Huang
22 Feb 2024

Towards Probing Contact Center Large Language Models
Varun Nathan, Ayush Kumar, Digvijay Ingle, Jithendra Vepa
26 Dec 2023

Deep de Finetti: Recovering Topic Distributions from Large Language Models
Liyi Zhang, R. Thomas McCoy, T. Sumers, Jian-Qiao Zhu, Thomas Griffiths
Tags: BDL
21 Dec 2023

Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning
IEEE International Conference on Systems, Man and Cybernetics (SMC), 2023
Jinta Weng, Jiarui Zhang, Yue Hu, Daidong Fa, Xiaofeng Xu, Heyan Huang
13 Dec 2023

Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars
Neural Information Processing Systems (NeurIPS), 2023
Kaiyue Wen, Yuchen Li, Bing Liu, Andrej Risteski
03 Dec 2023

How Well Do Text Embedding Models Understand Syntax?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yan Zhang, Zhaopeng Feng, Zhiyang Teng, Zuozhu Liu, Haizhou Li
14 Nov 2023

How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure
Michael Wilson, Jackson Petty, Robert Frank
08 Nov 2023

Retrofitting Light-weight Language Models for Emotions using Supervised Contrastive Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Sapan Shah, Sreedhar Reddy, Pushpak Bhattacharyya
29 Oct 2023

Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras
Tags: AAML
26 Oct 2023

Large Language Models are biased to overestimate profoundness
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Eugenio Herrera-Berg, Tomás Vergara Browne, Pablo León-Villagrá, Marc-Lluís Vives, Cristian Buc Calderon
Tags: ELM
22 Oct 2023

Disentangling the Linguistic Competence of Privacy-Preserving BERT
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Stefan Arnold, Nils Kemmerzell, Annika Schreiner
17 Oct 2023

Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi, Grzegorz Chrupała, Willem H. Zuidema, Afra Alishahi
15 Oct 2023

Are Emergent Abilities in Large Language Models just In-Context Learning?
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Sheng Lu, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Iryna Gurevych
Tags: LRM, ELM, ReLM
04 Sep 2023

Explainability for Large Language Models: A Survey
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, D. Yin, Jundong Li
Tags: LRM
02 Sep 2023

Why do universal adversarial attacks work on large language models?: Geometry might be the answer
Varshini Subhash, Anna Bialas, Weiwei Pan, Finale Doshi-Velez
Tags: AAML
01 Sep 2023

Decoding Layer Saliency in Language Transformers
International Conference on Machine Learning (ICML), 2023
Elizabeth M. Hou, Greg Castañón
Tags: MILM
09 Aug 2023

Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input
European Conference on Computer Vision (ECCV), 2023
Qingpei Guo, Kaisheng Yao, Wei Chu
Tags: MLLM
25 Jun 2023

Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Dongqi Pu, Yifa Wang, Vera Demberg
26 May 2023

Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Boxi Cao, Qiaoyu Tang, Hongyu Lin, Shanshan Jiang, Bin Dong, Xianpei Han, Jiawei Chen, Tianshu Wang, Le Sun
Tags: CLL, KELM
16 May 2023

AttentionViz: A Global View of Transformer Attention
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023
Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg
Tags: ViT
04 May 2023

The Life Cycle of Knowledge in Big Language Models: A Survey
Machine Intelligence Research (MIR), 2023
Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun
Tags: KELM
14 Mar 2023

Input-length-shortening and text generation via attention values
Necset Ozkan Tan, A. Peng, Joshua Bensemann, Qiming Bao, Tim Hartill, M. Gahegan, Michael Witbrock
14 Mar 2023

Spelling convention sensitivity in neural language models
Findings, 2023
Elizabeth Nielsen, Christo Kirov, Brian Roark
06 Mar 2023

Does Deep Learning Learn to Abstract? A Systematic Probing Framework
International Conference on Learning Representations (ICLR), 2023
Shengnan An, Zeqi Lin, B. Chen, Qiang Fu, Nanning Zheng, Jian-Guang Lou
23 Feb 2023

False perspectives on human language: why statistics needs linguistics
Matteo Greco, Andrea Cometa, F. Artoni, Robert Frank, A. Moro
17 Feb 2023

Quantifying Context Mixing in Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Hosein Mohebbi, Willem H. Zuidema, Grzegorz Chrupała, Afra Alishahi
30 Jan 2023

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy
26 Jan 2023

SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jan Engler, Sandipan Sikdar, Marlene Lutz, M. Strohmaier
11 Jan 2023

Can Large Language Models Change User Preference Adversarially?
Varshini Subhash
Tags: AAML
05 Jan 2023

Explainability of Text Processing and Retrieval Methods: A Survey
Sourav Saha, Debapriyo Majumdar, Mandar Mitra
14 Dec 2022

What do Large Language Models Learn beyond Language?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Avinash Madasu, Shashank Srivastava
Tags: LRM, AI4CE
21 Oct 2022

Probing with Noise: Unpicking the Warp and Weft of Embeddings
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022
Filip Klubicka, John D. Kelleher
21 Oct 2022