© 2025 ResearchTrend.AI, All rights reserved.

What Does BERT Look At? An Analysis of BERT's Attention
arXiv:1906.04341 · 11 June 2019
Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning
MILM

Papers citing "What Does BERT Look At? An Analysis of BERT's Attention"

Showing 50 of 883 citing papers.
Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models
Stephan Linzbach, Dimitar Dimitrov, Laura Kallmeyer, Kilian Evang, Hajira Jabeen, Stefan Dietze
KELM · 02 Apr 2024
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Xingwu Chen, Difan Zou
ViT · 02 Apr 2024
Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training
Vivian Liu, Yiqiao Yin
01 Apr 2024
Extending Token Computation for LLM Reasoning
Bingli Liao, Danilo Vasconcellos Vargas
LRM · 22 Mar 2024
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan C. Sanmiguel, Jose M. Martínez
VLM · 21 Mar 2024
Embedded Named Entity Recognition using Probing Classifiers
Nicholas Popovic, Michael Färber
18 Mar 2024
Code-Mixed Probes Show How Pre-Trained Models Generalise On Code-Switched Text
Frances Adriana Laureano De Leon, Harish Tayyar Madabushi, Mark Lee
07 Mar 2024
iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries
Adam Joseph Coscia, Langdon Holmes, Wesley Morris, Joon Suh Choi, Scott Crossley, Alex Endert
07 Mar 2024
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts
Adam Joseph Coscia, Alex Endert
VLM · 07 Mar 2024
Where does In-context Translation Happen in Large Language Models
Suzanna Sia, David Mueller, Kevin Duh
LRM · 07 Mar 2024
Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models
Changjiang Gao, Jixing Li, Jiajun Chen, Shujian Huang
07 Mar 2024
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models
Adithya Bhaskar, Dan Friedman, Danqi Chen
06 Mar 2024
Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing
Bingyan Liu, Chengyu Wang, Tingfeng Cao, Kui Jia, Jun Huang
DiffM · 06 Mar 2024
Word Importance Explains How Prompts Affect Language Model Outputs
Stefan Hackmann, Haniyeh Mahmoudian, Mark Steadman, Michael Schmidt
AAML · 05 Mar 2024
Towards Measuring and Modeling "Culture" in LLMs: A Survey
Muhammad Farid Adilazuarda, Sagnik Mukherjee, Pradhyumna Lavania, Siddhant Singh, Alham Fikri Aji, Jacki O'Neill, Ashutosh Modi, Monojit Choudhury
05 Mar 2024
Topic Aware Probing: From Sentence Length Prediction to Idiom Identification how reliant are Neural Language Models on Topic?
Vasudevan Nedumpozhimana, John D. Kelleher
04 Mar 2024
Massive Activations in Large Language Models
Mingjie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu
27 Feb 2024
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations
Jing-ling Huang, Zhengxuan Wu, Christopher Potts, Mor Geva, Atticus Geiger
27 Feb 2024
What Do Language Models Hear? Probing for Auditory Representations in Language Models
Jerry Ngo, Yoon Kim
AuLLM, MILM · 26 Feb 2024
Layer-wise Regularized Dropout for Neural Language Models
Shiwen Ni, Min Yang, Ruifeng Xu, Chengming Li, Xiping Hu
26 Feb 2024
From Adoption to Adaption: Tracing the Diffusion of New Emojis on Twitter
Yuhang Zhou, Xuan Lu, Wei Ai
22 Feb 2024
When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-Incrementality
Brielen Madureira, Patrick Kahardipraja, David Schlangen
20 Feb 2024
Identifying Semantic Induction Heads to Understand In-Context Learning
Jie Ren, Qipeng Guo, Hang Yan, Dongrui Liu, Xipeng Qiu, Dahua Lin
20 Feb 2024
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma
LRM, AI4CE · 20 Feb 2024
Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models
Shaeke Salman, M. Shams, Xiuwen Liu, Lingjiong Zhu
VLM · 13 Feb 2024
Improving Black-box Robustness with In-Context Rewriting
Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Tom Hartvigsen
13 Feb 2024
CMA-R: Causal Mediation Analysis for Explaining Rumour Detection
Lin Tian, Xiuzhen Zhang, Jey Han Lau
13 Feb 2024
Insights into Natural Language Database Query Errors: From Attention Misalignment to User Handling Strategies
Zheng Ning, Yuan Tian, Zheng Zhang, Tianyi Zhang, T. Li
11 Feb 2024
Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings
Yichen Jiang, Xiang Zhou, Mohit Bansal
09 Feb 2024
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Aakriti Jain, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek
08 Feb 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei, Kaixuan Huang, Yangsibo Huang, Tinghao Xie, Xiangyu Qi, Mengzhou Xia, Prateek Mittal, Mengdi Wang, Peter Henderson
AAML · 07 Feb 2024
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Gianluigi Lopardo, F. Precioso, Damien Garreau
05 Feb 2024
Approximate Attributions for Off-the-Shelf Siamese Transformers
Lucas Moller, Dmitry Nikolaev, Sebastian Padó
05 Feb 2024
Sequence Shortening for Context-Aware Machine Translation
Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis
02 Feb 2024
Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain
Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, A. Mehta, N. Mesgarani
31 Jan 2024
Rethinking Interpretability in the Era of Large Language Models
Chandan Singh, J. Inala, Michel Galley, Rich Caruana, Jianfeng Gao
LRM, AI4CE · 30 Jan 2024
Engineering A Large Language Model From Scratch
Abiodun Finbarrs Oketunji
30 Jan 2024
Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models
Erik Arakelyan, Zhaoqi Liu, Isabelle Augenstein
AAML · 25 Jan 2024
Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?
Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu
VLM · 23 Jan 2024
Anisotropy Is Inherent to Self-Attention in Transformers
Nathan Godey, Eric Villemonte de la Clergerie, Benoît Sagot
22 Jan 2024
Anchor function: a type of benchmark functions for studying language models
Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, E. Weinan, Z. Xu
16 Jan 2024
Transformers are Multi-State RNNs
Matanel Oren, Michael Hassid, Nir Yarden, Yossi Adi, Roy Schwartz
OffRL · 11 Jan 2024
How Proficient Are Large Language Models in Formal Languages? An In-Depth Insight for Knowledge Base Question Answering
Jinxi Liu, S. Cao, Jiaxin Shi, Tingjian Zhang, Lunyiu Nie, Linmei Hu, Lei Hou, Juanzi Li
ELM · 11 Jan 2024
Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue
Jia-Chen Gu, Haoyang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, Nanyun Peng
KELM · 09 Jan 2024
Language Model as an Annotator: Unsupervised Context-aware Quality Phrase Generation
Zhihao Zhang, Yuan Zuo, Chenghua Lin, Junjie Wu
28 Dec 2023
Towards Probing Contact Center Large Language Models
Varun Nathan, Ayush Kumar, Digvijay Ingle, Jithendra Vepa
26 Dec 2023
Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment
Lingling Xu, Haoran Xie, S. J. Qin, Xiaohui Tao, F. Wang
19 Dec 2023
Dynamic Syntax Mapping: A New Approach to Unsupervised Syntax Parsing
Buvarp Gohsh, Woods Ali, Michael Anders
18 Dec 2023
Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression
Luis Balderas, Miguel Lastra, José M. Benítez
17 Dec 2023
Where exactly does contextualization in a PLM happen?
Soniya Vijayakumar, Tanja Baumel, Simon Ostermann, Josef van Genabith
11 Dec 2023