What Does BERT Look At? An Analysis of BERT's Attention

11 June 2019
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
    MILM

Papers citing "What Does BERT Look At? An Analysis of BERT's Attention"

50 / 883 papers shown
How and where does CLIP process negation?
Vincent Quantmeyer
Pablo Mosteiro
Albert Gatt
CoGe
29
6
0
15 Jul 2024
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps
Yung-Sung Chuang
Linlu Qiu
Cheng-Yu Hsieh
Ranjay Krishna
Yoon Kim
James R. Glass
HILM
18
33
0
09 Jul 2024
Evaluating Human Alignment and Model Faithfulness of LLM Rationale
Mohsen Fayyaz
Fan Yin
Jiao Sun
Nanyun Peng
52
3
0
28 Jun 2024
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads
Ali Khaleghi Rahimian
Manish Kumar Govind
Subhajit Maity
Dominick Reilly
Christian Kummerle
Srijan Das
A. Dutta
38
1
0
27 Jun 2024
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
Chao Lou
Zixia Jia
Zilong Zheng
Kewei Tu
ODL
31
18
0
24 Jun 2024
Are there identifiable structural parts in the sentence embedding whole?
Vivi Nastase
Paola Merlo
32
3
0
24 Jun 2024
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization
Cheng-Yu Hsieh
Yung-Sung Chuang
Chun-Liang Li
Zifeng Wang
Long T. Le
...
James R. Glass
Alexander Ratner
Chen-Yu Lee
Ranjay Krishna
Tomas Pfister
40
30
0
23 Jun 2024
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration
Zhongzhi Yu
Zheng Wang
Yonggan Fu
Huihong Shi
Khalid Shaikh
Yingyan Celine Lin
41
20
0
22 Jun 2024
Improving Interpretability and Robustness for the Detection of AI-Generated Images
T. Gaintseva
Laida Kushnareva
German Magai
Irina Piontkovskaya
Sergey I. Nikolenko
Martin Benning
S. Barannikov
Gregory Slabaugh
24
1
0
21 Jun 2024
SRViT: Vision Transformers for Estimating Radar Reflectivity from Satellite Observations at Scale
Jason Stock
Kyle Hilburn
Imme Ebert-Uphoff
Charles Anderson
40
1
0
20 Jun 2024
In Tree Structure Should Sentence Be Generated
Yaguang Li
Xin Chen
20
0
0
20 Jun 2024
A Primal-Dual Framework for Transformers and Neural Networks
Tan M. Nguyen
Tam Nguyen
Nhat Ho
Andrea L. Bertozzi
Richard G. Baraniuk
Stanley J. Osher
ViT
21
13
0
19 Jun 2024
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images
Rushikesh Zawar
Shaurya Dewan
Andrew F. Luo
Margaret M. Henderson
Michael J. Tarr
Leila Wehbe
VGen
CoGe
36
1
0
19 Jun 2024
Composited-Nested-Learning with Data Augmentation for Nested Named Entity Recognition
Xingming Liao
Nankai Lin
Haowen Li
Lianglun Cheng
Zhuowei Wang
Chong Chen
38
0
0
18 Jun 2024
Improving the Evaluation and Actionability of Explanation Methods for Multivariate Time Series Classification
D. Serramazza
Thach le Nguyen
Georgiana Ifrim
19
2
0
18 Jun 2024
Attention Score is not All You Need for Token Importance Indicator in KV Cache Reduction: Value Also Matters
Zhiyu Guo
Hidetaka Kamigaito
Taro Watanabe
24
20
0
18 Jun 2024
A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning
Lijie Hu
Liang Liu
Shu Yang
Xin Chen
Hongru Xiao
Mengdi Li
Pan Zhou
Muhammad Asif Ali
Di Wang
LRM
35
5
0
18 Jun 2024
InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States
Mohammad Beigi
Ying Shen
Runing Yang
Zihao Lin
Qifan Wang
Ankith Mohan
Jianfeng He
Ming Jin
Chang-Tien Lu
Lifu Huang
HILM
34
4
0
17 Jun 2024
CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG
Boyi Deng
Wenjie Wang
Fengbin Zhu
Qifan Wang
Fuli Feng
33
4
0
17 Jun 2024
Outlier Reduction with Gated Attention for Improved Post-training Quantization in Large Sequence-to-sequence Speech Foundation Models
Dominik Wagner
Ilja Baumann
K. Riedhammer
Tobias Bocklet
MQ
30
1
0
16 Jun 2024
Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
Chengzhengxu Li
Xiaoming Liu
Zhaohan Zhang
Yichen Wang
Chen Liu
Y. Lan
Chao Shen
46
2
0
15 Jun 2024
Exploring the Correlation between Human and Machine Evaluation of Simultaneous Speech Translation
Xiaoman Wang
Claudio Fantinuoli
19
1
0
14 Jun 2024
Applications of Explainable artificial intelligence in Earth system science
Feini Huang
Shijie Jiang
Lu Li
Yongkun Zhang
Ye Zhang
Ruqing Zhang
Qingliang Li
Danxi Li
Wei Shangguan
Yongjiu Dai
30
2
0
12 Jun 2024
Analyzing Multi-Head Attention on Trojan BERT Models
Jingwei Wang
32
0
0
12 Jun 2024
VTrans: Accelerating Transformer Compression with Variational Information Bottleneck based Pruning
Oshin Dutta
Ritvik Gupta
Sumeet Agarwal
39
1
0
07 Jun 2024
KGLink: A column type annotation method that combines knowledge graph and pre-trained language model
Yubo Wang
Hao Xin
Lei Chen
LMTD
19
3
0
01 Jun 2024
Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task
Siavash Golkar
Alberto Bietti
Mariel Pettee
Michael Eickenberg
M. Cranmer
...
Ruben Ohana
Liam Parker
Bruno Régaldo-Saint Blancard
Kyunghyun Cho
Shirley Ho
47
1
0
30 May 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
24
5
0
29 May 2024
Exploring Activation Patterns of Parameters in Language Models
Yudong Wang
Damai Dai
Zhifang Sui
24
1
0
28 May 2024
Multi-objective Representation for Numbers in Clinical Narratives: A CamemBERT-Bio-Based Alternative to Large-Scale LLMs
Boammani Aser Lompo
Thanh-Dung Le
28
1
0
28 May 2024
InversionView: A General-Purpose Method for Reading Information from Neural Activations
Xinting Huang
Madhur Panwar
Navin Goyal
Michael Hahn
26
3
0
27 May 2024
Disentangling and Integrating Relational and Sensory Information in Transformer Architectures
Awni Altabaa
John Lafferty
27
3
0
26 May 2024
Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention
Andrew Li
Xianle Feng
Siddhant Narang
Austin Peng
Tianle Cai
Raj Sanjay Shah
Sashank Varma
LRM
31
5
0
25 May 2024
Layer-Condensed KV Cache for Efficient Inference of Large Language Models
Haoyi Wu
Kewei Tu
MQ
41
17
0
17 May 2024
Multi-Evidence based Fact Verification via A Confidential Graph Neural Network
Yuqing Lan
Zhenghao Liu
Yu Gu
Xiaoyuan Yi
Xiaohua Li
Liner Yang
Ge Yu
30
0
0
17 May 2024
TFWT: Tabular Feature Weighting with Transformer
Xinhao Zhang
Zaitian Wang
Lu Jiang
Wanfu Gao
Pengfei Wang
Kunpeng Liu
LMTD
14
14
0
14 May 2024
Explaining Text Similarity in Transformer Models
Alexandros Vasileiou
Oliver Eberle
43
7
0
10 May 2024
Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL
Ning Cheng
Zhaohui Yan
Ziming Wang
Zhijie Li
Jiaming Yu
Zilong Zheng
Kewei Tu
Jinan Xu
Wenjuan Han
29
5
0
10 May 2024
Interpretability Needs a New Paradigm
Andreas Madsen
Himabindu Lakkaraju
Siva Reddy
Sarath Chandar
39
4
0
08 May 2024
Interpretable Cross-Examination Technique (ICE-T): Using highly informative features to boost LLM performance
Goran Muric
Ben Delay
Steven Minton
32
1
0
08 May 2024
What does the Knowledge Neuron Thesis Have to do with Knowledge?
Jingcheng Niu
Andrew Liu
Zining Zhu
Gerald Penn
36
30
0
03 May 2024
Context-Aware Machine Translation with Source Coreference Explanation
Huy Hien Vu
Hidetaka Kamigaito
Taro Watanabe
LRM
27
1
0
30 Apr 2024
Talking Nonsense: Probing Large Language Models' Understanding of Adversarial Gibberish Inputs
Valeriia Cherepanova
James Zou
AAML
31
4
0
26 Apr 2024
Detecting Conceptual Abstraction in LLMs
Michaela Regneri
Alhassan Abdelhalim
Soren Laue
33
1
0
24 Apr 2024
What do Transformers Know about Government?
Jue Hou
Anisia Katinskaia
Lari Kotilainen
Sathianpong Trangcasanchai
Anh Vu
R. Yangarber
24
2
0
22 Apr 2024
Large language models and linguistic intentionality
J. Grindrod
33
5
0
15 Apr 2024
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models
Igor Tufanov
Karen Hambardzumyan
Javier Ferrando
Elena Voita
KELM
28
6
0
10 Apr 2024
How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes
Harmon Bhasin
Timothy Ossowski
Yiqiao Zhong
Junjie Hu
22
0
0
04 Apr 2024
Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation
Abhijnan Nath
Shadi Manafi
Avyakta Chelle
Nikhil Krishnaswamy
38
1
0
04 Apr 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL
Yutong Shao
N. Nakashole
22
1
0
03 Apr 2024