ResearchTrend.AI
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

28 March 2022
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
    KELM

Papers citing "Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space"

50 / 269 papers shown
Learning and Unlearning of Fabricated Knowledge in Language Models
Chen Sun
Nolan Miller
A. Zhmoginov
Max Vladymyrov
Mark Sandler
KELM
MU
25
1
0
29 Oct 2024
Looking Beyond The Top-1: Transformers Determine Top Tokens In Order
Daria Lioubashevski
Tomer Schlank
Gabriel Stanovsky
Ariel Goldstein
29
1
0
26 Oct 2024
Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models
Yue Li
Zhixue Zhao
Carolina Scarton
27
0
0
24 Oct 2024
CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models
Xintong Wang
Jingheng Pan
Longqin Jiang
Liang Ding
Xingshan Li
Chris Biemann
LLMSV
21
0
0
23 Oct 2024
Neuron-based Personality Trait Induction in Large Language Models
Jia Deng
Tianyi Tang
Yanbin Yin
Wenhao Yang
Wayne Xin Zhao
Ji-Rong Wen
33
1
0
16 Oct 2024
O-Edit: Orthogonal Subspace Editing for Language Model Sequential Editing
Yuchen Cai
Ding Cao
KELM
18
2
0
15 Oct 2024
Improving Instruction-Following in Language Models through Activation Steering
Alessandro Stolfo
Vidhisha Balachandran
Safoora Yousefi
Eric Horvitz
Besmira Nushi
LLMSV
49
13
0
15 Oct 2024
Jailbreak Instruction-Tuned LLMs via end-of-sentence MLP Re-weighting
Yifan Luo
Zhennan Zhou
Meitan Wang
Bin Dong
19
0
0
14 Oct 2024
Towards Interpreting Visual Information Processing in Vision-Language Models
Clement Neo
Luke Ong
Philip H. S. Torr
Mor Geva
David M. Krueger
Fazl Barez
84
6
0
09 Oct 2024
Jet Expansions of Residual Computation
Yihong Chen
Xiangxiang Xu
Yao Lu
Pontus Stenetorp
Luca Franceschi
24
2
0
08 Oct 2024
Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing
Zhuoran Zhang
Y. Li
Zijian Kan
Keyuan Cheng
Lijie Hu
Di Wang
KELM
29
4
0
08 Oct 2024
From Tokens to Words: On the Inner Lexicon of LLMs
Guy Kaplan
Matanel Oren
Yuval Reif
Roy Schwartz
41
12
0
08 Oct 2024
Mechanistic?
Naomi Saphra
Sarah Wiegreffe
AI4CE
21
9
0
07 Oct 2024
FAME: Towards Factual Multi-Task Model Editing
Li Zeng
Yingyu Shan
Zeming Liu
Jiashu Yao
Yuhang Guo
KELM
24
1
0
07 Oct 2024
Activation Scaling for Steering and Interpreting Language Models
Niklas Stoehr
Kevin Du
Vésteinn Snæbjarnarson
Robert West
Ryan Cotterell
Aaron Schein
LLMSV
LRM
29
4
0
07 Oct 2024
Toxic Subword Pruning for Dialogue Response Generation on Large Language Models
Hongyuan Lu
Wai Lam
17
0
0
05 Oct 2024
Mitigating Memorization In Language Models
Mansi Sakarvadia
Aswathy Ajith
Arham Khan
Nathaniel Hudson
Caleb Geniesse
Kyle Chard
Yaoqing Yang
Ian Foster
Michael W. Mahoney
KELM
MU
47
0
0
03 Oct 2024
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Jiyeon Kim
Hyunji Lee
Hyowon Cho
Joel Jang
Hyeonbin Hwang
Seungpil Won
Youbin Ahn
Dohaeng Lee
Minjoon Seo
KELM
55
2
0
02 Oct 2024
Investigating OCR-Sensitive Neurons to Improve Entity Recognition in Historical Documents
Emanuela Boros
Maud Ehrmann
31
0
0
25 Sep 2024
Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
Zeping Yu
Sophia Ananiadou
LRM
MILM
27
6
0
21 Sep 2024
Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction
Amrit Diggavi Seshadri
14
1
0
21 Sep 2024
Optimal ablation for interpretability
Maximilian Li
Lucas Janson
FAtt
44
2
0
16 Sep 2024
Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers
Amit Ben Artzy
Roy Schwartz
27
5
0
05 Sep 2024
Interpreting and Improving Large Language Models in Arithmetic Calculation
Wei Zhang
Chaoqun Wan
Yonggang Zhang
Yiu-ming Cheung
Xinmei Tian
Xu Shen
Jieping Ye
LRM
24
18
0
03 Sep 2024
Decompose the model: Mechanistic interpretability in image models with Generalized Integrated Gradients (GIG)
Yearim Kim
Sangyu Han
Sangbum Han
Nojun Kwak
45
0
0
03 Sep 2024
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Wei Chen
Zhen Huang
Liang Xie
Binbin Lin
Houqiang Li
...
Deng Cai
Yonggang Zhang
Wenxiao Wang
Xu Shen
Jieping Ye
40
6
0
03 Sep 2024
Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine
Zuoning Zhang
Dhruv Parikh
Youning Zhang
Viktor Prasanna
21
1
0
30 Aug 2024
Modularity in Transformers: Investigating Neuron Separability & Specialization
Nicholas Pochinkov
Thomas Jones
Mohammed Rashidur Rahman
27
0
0
30 Aug 2024
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation
Shiwei Wu
Joya Chen
Kevin Qinghong Lin
Qimeng Wang
Yan Gao
Qianli Xu
Tong Bill Xu
Yao Hu
Enhong Chen
Mike Zheng Shou
VLM
45
12
0
29 Aug 2024
A Mechanistic Interpretation of Syllogistic Reasoning in Auto-Regressive Language Models
Geonhee Kim
Marco Valentino
André Freitas
LRM
AI4CE
28
7
0
16 Aug 2024
GLGait: A Global-Local Temporal Receptive Field Network for Gait Recognition in the Wild
Guozhen Peng
Yunhong Wang
Yuwei Zhao
Shaoxiong Zhang
Annan Li
CVBM
ViT
11
2
0
13 Aug 2024
Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons
Yifei Wang
Yuheng Chen
Wanting Wen
Yu Sheng
Linjing Li
D. Zeng
KELM
28
5
0
06 Aug 2024
The Mechanics of Conceptual Interpretation in GPT Models: Interpretative Insights
Nura Aljaafari
Danilo S. Carvalho
André Freitas
KELM
27
0
0
05 Aug 2024
Machine Unlearning in Generative AI: A Survey
Zheyuan Liu
Guangyao Dou
Zhaoxuan Tan
Yijun Tian
Meng-Long Jiang
MU
31
13
0
30 Jul 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
32
10
0
27 Jul 2024
An Efficient Inference Framework for Early-exit Large Language Models
Ruijie Miao
Yihan Yan
Xinshuo Yao
Tong Yang
22
0
0
25 Jul 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong-jia Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
47
27
0
22 Jul 2024
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
Sarah Wiegreffe
Oyvind Tafjord
Yonatan Belinkov
Hanna Hajishirzi
Ashish Sabharwal
34
3
0
21 Jul 2024
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing
Huanqian Wang
Yang Yue
Rui Lu
Jingxin Shi
Andrew Zhao
Shenzhi Wang
Shiji Song
Gao Huang
LM&Ro
KELM
28
6
0
11 Jul 2024
Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks
Aaron Mueller
CML
23
9
0
05 Jul 2024
Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers
Terry Tong
Jiashu Xu
Qin Liu
Muhao Chen
AAML
SILM
32
1
0
04 Jul 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
73
18
0
02 Jul 2024
The Remarkable Robustness of LLMs: Stages of Inference?
Vedang Lad
Wes Gurnee
Max Tegmark
33
33
0
27 Jun 2024
Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective
Hanqi Yan
Yanzheng Xiang
Guangyi Chen
Yifei Wang
Lin Gui
Yulan He
27
5
0
25 Jun 2024
Confidence Regulation Neurons in Language Models
Alessandro Stolfo
Ben Wu
Wes Gurnee
Yonatan Belinkov
Xingyi Song
Mrinmaya Sachan
Neel Nanda
29
12
0
24 Jun 2024
Preference Tuning For Toxicity Mitigation Generalizes Across Languages
Xiaochen Li
Zheng-Xin Yong
Stephen H. Bach
CLL
23
13
0
23 Jun 2024
Beyond the Doors of Perception: Vision Transformers Represent Relations Between Objects
Michael A. Lepori
Alexa R. Tartaglini
Wai Keen Vong
Thomas Serre
Brenden Lake
Ellie Pavlick
34
2
0
22 Jun 2024
Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models
Christopher Burger
Yifan Hu
Thai Le
KELM
34
0
0
22 Jun 2024
Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning
Yuval Shalev
Amir Feder
Ariel Goldstein
LRM
29
4
0
19 Jun 2024
Locating and Extracting Relational Concepts in Large Language Models
Zijian Wang
Britney White
Chang Xu
KELM
38
0
0
19 Jun 2024