ResearchTrend.AI
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

28 March 2022
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
    KELM

Papers citing "Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space"

50 / 269 papers shown
When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models
Ting-Yun Chang
Jesse Thomason
Robin Jia
36
4
0
19 Jun 2024
Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries
Eden Biran
Daniela Gottesman
Sohee Yang
Mor Geva
Amir Globerson
LRM
34
21
0
18 Jun 2024
An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs
Daking Rai
Ziyu Yao
LRM
23
7
0
18 Jun 2024
InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States
Mohammad Beigi
Ying Shen
Runing Yang
Zihao Lin
Qifan Wang
Ankith Mohan
Jianfeng He
Ming Jin
Chang-Tien Lu
Lifu Huang
HILM
29
4
0
17 Jun 2024
Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces
Yihuai Hong
Lei Yu
Shauli Ravfogel
Haiqin Yang
Mor Geva
KELM
MU
58
17
0
17 Jun 2024
In-Context Editing: Learning Knowledge from Self-Induced Distributions
Siyuan Qi
Bangcheng Yang
Kailin Jiang
Xiaobo Wang
Jiaqi Li
Yifan Zhong
Yaodong Yang
Zilong Zheng
KELM
99
8
0
17 Jun 2024
FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation
Zijian Feng
Hanzhang Zhou
Zixiao Zhu
Kezhi Mao
21
1
0
14 Jun 2024
Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs
Weixuan Wang
Barry Haddow
Wei Peng
Alexandra Birch
MILM
28
8
0
13 Jun 2024
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
Tomer Ashuach
Martin Tutek
Yonatan Belinkov
KELM
MU
56
4
0
13 Jun 2024
Memorization in deep learning: A survey
Jiaheng Wei
Yanjun Zhang
Leo Yu Zhang
Ming Ding
Chao Chen
Kok-Leong Ong
Jun Zhang
Yang Xiang
40
5
0
06 Jun 2024
Pre-trained Large Language Models Use Fourier Features to Compute Addition
Tianyi Zhou
Deqing Fu
Vatsal Sharan
Robin Jia
LRM
29
9
0
05 Jun 2024
Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller
Min Cai
Yuchen Zhang
Shichang Zhang
Fan Yin
Difan Zou
Yisong Yue
Ziniu Hu
21
0
0
04 Jun 2024
Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models
Cheng-Hsun Hsueh
Paul Kuo-Ming Huang
Tzu-Han Lin
Che-Wei Liao
Hung-Chieh Fang
Chao-Wei Huang
Yun-Nung Chen
KELM
31
5
0
03 Jun 2024
UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
Hanzhang Zhou
Zijian Feng
Zixiao Zhu
Junlang Qian
Kezhi Mao
28
6
0
31 May 2024
Calibrating Reasoning in Language Models with Internal Consistency
Zhihui Xie
Jizhou Guo
Tong Yu
Shuai Li
LRM
43
8
0
29 May 2024
Are PPO-ed Language Models Hackable?
Suraj Anand
David Getzen
18
0
0
28 May 2024
Knowledge Circuits in Pretrained Transformers
Yunzhi Yao
Ningyu Zhang
Zekun Xi
Meng Wang
Ziwen Xu
Shumin Deng
Huajun Chen
KELM
64
19
0
28 May 2024
Perturbation-Restrained Sequential Model Editing
Junjie Ma
Hong Wang
Haoyang Xu
Zhen-Hua Ling
Jia-Chen Gu
KELM
53
8
0
27 May 2024
No Two Devils Alike: Unveiling Distinct Mechanisms of Fine-tuning Attacks
Chak Tou Leong
Yi Cheng
Kaishuai Xu
Jian Wang
Hanlin Wang
Wenjie Li
AAML
41
17
0
25 May 2024
Sparse Matrix in Large Language Model Fine-tuning
Haoze He
Juncheng Billy Li
Xuan Jiang
Heather Miller
MoE
19
3
0
24 May 2024
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
Boshi Wang
Xiang Yue
Yu-Chuan Su
Huan Sun
LRM
21
41
0
23 May 2024
Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
Rheeya Uppaal
Apratim De
Yiting He
Yiquao Zhong
Junjie Hu
29
7
0
22 May 2024
Unveiling and Manipulating Prompt Influence in Large Language Models
Zijian Feng
Hanzhang Zhou
Zixiao Zhu
Junlang Qian
Kezhi Mao
23
2
0
20 May 2024
A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
Mahsa Khoshnoodi
Vinija Jain
Mingye Gao
Malavika Srikanth
Aman Chadha
OffRL
23
1
0
15 May 2024
Natural Language Processing RELIES on Linguistics
Juri Opitz
Shira Wein
Nathan Schneider
AI4CE
44
7
0
09 May 2024
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions
Ruizhe Li
Yanjun Gao
KELM
27
5
0
06 May 2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Mostafa Elhoushi
Akshat Shrivastava
Diana Liskovich
Basil Hosmer
Bram Wasti
...
Saurabh Agarwal
Ahmed Roman
Ahmed Aly
Beidi Chen
Carole-Jean Wu
LRM
33
82
0
25 Apr 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
38
111
0
22 Apr 2024
Understanding the role of FFNs in driving multilingual behaviour in LLMs
Sunit Bhattacharya
Ondrej Bojar
19
2
0
22 Apr 2024
Latent Concept-based Explanation of NLP Models
Xuemin Yu
Fahim Dalvi
Nadir Durrani
Marzia Nouri
Hassan Sajjad
LRM
FAtt
19
1
0
18 Apr 2024
Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda
Johannes Schneider
79
26
0
15 Apr 2024
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models
Igor Tufanov
Karen Hambardzumyan
Javier Ferrando
Elena Voita
KELM
28
6
0
10 Apr 2024
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation
Tong Su
Xin Peng
Sarubi Thillainathan
David Guzmán
Surangika Ranathunga
En-Shiun Annie Lee
27
2
0
05 Apr 2024
Eigenpruning: an Interpretability-Inspired PEFT Method
T. Browne
Álvaro Soto
A. Aizawa
23
1
0
04 Apr 2024
On Large Language Models' Hallucination with Regard to Known Facts
Che Jiang
Biqing Qi
Xiangyu Hong
Dayuan Fu
Yang Cheng
Fandong Meng
Mo Yu
Bowen Zhou
Jie Zhou
HILM
LRM
31
17
0
29 Mar 2024
Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations
Lei Yu
Meng Cao
Jackie Chi Kit Cheung
Yue Dong
HILM
33
6
0
27 Mar 2024
Detoxifying Large Language Models via Knowledge Editing
Meng Wang
Ningyu Zhang
Ziwen Xu
Zekun Xi
Shumin Deng
Yunzhi Yao
Qishen Zhang
Linyi Yang
Jindong Wang
Huajun Chen
KELM
38
54
0
21 Mar 2024
Locating and Mitigating Gender Bias in Large Language Models
Yuchen Cai
Ding Cao
Rongxi Guo
Yaqin Wen
Guiquan Liu
Enhong Chen
27
5
0
21 Mar 2024
Editing Knowledge Representation of Language Model via Rephrased Prefix Prompts
Yuchen Cai
Ding Cao
Rongxi Guo
Yaqin Wen
Guiquan Liu
Enhong Chen
KELM
29
3
0
21 Mar 2024
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines
Michael Toker
Hadas Orgad
Mor Ventura
Dana Arad
Yonatan Belinkov
DiffM
58
12
0
09 Mar 2024
Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models
Changjiang Gao
Jixing Li
Jiajun Chen
Shujian Huang
15
2
0
07 Mar 2024
How do Large Language Models Handle Multilingualism?
Yiran Zhao
Wenxuan Zhang
Guizhen Chen
Kenji Kawaguchi
Lidong Bing
LRM
33
52
0
29 Feb 2024
Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models
Zhuoran Jin
Pengfei Cao
Hongbang Yuan
Yubo Chen
Jiexin Xu
Huaijun Li
Xiaojian Jiang
Kang Liu
Jun Zhao
178
32
0
28 Feb 2024
Editing Factual Knowledge and Explanatory Ability of Medical Large Language Models
Derong Xu
Ziheng Zhang
Zhihong Zhu
Zhenxi Lin
Qidong Liu
...
Wanyu Wang
Yuyang Ye
Xiangyu Zhao
Yefeng Zheng
Enhong Chen
KELM
27
9
0
28 Feb 2024
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
Shaolei Zhang
Tian Yu
Yang Feng
HILM
KELM
29
39
0
27 Feb 2024
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Sohee Yang
E. Gribovskaya
Nora Kassner
Mor Geva
Sebastian Riedel
ReLM
LRM
35
75
0
26 Feb 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
37
77
0
26 Feb 2024
Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions
Clement Neo
Shay B. Cohen
Fazl Barez
36
4
0
23 Feb 2024
Understanding and Patching Compositional Reasoning in LLMs
Zhaoyi Li
Gangwei Jiang
Hong Xie
Linqi Song
Defu Lian
Ying Wei
LRM
46
20
0
22 Feb 2024
The Hidden Space of Transformer Language Adapters
Jesujoba Oluwadara Alabi
Marius Mosbach
Matan Eyal
Dietrich Klakow
Mor Geva
48
7
1
20 Feb 2024