2204.12130
LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models
26 April 2022
Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg
KELM
Papers citing "LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models"
7 / 7 papers shown
Title
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu, Yu-Lin Tsai, Chih-Hsun Lin, Pin-Yu Chen, Chia-Mu Yu, Chun-ying Huang
44, 31, 0
27 May 2024

Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva
25, 86, 0
11 Jan 2024

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka
CoGe
22, 6, 0
21 Nov 2023

Towards Learning and Explaining Indirect Causal Effects in Neural Networks
Abbaavaram Gowtham Reddy, Saketh Bachu, Harsh Nilesh Pathak, Ben Godfrey, V. Balasubramanian, V. Varshaneya, Satya Narayanan Kar
CML
21, 0, 0
24 Mar 2023

Editing Models with Task Arithmetic
Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi
KELM, MoMe, MU
34, 421, 0
08 Dec 2022

An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen, Lijie Wang, Ying Chen, Xinyan Xiao, Jing Liu, Hua-Hong Wu
22, 4, 0
28 Jul 2022

Putting Words in BERT's Mouth: Navigating Contextualized Vector Spaces with Pseudowords
Taelin Karidi, Yichu Zhou, Nathan Schneider, Omri Abend, Vivek Srikumar
78, 13, 0
23 Sep 2021