ResearchTrend.AI
arXiv:2204.12130

LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models

26 April 2022
Mor Geva
Avi Caciularu
Guy Dar
Paul Roit
Shoval Sadde
Micah Shlain
Bar Tamir
Yoav Goldberg
    KELM

Papers citing "LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models"

7 / 7 papers shown
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu
Yu-Lin Tsai
Chih-Hsun Lin
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
44
31
0
27 May 2024
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun
Avi Caciularu
Adam Pearce
Lucas Dixon
Mor Geva
25
86
0
11 Jan 2024
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
22
6
0
21 Nov 2023
Towards Learning and Explaining Indirect Causal Effects in Neural Networks
Abbaavaram Gowtham Reddy
Saketh Bachu
Harsh Nilesh Pathak
Ben Godfrey
V. Balasubramanian
V. Varshaneya
Satya Narayanan Kar
CML
21
0
0
24 Mar 2023
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
34
421
0
08 Dec 2022
An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen
Lijie Wang
Ying Chen
Xinyan Xiao
Jing Liu
Hua-Hong Wu
22
4
0
28 Jul 2022
Putting Words in BERT's Mouth: Navigating Contextualized Vector Spaces with Pseudowords
Taelin Karidi
Yichu Zhou
Nathan Schneider
Omri Abend
Vivek Srikumar
78
13
0
23 Sep 2021