ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2312.16291
  4. Cited By
Observable Propagation: Uncovering Feature Vectors in Transformers

Observable Propagation: Uncovering Feature Vectors in Transformers

26 December 2023
Jacob Dunefsky
Arman Cohan
ArXivPDFHTML

Papers citing "Observable Propagation: Uncovering Feature Vectors in Transformers"

7 / 7 papers shown
Title
Uncovering Intermediate Variables in Transformers using Circuit Probing
Uncovering Intermediate Variables in Transformers using Circuit Probing
Michael A. Lepori
Thomas Serre
Ellie Pavlick
49
7
0
07 Nov 2023
Characterizing Mechanisms for Factual Recall in Language Models
Characterizing Mechanisms for Factual Recall in Language Models
Qinan Yu
Jack Merullo
Ellie Pavlick
KELM
24
10
0
24 Oct 2023
On the Expressivity Role of LayerNorm in Transformers' Attention
On the Expressivity Role of LayerNorm in Transformers' Attention
Shaked Brody
Shiyu Jin
Xinghao Zhu
MoE
54
21
0
04 May 2023
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Wes Gurnee
Neel Nanda
Matthew Pauly
Katherine Harvey
Dmitrii Troitskii
Dimitris Bertsimas
MILM
153
170
0
02 May 2023
Interpretability in the Wild: a Circuit for Indirect Object
  Identification in GPT-2 small
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
205
486
0
01 Nov 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020
1