2409.06328
Cited By
Extracting Paragraphs from LLM Token Activations
10 September 2024
Nicholas Pochinkov
Angelo Benoit
Lovkush Agarwal
Zainab Ali Majid
Lucile Ter-Minassian
Papers citing
"Extracting Paragraphs from LLM Token Activations"
3 / 3 papers shown
Title
Robustly identifying concepts introduced during chat fine-tuning using crosscoders
Julian Minder
Clement Dumas
Caden Juang
Bilal Chughtai
Neel Nanda
25
0
0
03 Apr 2025
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
210
486
0
01 Nov 2022
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur
Nils Reimers
Andreas Rucklé
Abhishek Srivastava
Iryna Gurevych
VLM
229
720
0
17 Apr 2021