A Non-Linear Structural Probe

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

21 May 2021

Papers citing "A Non-Linear Structural Probe"

Freeze, Diffuse, Decode: Geometry-Aware Adaptation of Pretrained Transformer Embeddings for Antimicrobial Peptide Design

148

28 Nov 2025

Seed-Induced Uniqueness in Transformer Models: Subspace Alignment Governs Subliminal Transfer

02 Nov 2025

Beyond Linear Probes: Dynamic Safety Monitoring for Language Models

147

30 Sep 2025

Probing Syntax in Large Language Models: Successes and Remaining Challenges

272

05 Aug 2025

The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?

196

11 Jul 2025

Linguistic Interpretability of Transformer-based Language Models: a systematic review

Lucía Pitarch-Ballesteros

Emma Anglés-Herrero

VLM

357

09 Apr 2025

A polar coordinate system represents syntax in large language models

Neural Information Processing Systems (NeurIPS), 2024

345

07 Dec 2024

Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing

Network and Distributed System Security Symposium (NDSS), 2024

353

19 Nov 2024

Mechanistic?

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024

Naomi Saphra

Sarah Wiegreffe

AI4CE

260

07 Oct 2024

A Critical Study of What Code-LLMs (Do Not) Learn

302

17 Jun 2024

Non-Linear Inference Time Intervention: Improving LLM Truthfulness

172

27 Mar 2024

Hitting "Probe"rty with Non-Linearity, and More

Avik Pal

Madhura Pawar

190

25 Feb 2024

Rethinking the Construction of Effective Metrics for Understanding the Mechanisms of Pretrained Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

You Li

Jinhui Yin

Yuming Lin

186

19 Oct 2023

Disentangling the Linguistic Competence of Privacy-Preserving BERT

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023

Stefan Arnold

Nils Kemmerzell

Annika Schreiner

252

17 Oct 2023

Arithmetic with Language Models: from Memorization to Computation

Neural Networks (Neural Netw.), 2023

Davide Maltoni

Matteo Ferrara

KELM LRM

209

02 Aug 2023

Sociodemographic Bias in Language Models: A Survey and Forward Path

Vipul Gupta

Pranav Narayanan Venkit

Shomir Wilson

R. Passonneau

441

13 Jun 2023

The Architectural Bottleneck Principle

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

181

11 Nov 2022

Emergent Linguistic Structures in Neural Networks are Fragile

Emanuele La Malfa

Matthew Wicker

Marta Kwiatkowska

632

31 Oct 2022

AST-Probe: Recovering abstract syntax trees from hidden representations of pre-trained language models

International Conference on Automated Software Engineering (ASE), 2022

José Antonio Hernández López

Martin Weyssow

Jesús Sánchez Cuadrado

H. Sahraoui

172

23 Jun 2022

Kernelized Concept Erasure

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

518

28 Jan 2022

The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail

Sam Bowman

OffRL

369

15 Oct 2021

Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations

233

30 Sep 2021

Conditional probing: measuring usable information beyond a baseline

John Hewitt

Kawin Ethayarajh

Abigail Z. Jacobs

Christopher D. Manning

208

19 Sep 2021