arXiv: 2506.18167

Understanding Reasoning in Thinking Language Models via Steering Vectors

22 June 2025

Constantin Venhoff

Iván Arcuschin

Papers citing "Understanding Reasoning in Thinking Language Models via Steering Vectors"

32 / 32 papers shown

SALT: Steering Activations towards Leakage-free Thinking in Chain of Thought

Shashank Kesineni

Maheep Chaudhary

KELM PILM LLMSV LRM ELM

580

1

0

11 Nov 2025

Rank-1 LoRAs Encode Interpretable Reasoning Signals

347

0

0

10 Nov 2025

MONICA: Real-Time Monitoring and Calibration of Chain-of-Thought Sycophancy in Large Reasoning Models

138

2

0

09 Nov 2025

Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought

103

0

0

28 Oct 2025

Modeling Hierarchical Thinking in Large Reasoning Models

Erfan Shayegani

Nael B. Abu-Ghazaleh

126

0

0

25 Oct 2025

Mapping Faithful Reasoning in Language Models

Andreas Damianou

José Luis Redondo García

Konstantina Palla

104

0

0

25 Oct 2025

Can Small and Reasoning Large Language Models Score Journal Articles for Research Quality and Do Averaging and Few-shot Help?

Ehsan Mohammadi

80

1

0

25 Oct 2025

Stream: Scaling up Mechanistic Interpretability to Long Context in LLMs via Sparse Attention

José Luis Redondo García

Konstantina Palla

Hugues Bouchard

96

0

0

22 Oct 2025

A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring

130

0

0

22 Oct 2025

Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

...

113

4

0

15 Oct 2025

ThinkPilot: Steering Reasoning Models via Automated Think-prefixes Optimization

101

0

0

14 Oct 2025

The Geometry of Reasoning: Flowing Logics in Representation Space

120

2

0

10 Oct 2025

AV-EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Omni-modal LLMS with Audio-visual Cues

...

Huang-Cheng Chou

Gopala Anumanchipalli

155

5

0

08 Oct 2025

Internal states before wait modulate reasoning patterns

Dmitrii Troitskii

Callum McDougall

101

1

1

05 Oct 2025

MLLMEraser: Achieving Test-Time Unlearning in Multimodal Large Language Models through Activation Steering

249

0

0

05 Oct 2025

ThinKV: Thought-Adaptive KV Cache Compression for Efficient Reasoning Models

Akshat Ramachandran

Rangharajan Venkatesan

Brucek Khailany

150

1

1

01 Oct 2025

Enhancing LLM Steering through Sparse Autoencoder-Based Vector Refinement

183

0

0

28 Sep 2025

From Reasoning to Answer: Empirical, Attention-Based and Mechanistic Insights into Distilled DeepSeek R1 Models

Saravan Rajmohan

108

0

0

28 Sep 2025

Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions

375

0

0

28 Sep 2025

RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs

Kohsei Matsutani

Shota Takashiro

Gouki Minegishi

207

6

0

25 Sep 2025

DISCO: Disentangled Communication Steering for Large Language Models

182

0

0

20 Sep 2025

Small Vectors, Big Effects: A Mechanistic Study of RL-Induced Reasoning via Steering Vectors

Viacheslav Sinii

Nikita Balagansky

Yaroslav Aksenov

Vadim Kurochkin

Alexey Gorbatovski

Boris Shaposhnikov

Daniil Gavrilov

183

1

0

08 Sep 2025

Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models

Stephen H. Bach

254

8

0

16 Jul 2025

PII Jailbreaking in LLMs via Activation Steering Reveals Personal Information Leakage

Krishna Kanth Nakka

250

0

0

03 Jul 2025

Adversarial Manipulation of Reasoning Models using Internal Representations

Kureha Yamaguchi

Benjamin Etheridge

141

3

0

03 Jul 2025

Thought Anchors: Which LLM Reasoning Steps Matter?

364

50

0

23 Jun 2025

Latent Concept Disentanglement in Transformer-based Language Models

Bhavya Vasudeva

Willie Neiswanger

Cyrus Rashtchian

Prabhakar Raghavan

341

2

0

20 Jun 2025

AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint

154

7

0

08 Jun 2025

Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties

Gouki Minegishi

1.1K

12

0

06 Jun 2025

Steering LLM Reasoning Through Bias-Only Adaptation

Viacheslav Sinii

Alexey Gorbatovski

Artem Cherepanov

Boris Shaposhnikov

Nikita Balagansky

Daniil Gavrilov

278

2

0

24 May 2025

The Geometry of Self-Verification in a Task-Specific Reasoning Model

Fernanda Viégas

Martin Wattenberg

423

3

0

19 Apr 2025

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

...

OffRL AI4TS LRM ReLM VLM

1.2K

5,342

0

22 Jan 2025
