ResearchTrend.AI
Steering Large Language Model Activations in Sparse Spaces

28 February 2025
Reza Bayat
Ali Rahimi-Kalahroudi
Mohammad Pezeshki
Sarath Chandar
Pascal Vincent
    LLMSV
arXiv: 2503.00177

Papers citing "Steering Large Language Model Activations in Sparse Spaces"

8 / 8 papers shown
AI shares emotion with humans across languages and cultures
Xiuwen Wu
Hao Wang
Zhiang Yan
Xiaohan Tang
Pengfei Xu
Wai-Ting Siok
P. Li
Jia-Hong Gao
Bingjiang Lyu
Lang Qin
22
0
0
11 Jun 2025
Mitigating Spurious Correlations in LLMs via Causality-Aware Post-Training
Shurui Gui
Shuiwang Ji
LRM
63
0
0
11 Jun 2025
Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee
Aeree Cho
Grace C. Kim
ShengYun Peng
Mansi Phute
Duen Horng Chau
LM&MA, AI4CE
68
0
0
05 Jun 2025
STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models
Kai Chen
Zihao He
Taiwei Shi
Kristina Lerman
ALM, LLMSV
95
0
0
27 May 2025
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms
Mengru Wang
Ziwen Xu
Shengyu Mao
Shumin Deng
Zhaopeng Tu
Ningyu Zhang
N. Zhang
LLMSV
135
0
0
23 May 2025
Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration
Jingtong Gao
Ling Pan
Yejing Wang
Rui Zhong
Chi Lu
Qingpeng Cai
Peng Jiang
Xiangyu Zhao
LRM
85
1
0
23 May 2025
EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models
Ziwen Xu
Shuxun Wang
Kewei Xu
Haoming Xu
Mengru Wang
Xinle Deng
Yunzhi Yao
Guozhou Zheng
Ningyu Zhang
Xin Xu
KELM, LLMSV
479
1
0
21 Apr 2025
Robust Hallucination Detection in LLMs via Adaptive Token Selection
Mengjia Niu
Hamed Haddadi
Guansong Pang
HILM
107
0
0
10 Apr 2025