Probing the Vulnerability of Large Language Models to Polysemantic Interventions

16 May 2025 · Bofan Gong, Shiyang Lai, Dawn Song
Communities: AAML, MILM
arXiv: 2505.11611

Papers citing "Probing the Vulnerability of Large Language Models to Polysemantic Interventions"

4 papers:

1. Layers at Similar Depths Generate Similar Activations Across LLM Architectures
   Christopher Wolfram, Aaron Schein
   03 Apr 2025

2. LLM Social Simulations Are a Promising Research Method
   Jacy Reese Anthis, Ryan Liu, Sean M. Richardson, Austin C. Kozlowski, Bernard Koch, James A. Evans, Erik Brynjolfsson, Michael S. Bernstein
   Communities: ALM
   03 Apr 2025

3. Shared Global and Local Geometry of Language Model Embeddings
   Andrew Lee, Melanie Weber, F. Viégas, Martin Wattenberg
   Communities: FedML
   27 Mar 2025

4. Sparse Autoencoders Can Interpret Randomly Initialized Transformers
   Thomas Heap, Tim Lawson, Lucy Farnik, Laurence Aitchison
   29 Jan 2025