
Demystifying optimized prompts in language models

4 May 2025
Rimon Melamed
Lucas H. McCabe
H. H. Huang
Abstract

Modern language models (LMs) are not robust to out-of-distribution inputs. Machine-generated ("optimized") prompts can be used to modulate LM outputs and induce specific behaviors while appearing completely uninterpretable. In this work, we investigate the composition of optimized prompts, as well as the mechanisms by which LMs parse and build predictions from optimized prompts. We find that optimized prompts consist primarily of punctuation and noun tokens that are rarer in the training data. Internally, optimized prompts are clearly distinguishable from their natural-language counterparts based on sparse subsets of the model's activations. Across various families of instruction-tuned models, optimized prompts follow a similar path in how their representations form through the network.
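As an illustration of the token-composition finding, the short Python sketch below tags a prompt's tokens and reports the fraction of punctuation and noun tokens. This is not the authors' code; the NLTK-based tokenization and tagging, and the example prompts, are assumptions made purely for demonstration.

# Illustrative sketch only (not the paper's method): estimate the fraction of
# punctuation-only and noun tokens in a prompt, the composition property the
# abstract attributes to optimized prompts. NLTK tagging and the example
# prompts below are assumptions made for demonstration.
import string
import nltk

# Fetch tokenizer/tagger resources (names vary slightly across NLTK versions).
for resource in ("punkt", "punkt_tab",
                 "averaged_perceptron_tagger", "averaged_perceptron_tagger_eng"):
    nltk.download(resource, quiet=True)

def token_composition(prompt: str) -> dict:
    """Return the fraction of punctuation-only and noun tokens in `prompt`."""
    tokens = nltk.word_tokenize(prompt)
    tags = nltk.pos_tag(tokens)
    total = max(len(tokens), 1)
    punct = sum(1 for tok in tokens if all(ch in string.punctuation for ch in tok))
    nouns = sum(1 for _, tag in tags if tag.startswith("NN"))
    return {"punct_frac": punct / total, "noun_frac": nouns / total}

# A natural-language prompt vs. a made-up string mimicking an optimized prompt.
print(token_composition("Please summarize the following article in two sentences."))
print(token_composition('"); ]] lexicon ->> Table !! tokens ### corpus'))

Under this sketch's assumptions, the second (optimized-looking) string yields a markedly higher punctuation fraction, mirroring the compositional pattern the abstract describes.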

View on arXiv: https://arxiv.org/abs/2505.02273
@article{melamed2025_2505.02273,
  title={Demystifying optimized prompts in language models},
  author={Rimon Melamed and Lucas H. McCabe and H. Howie Huang},
  journal={arXiv preprint arXiv:2505.02273},
  year={2025}
}