Societal Alignment Frameworks Can Improve LLM Alignment

27 February 2025
Karolina Stańczak
Nicholas Meade
Mehar Bhatia
Hattie Zhou
Konstantin Böttinger
Jeremy Barnes
Jason Stanley
Jessica Montgomery
Richard Zemel
Nicolas Papernot
Nicolas Chapados
Denis Therien
Timothy P. Lillicrap
Ana Marasović
Sylvie Delacroix
Gillian K. Hadfield
Siva Reddy
Abstract

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values, a process termed alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead to misspecified objectives, reflecting the broader issue of incomplete contracts: the impracticality of specifying a contract between a model developer and the model that accounts for every scenario in LLM alignment. In this paper, we argue that improving LLM alignment requires incorporating insights from societal alignment frameworks, including social, economic, and contractual alignment, and we discuss potential solutions drawn from these domains. Given the role of uncertainty within societal alignment frameworks, we then investigate how it manifests in LLM alignment. We end our discussion by offering an alternative view on LLM alignment, framing the underspecified nature of its objectives as an opportunity rather than attempting to perfect their specification. Beyond technical improvements in LLM alignment, we discuss the need for participatory alignment interface designs.
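
The incomplete-contracts framing above can be made concrete with a small sketch. The Python example below is not from the paper; the proxy_reward and true_utility functions and all scores are hypothetical, chosen only to illustrate how optimizing a partially specified objective can select outputs that the unwritten remainder of the "contract" would reject.

from dataclasses import dataclass

@dataclass
class Response:
    text: str
    helpfulness: float  # the property the proxy reward manages to measure
    honesty: float      # a value the proxy omits -- the "incomplete" part of the contract

def proxy_reward(r: Response) -> float:
    # What the developer actually specified (hypothetical).
    return r.helpfulness

def true_utility(r: Response) -> float:
    # What users actually value; never fully written into the objective (hypothetical weights).
    return 0.5 * r.helpfulness + 0.5 * r.honesty

candidates = [
    Response("confident but fabricated answer", helpfulness=0.9, honesty=0.1),
    Response("hedged, accurate answer", helpfulness=0.7, honesty=0.9),
]

proxy_optimal = max(candidates, key=proxy_reward)
truly_best = max(candidates, key=true_utility)

# Optimizing the proxy selects the fabricated answer (true utility 0.50),
# while the hedged answer is truly better (true utility 0.80): the
# misspecified objective and the intended one come apart exactly where
# the contract is silent.
print(f"proxy-optimal: {proxy_optimal.text!r} (true utility {true_utility(proxy_optimal):.2f})")
print(f"truly best:    {truly_best.text!r} (true utility {true_utility(truly_best):.2f})")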

@article{stańczak2025_2503.00069,
  title={Societal Alignment Frameworks Can Improve LLM Alignment},
  author={Karolina Stańczak and Nicholas Meade and Mehar Bhatia and Hattie Zhou and Konstantin Böttinger and Jeremy Barnes and Jason Stanley and Jessica Montgomery and Richard Zemel and Nicolas Papernot and Nicolas Chapados and Denis Therien and Timothy P. Lillicrap and Ana Marasović and Sylvie Delacroix and Gillian K. Hadfield and Siva Reddy},
  journal={arXiv preprint arXiv:2503.00069},
  year={2025}
}