AI Alignment and Social Choice: Fundamental Limitations and Policy Implications
Abhilash Mishra
24 October 2023

Papers citing "AI Alignment and Social Choice: Fundamental Limitations and Policy Implications"

17 papers shown.

1. Synthetic media and computational capitalism: towards a critical theory of artificial intelligence
   David M. Berry · 22 Mar 2025 · Citations: 0

2. Balancing Innovation and Integrity: AI Integration in Liberal Arts College Administration
   Ian Olivo Read · 20 Feb 2025 · Citations: 0

3. Game Theory Meets Large Language Models: A Systematic Survey
   Haoran Sun, Yusen Wu, Yukun Cheng, Xu Chu (LM&MA, OffRL, AI4CE) · 13 Feb 2025 · Citations: 1

4. Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
   Benjamin Feuer, Micah Goldblum, Teresa Datta, Sanjana Nambiar, Raz Besaleli, Samuel Dooley, Max Cembalest, John P. Dickerson (ALM) · 28 Jan 2025 · Citations: 0

5. Can an AI Agent Safely Run a Government? Existence of Probably Approximately Aligned Policies
   Frédéric Berdoz, Roger Wattenhofer · 21 Nov 2024 · Citations: 0

6. Representative Social Choice: From Learning Theory to AI Alignment
   Tianyi Qiu (FedML) · 31 Oct 2024 · Citations: 2

7. Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI
   Hadassah Harland, Richard Dazeley, Peter Vamplew, Hashini Senaratne, Bahareh Nakisa, Francisco Cruz · 31 Oct 2024 · Citations: 2

8. Beyond Preferences in AI Alignment
   Tan Zhi-Xuan, Micah Carroll, Matija Franklin, Hal Ashton · 30 Aug 2024 · Citations: 16

9. Improving Context-Aware Preference Modeling for Language Models
   Silviu Pitis, Ziang Xiao, Nicolas Le Roux, Alessandro Sordoni · 20 Jul 2024 · Citations: 8

10. Axioms for AI Alignment from Human Feedback
    Luise Ge, Daniel Halpern, Evi Micha, Ariel D. Procaccia, Itai Shapira, Yevgeniy Vorobeychik, Junlin Wu · 23 May 2024 · Citations: 15

11. Mapping Social Choice Theory to RLHF
    Jessica Dai, Eve Fleisig · 19 Apr 2024 · Citations: 11

12. Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
    Vincent Conitzer, Rachel Freedman, J. Heitzig, Wesley H. Holliday, Bob M. Jacobs, ..., Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, W. Zwicker · 16 Apr 2024 · Citations: 28

13. A Roadmap to Pluralistic Alignment
    Taylor Sorensen, Jared Moore, Jillian R. Fisher, Mitchell L. Gordon, Niloofar Mireshghallah, ..., Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi · 07 Feb 2024 · Citations: 80

14. Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF
    Anand Siththaranjan, Cassidy Laidlaw, Dylan Hadfield-Menell · 13 Dec 2023 · Citations: 55

15. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
    Stephen Casper, Xander Davies, Claudia Shi, T. Gilbert, Jérémy Scheurer, ..., Erdem Biyik, Anca Dragan, David M. Krueger, Dorsa Sadigh, Dylan Hadfield-Menell (ALM, OffRL) · 27 Jul 2023 · Citations: 472

16. Training language models to follow instructions with human feedback
    Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe (OSLM, ALM) · 04 Mar 2022 · Citations: 11,915

17. Fine-Tuning Language Models from Human Preferences
    Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving (ALM) · 18 Sep 2019 · Citations: 1,587