Discovering Preference Optimization Algorithms with and for Large Language Models

12 June 2024
Chris Xiaoxuan Lu
Samuel Holt
Claudio Fanconi
Alex J. Chan
Jakob Foerster
M. Schaar
R. T. Lange
    OffRL

Papers citing "Discovering Preference Optimization Algorithms with and for Large Language Models"

14 / 14 papers shown
MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?
Yunxiang Zhang
Muhammad Khalifa
Shitanshu Bhushan
Grant D Murphy
Lajanugen Logeswaran
Jaekyeom Kim
Moontae Lee
Honglak Lee
Lu Wang
LLMAG
ELM
62
0
0
13 Apr 2025
Automating quantum feature map design via large language models
Kenya Sakka
K. Mitarai
Keisuke Fujii
25
0
0
10 Apr 2025
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning
Anja Surina
Amin Mansouri
Lars Quaedvlieg
Amal Seddas
Maryna Viazovska
Emmanuel Abbe
Çağlar Gülçehre
31
0
0
07 Apr 2025
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Kai Ye
Hongyi Zhou
Jin Zhu
Francesco Quinzan
C. Shi
20
0
0
03 Apr 2025
LLM-Guided Search for Deletion-Correcting Codes
Franziska Weindel
Reinhard Heckel
LRM
56
0
0
01 Apr 2025
Generative Adversarial Reviews: When LLMs Become the Critic
Nicolas Bougie
Narimasa Watanabe
65
2
0
09 Dec 2024
A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection
Gabriel Chua
Shing Yee Chan
Shaun Khoo
75
1
0
20 Nov 2024
RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation
Ian Poey
Jiajun Liu
Qishuai Zhong
Adrien Chenailler
50
0
0
06 Nov 2024
Automatically Learning Hybrid Digital Twins of Dynamical Systems
Samuel Holt
Tennison Liu
M. Schaar
AI4CE
18
2
0
31 Oct 2024
Automating Traffic Model Enhancement with AI Research Agent
Xusen Guo
Xinxi Yang
Mingxing Peng
Hongliang Lu
Meixin Zhu
Hai Yang
60
0
0
25 Sep 2024
Automated Design of Agentic Systems
Shengran Hu
Cong Lu
Jeff Clune
AI4CE
34
36
0
15 Aug 2024
Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son
William Bankes
Sayak Ray Chowdhury
Brooks Paige
Ilija Bogunovic
27
4
0
26 Jul 2024
Gemma: Open Models Based on Gemini Research and Technology
Gemma Team
Thomas Mesnard
Cassidy Hardin
Robert Dadashi
Surya Bhupatiraju
...
Armand Joulin
Noah Fiedel
Evan Senter
Alek Andreev
Kathleen Kenealy
VLM
LLMAG
123
415
0
13 Mar 2024
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,798
0
14 Dec 2020