Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.00087
Cited By
ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
25 June 2024
Ju-Seung Byun
Jiyun Chun
Jihyung Kil
Andrew Perrault
ReLM
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback"
6 / 6 papers shown
Title
Improving LLM-as-a-Judge Inference with the Judgment Distribution
Victor Wang
Michael J.Q. Zhang
Eunsol Choi
49
0
0
04 Mar 2025
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trkebacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
220
495
0
28 Sep 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex Tamkin
Miles Brundage
Jack Clark
Deep Ganguli
AILaw
ELM
192
248
0
04 Feb 2021
1