Sherlock: Self-Correcting Reasoning in Vision-Language Models

Papers citing "Sherlock: Self-Correcting Reasoning in Vision-Language Models"

Nash Learning from Human Feedback
Rémi Munos, Michal Valko, Daniele Calandriello, M. G. Azar, Mark Rowland, ..., Nikola Momchev, Olivier Bachem, D. Mankowitz, Doina Precup, Bilal Piot
76
137
0
01 Dec 2023