arXiv: 2410.06238
EVOLvE: Evaluating and Optimizing LLMs For In-Context Exploration
Versions: v1, v2 (latest)

8 October 2024
Allen Nie
Yi Su
B. Chang
Jonathan N. Lee
Ed H. Chi
Quoc Le
Minmin Chen

Papers citing "EVOLvE: Evaluating and Optimizing LLMs For In-Context Exploration"

10 / 10 papers shown
Provably Learning from Language Feedback
Wanqiao Xu
Allen Nie
Ruijie Zheng
Aditya Modi
Adith Swaminathan
Ching-An Cheng
360
2
0
12 Jun 2025
e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs
Amrith Rajagopal Setlur
Matthew Y. R. Yang
Charlie Snell
Jeremy Greer
Ian Wu
Virginia Smith
Max Simchowitz
Aviral Kumar
LRM
259
26
0
10 Jun 2025
LLM-First Search: Self-Guided Exploration of the Solution Space
Nathan Herr
Tim Rocktäschel
Roberta Raileanu
LRM
309
1
0
05 Jun 2025
Reward Is Enough: LLMs Are In-Context Reinforcement Learners
Kefan Song
Amir Moeini
Peng Wang
Lei Gong
Rohan Chandra
Yanjun Qi
Shangtong Zhang
ReLM, LRM
331
3
0
21 May 2025
Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM
Thang Duong
Minglai Yang
Chicheng Zhang
OffRL
296
3
0
16 May 2025
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Experiments
Ziyuan Zhang
Darcy Wang
Ningyuan Chen
Rodrigo Mansur
Vahid Sarhangian
392
0
0
15 May 2025
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
406
10
0
29 Apr 2025
Prompt Optimization with Logged Bandit Data
Haruka Kiyohara
Daniel Yiming Cao
Yuta Saito
Thorsten Joachims
398
0
0
03 Apr 2025
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
Yuxiao Qu
Matthew Y. R. Yang
Amrith Rajagopal Setlur
Lewis Tunstall
E. Beeching
Ruslan Salakhutdinov
Aviral Kumar
OffRL
398
82
0
10 Mar 2025
Can large language models explore in-context?
Neural Information Processing Systems (NeurIPS), 2024
Akshay Krishnamurthy
Keegan Harris
Dylan J. Foster
Cyril Zhang
Aleksandrs Slivkins
LM&Ro, LLMAG, LRM
578
53
0
22 Mar 2024