Learning to Deliberate: Meta-policy Collaboration for Agentic LLMs with Multi-agent Reinforcement Learning

4 September 2025
Wei Yang
Jesse Thomason
arXiv: 2509.03817

Papers citing "Learning to Deliberate: Meta-policy Collaboration for Agentic LLMs with Multi-agent Reinforcement Learning"

3 / 3 papers shown
Maestro: Learning to Collaborate via Conditional Listwise Policy Optimization for Multi-Agent LLMs
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Annals), 2025
Wei Yang
Jiacheng Pang
Shixuan Li
P. Bogdan
Stephen Tu
Jesse Thomason
LLMAG
08 Nov 2025
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Yoonjeon Kim
Doohyuk Jang
Eunho Yang
ReLM, AIFin, LRM
26 Sep 2025
A Survey of Reasoning and Agentic Systems in Time Series with Large Language Models
Ching Chang
Yidan Shi
Defu Cao
Wei Yang
Jeehyun Hwang
...
Jiacheng Pang
Wei-Yao Wang
Yan Liu
Wen-Chih Peng
Tien-Fu Chen
AI4TS, LRM
15 Sep 2025