EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity

29 July 2025
Xingjian Zhang
Siwei Wen
Wenjun Wu
Lei Huang
ArXiv (abs) | PDF | HTML | HuggingFace (6 upvotes) | GitHub (17★)

Papers citing "EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity"

7 / 7 papers shown
Beyond High-Entropy Exploration: Correctness-Aware Low-Entropy Segment-Based Advantage Shaping for Reasoning LLMs
Xinzhu Chen
Xuesheng Li
Zhongxiang Sun
Weijie Yu
LRM
108
1
0
30 Nov 2025
Arbitrary Entropy Policy Optimization Breaks The Exploration Bottleneck of Reinforcement Learning
Chen Wang
Ruoyao Xiao
Jionghao Bai
Yuzhi Zhang
Shisheng Cui
Zhou Zhao
Yue Wang
369
0
0
09 Oct 2025
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Yoonjeon Kim
Doohyuk Jang
Eunho Yang
ReLM, AIFin, LRM
202
1
0
26 Sep 2025
FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning
Yizhou Zhang
Ning Lv
T. Wang
Jisheng Dang
OffRL, LRM
131
1
0
26 Sep 2025
From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature
Zheng Liu
Mengjie Liu
Siwei Wen
Mengzhang Cai
Bin Cui
Conghui He
W. Zhang
AAML
82
3
0
20 Sep 2025
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents
Jiawei Wang
Jiacai Liu
Y. Fu
Y. Li
Xintao Wang
Yuan Lin
Yu Yue
L. Zhang
Y. X. R. Wang
Ke Wang
156
12
0
11 Sep 2025
Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers
Zhiyuan Peng
Ting-ruen Wei
Tingyu Song
Yilun Zhao
243
0
0
08 Jul 2025