Generalizing Verifiable Instruction Following

3 July 2025
Valentina Pyatkin
Saumya Malik
Victoria Graf
Michal Guerquin
Shengyi Huang
Pradeep Dasigi
Nathan Lambert
Hannaneh Hajishirzi
    ALM

Papers citing "Generalizing Verifiable Instruction Following"

14 / 14 papers shown
Vibe Checker: Aligning Code Evaluation with Human Preference
Ming Zhong
Xiang Zhou
T. Chang
Q. Wang
Nan Xu
...
Shyam Upadhyay
Jeremiah Zhe Liu
Jiawei Han
Benoit Schillings
Jiao Sun
16
0
0
08 Oct 2025
OpenJAI-v1.0: An Open Thai Large Language Model
Pontakorn Trakuekul
Attapol T. Rutherford
Jullajak Karnjanaekarin
Narongkorn Panitsrisit
Sumana Sumanakul
OSLM VLM LRM
48
0
0
08 Oct 2025
Game-Time: Evaluating Temporal Dynamics in Spoken Language Models
Kai-Wei Chang
En-Pei Hu
Chun-Yi Kuan
Wenze Ren
Wei-Chih Chen
Guan-Ting Lin
Yu Tsao
Shao-Hua Sun
Hung-yi Lee
James R. Glass
AuLLM
7
2
0
30 Sep 2025
Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards
Aaron Tu
Weihao Xuan
Heli Qi
X. Y. Huang
Qingcheng Zeng
...
Amin Saberi
Naoto Yokoya
Jure Leskovec
Yejin Choi
Fang Wu
OffRL
8
0
0
26 Sep 2025
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards
Zhilin Wang
Jiaqi Zeng
Olivier Delalleau
Ellie Evans
Daniel Egert
Hoo-Chang Shin
Felipe Soares
Yi Dong
Oleksii Kuchaiev
OffRL
12
0
0
25 Sep 2025
The role of synthetic data in Multilingual, Multi-cultural AI systems: Lessons from Indic Languages
Pranjal A. Chitale
Varun Gumma
Sanchit Ahuja
Prashant Kodali
Manan Uppadhyay
Deepthi Sudharsan
Sunayana Sitaram
SyDa
40
0
0
25 Sep 2025
Maestro: Joint Graph & Config Optimization for Reliable AI Agents
Wenxiao Wang
Priyatham Kattakinda
Soheil Feizi
LLMAG
8
0
0
04 Sep 2025
Hermes 4 Technical Report
Ryan Teknium
Roger Jin
Jai Suphavadeeprasit
Dakota Mahan
Jeffrey Quesnelle
Joe Li
Chen Guang
Shannon Sands
Karan Malhotra
62
1
0
25 Aug 2025
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
Yang Zhou
Sunzhu Li
Shunyu Liu
Wenkai Fang
Jiale Zhao
...
Hengtong Lu
Wei Chen
Yan Xie
Mingli Song
LRM
44
3
0
23 Aug 2025
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
Ziyang Luo
Zhiqi Shen
Wenzhuo Yang
Zirui Zhao
Prathyusha Jwalapuram
Amrita Saha
Doyen Sahoo
Silvio Savarese
Caiming Xiong
Junnan Li
ELM
60
8
0
20 Aug 2025
Complex Logical Instruction Generation
Mian Zhang
Shujian Liu
Sixun Dong
Ming Yin
Yebowen Hu
...
Song Wang
Sathish Indurthi
Haoyun Deng
Zhiyu Zoey Chen
Kaiqiang Song
LRM
50
1
0
12 Aug 2025
IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards
Xu Guo
Tianyi Liang
Tong Jian
Xiaogui Yang
Ling-I Wu
Chenhui Li
Z. Lu
Qipeng Guo
Kai Chen
90
2
0
06 Aug 2025
Beyond the Trade-off: Self-Supervised Reinforcement Learning for Reasoning Models' Instruction Following
Qingyu Ren
Qianyu He
Bowei Zhang
Jie Zeng
Jiaqing Liang
Yanghua Xiao
Weikang Zhou
Zeye Sun
Fei Yu
OffRL LRM
26
0
0
04 Aug 2025
Checklists Are Better Than Reward Models For Aligning Language Models
Vijay Viswanathan
Yanchao Sun
Shuang Ma
Xiang Kong
Meng Cao
Graham Neubig
Tongshuang Wu
ALM
110
10
0
24 Jul 2025