ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2209.14610
  4. Cited By
Dynamic Prompt Learning via Policy Gradient for Semi-structured
  Mathematical Reasoning

Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

29 September 2022
Pan Lu
Liang Qiu
Kai-Wei Chang
Ying Nian Wu
Song-Chun Zhu
Tanmay Rajpurohit
Peter Clark
A. Kalyan
    ReLM
    LRM
ArXivPDFHTML

Papers citing "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning"

48 / 48 papers shown
Title
LLM-based Automated Grading with Human-in-the-Loop
LLM-based Automated Grading with Human-in-the-Loop
Hang Li
Yucheng Chu
Kaiqi Yang
Yasemin Copur-Gencturk
Jiliang Tang
AI4Ed
ELM
59
0
0
07 Apr 2025
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks
Jiawei Wang
Yushen Zuo
Yuanjun Chai
Z. Liu
Yichen Fu
Yichun Feng
Kin-Man Lam
AAML
VLM
40
0
0
02 Apr 2025
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Jiayi Ji
Jie Lou
Debing Zhang
Rongrong Ji
90
0
0
26 Mar 2025
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
Wenxuan Huang
Bohan Jia
Zijie Zhai
Shaosheng Cao
Zheyu Ye
Fei Zhao
Zhe Xu
Yao Hu
Shaohui Lin
MU
OffRL
LRM
MLLM
ReLM
VLM
55
37
0
09 Mar 2025
Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?
Kun Xiang
Zhili Liu
Zihao Jiang
Yunshuang Nie
Kaixin Cai
...
Yu-Jie Yuan
J. Han
Lanqing Hong
Hang Xu
Xiaodan Liang
ReLM
LRM
51
6
0
08 Mar 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
Qingpei Guo
Kaiyou Song
Zipeng Feng
Ziping Ma
Qinglong Zhang
...
Yunxiao Sun
Tai-WeiChang
Jingdong Chen
Ming Yang
Jun Zhou
MLLM
VLM
77
3
0
26 Feb 2025
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation
Fanhu Zeng
Haiyang Guo
Fei Zhu
Li Shen
Hao Tang
MoMe
49
1
0
24 Feb 2025
Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
Fan Zhou
Zengzhi Wang
Qian Liu
Junlong Li
Pengfei Liu
ALM
100
14
0
17 Feb 2025
Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence
Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence
Granite Vision Team
Leonid Karlinsky
Assaf Arbelle
Abraham Daniels
A. Nassar
...
Sriram Raghavan
T. Syeda-Mahmood
Peter W. J. Staar
Tal Drory
Rogerio Feris
VLM
AI4TS
102
0
0
14 Feb 2025
Rationalization Models for Text-to-SQL
Rationalization Models for Text-to-SQL
Gaetano Rossiello
Nhan Pham
Michael R. Glass
Junkyu Lee
Shankar Subramanian
ReLM
LRM
50
0
0
10 Feb 2025
Baichuan-Omni-1.5 Technical Report
Yadong Li
J. Liu
Tao Zhang
Tao Zhang
S. Chen
...
Jianhua Xu
Haoze Sun
Mingan Lin
Zenan Zhou
Weipeng Chen
AuLLM
67
10
0
28 Jan 2025
Can ChatGPT Overcome Behavioral Biases in the Financial Sector? Classify-and-Rethink: Multi-Step Zero-Shot Reasoning in the Gold Investment
Can ChatGPT Overcome Behavioral Biases in the Financial Sector? Classify-and-Rethink: Multi-Step Zero-Shot Reasoning in the Gold Investment
Shuoling Liu
Gaoguo Jia
Yuhang Jiang
Liyuan Chen
Qiang Yang
AIFin
LRM
89
0
0
17 Jan 2025
Mathematical Language Models: A Survey
Mathematical Language Models: A Survey
W. Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
77
12
0
03 Jan 2025
Chimera: Improving Generalist Model with Domain-Specific Experts
Chimera: Improving Generalist Model with Domain-Specific Experts
Tianshuo Peng
M. Li
Hongbin Zhou
Renqiu Xia
Renrui Zhang
...
Aojun Zhou
Botian Shi
Tao Chen
Bo Zhang
Xiangyu Yue
86
4
0
08 Dec 2024
Dynamic Uncertainty Ranking: Enhancing Retrieval-Augmented In-Context Learning for Long-Tail Knowledge in LLMs
Dynamic Uncertainty Ranking: Enhancing Retrieval-Augmented In-Context Learning for Long-Tail Knowledge in LLMs
Shuyang Yu
Runxue Bao
Parminder Bhatia
Taha A. Kass-Hout
Jiayu Zhou
Cao Xiao
26
1
0
31 Oct 2024
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
W. Xu
Rujun Han
Z. Wang
L. Le
Dhruv Madeka
Lei Li
W. Wang
Rishabh Agarwal
Chen-Yu Lee
Tomas Pfister
69
8
0
15 Oct 2024
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo
Xue Yang
Wenhan Dou
Zhaokai Wang
Jifeng Dai
Jifeng Dai
Yu Qiao
Xizhou Zhu
VLM
MLLM
59
25
0
10 Oct 2024
Differential Transformer
Differential Transformer
Tianzhu Ye
Li Dong
Yuqing Xia
Yutao Sun
Yi Zhu
Gao Huang
Furu Wei
50
0
0
07 Oct 2024
GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning
GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning
Jiale Fu
Yaqing Wang
Simeng Han
Jiaming Fan
Chen Si
23
1
0
03 Oct 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
75
5
0
11 Sep 2024
Visual Agents as Fast and Slow Thinkers
Visual Agents as Fast and Slow Thinkers
Guangyan Sun
Mingyu Jin
Zhenting Wang
Cheng-Long Wang
Siqi Ma
Qifan Wang
Ying Nian Wu
Ying Nian Wu
Dongfang Liu
Dongfang Liu
LLMAG
LRM
74
12
0
16 Aug 2024
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Yuxin Jiang
Bo Huang
Yufei Wang
Xingshan Zeng
Liangyou Li
Yasheng Wang
Xin Jiang
Lifeng Shang
Ruiming Tang
Wei Wang
42
5
0
14 Aug 2024
Demonstration Notebook: Finding the Most Suited In-Context Learning
  Example from Interactions
Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions
Yiming Tang
Bin Dong
29
0
0
16 Jun 2024
Concentrate Attention: Towards Domain-Generalizable Prompt Optimization
  for Language Models
Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
Chengzhengxu Li
Xiaoming Liu
Zhaohan Zhang
Yichen Wang
Chen Liu
Y. Lan
Chao Shen
35
2
0
15 Jun 2024
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal
  Large Language Models
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
Tianle Gu
Zeyang Zhou
Kexin Huang
Dandan Liang
Yixu Wang
...
Keqing Wang
Yujiu Yang
Yan Teng
Yu Qiao
Yingchun Wang
ELM
37
9
0
11 Jun 2024
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Joongwon Kim
Bhargavi Paranjape
Tushar Khot
Hannaneh Hajishirzi
LM&Ro
ELM
LLMAG
LRM
29
8
0
10 Jun 2024
Large Language Models Meet NLP: A Survey
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Yinghui Li
Min Li
Wanxiang Che
Philip S. Yu
ALM
LM&MA
ELM
LRM
38
44
0
21 May 2024
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language
  Models
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models
Wenqi Fan
Yujuan Ding
Liang-bo Ning
Shijie Wang
Hengyun Li
Dawei Yin
Tat-Seng Chua
Qing Li
RALM
3DV
38
178
0
10 May 2024
Distilling Reasoning Ability from Large Language Models with Adaptive
  Thinking
Distilling Reasoning Ability from Large Language Models with Adaptive Thinking
Xiao Chen
Sihang Zhou
K. Liang
Xinwang Liu
ReLM
LRM
27
2
0
14 Apr 2024
SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical
  Reasoning in Large Language Models
SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models
Hyeonwoo Kim
Gyoungjin Gim
Yungi Kim
Jihoo Kim
Byungju Kim
Wonseok Lee
Chanjun Park
ReLM
LRM
29
1
0
05 Apr 2024
PURPLE: Making a Large Language Model a Better SQL Writer
PURPLE: Making a Large Language Model a Better SQL Writer
Tonghui Ren
Yuankai Fan
Zhenying He
Ren Huang
Jiaqi Dai
Can Huang
Yinan Jing
Kai Zhang
Yifan Yang
Xiaoyang Sean Wang
21
21
0
29 Mar 2024
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring
  Mathematical Reasoning of Large Language Models
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models
Yanan Wu
Jie Liu
Xingyuan Bu
Jiaheng Liu
Zhanhui Zhou
...
Haibin Chen
Tiezheng Ge
Wanli Ouyang
Wenbo Su
Bo Zheng
LRM
27
6
0
22 Feb 2024
Demystifying Chains, Trees, and Graphs of Thoughts
Demystifying Chains, Trees, and Graphs of Thoughts
Maciej Besta
Florim Memedi
Zhenyu Zhang
Robert Gerstenberger
Guangyuan Piao
...
Aleš Kubíček
H. Niewiadomski
Aidan O'Mahony
Onur Mutlu
Torsten Hoefler
AI4CE
LRM
63
26
0
25 Jan 2024
Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on
  Open-Source Model
Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source Model
Cheng Qian
Chenyan Xiong
Zhenghao Liu
Zhiyuan Liu
LRM
24
12
0
08 Oct 2023
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language
  Models
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models
Cheng-Yu Hsieh
Sibei Chen
Chun-Liang Li
Yasuhisa Fujii
Alexander Ratner
Chen-Yu Lee
Ranjay Krishna
Tomas Pfister
LLMAG
SyDa
29
40
0
01 Aug 2023
Counterfactual Explanation Policies in RL
Counterfactual Explanation Policies in RL
Shripad Deshmukh
R Srivatsan
Supriti Vijay
Jayakumar Subramanian
Chirag Agarwal
OffRL
14
0
0
25 Jul 2023
AutoHint: Automatic Prompt Optimization with Hint Generation
AutoHint: Automatic Prompt Optimization with Hint Generation
Hong Sun
Xue Li
Yi Xu
Youkow Homma
Qinhao Cao
Min-man Wu
Jian Jiao
Denis Xavier Charles
27
23
0
13 Jul 2023
Reasoning with Language Model Prompting: A Survey
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao
Yixin Ou
Ningyu Zhang
Xiang Chen
Yunzhi Yao
Shumin Deng
Chuanqi Tan
Fei Huang
Huajun Chen
ReLM
ELM
LRM
44
307
0
19 Dec 2022
UniGeo: Unifying Geometry Logical Reasoning via Reformulating
  Mathematical Expression
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
Jiaqi Chen
Tong Li
Jinghui Qin
Pan Lu
Liang Lin
Chongyu Chen
Xiaodan Liang
AIMat
LRM
37
89
0
06 Dec 2022
Program of Thoughts Prompting: Disentangling Computation from Reasoning
  for Numerical Reasoning Tasks
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Wenhu Chen
Xueguang Ma
Xinyi Wang
William W. Cohen
ReLM
ReCod
LRM
41
729
0
22 Nov 2022
Improving alignment of dialogue agents via targeted human judgements
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trkebacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
225
495
0
28 Sep 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science
  Question Answering
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
207
1,089
0
20 Sep 2022
Large Language Models are Zero-Shot Reasoners
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Fantastically Ordered Prompts and Where to Find Them: Overcoming
  Few-Shot Prompt Order Sensitivity
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
274
1,114
0
18 Apr 2021
What Makes Good In-Context Examples for GPT-$3$?
What Makes Good In-Context Examples for GPT-333?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML
RALM
275
1,296
0
17 Jan 2021
1