ResearchTrend.AI
arXiv: 2402.14830
Orca-Math: Unlocking the potential of SLMs in Grade School Math

16 February 2024
Arindam Mitra
Hamed Khanpour
Corby Rosset
Ahmed Hassan Awadallah
    ALM
    MoE
    LRM

Papers citing "Orca-Math: Unlocking the potential of SLMs in Grade School Math"

50 / 53 papers shown
Title
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model
Zuwei Long
Yunhang Shen
Chaoyou Fu
Heting Gao
Lijiang Li
...
Jinlong Peng
Haoyu Cao
Ke Li
R. Ji
Xing Sun
30
0
0
06 May 2025
RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library
J. Wang
Jinhao Jiang
Zhiqiang Zhang
Jun Zhou
Wayne Xin Zhao
SyDa
53
0
0
29 Apr 2025
Parameter-Efficient Checkpoint Merging via Metrics-Weighted Averaging
Shi Jie Yu
Sehyun Choi
MoMe
42
0
0
23 Apr 2025
a1: Steep Test-time Scaling Law via Environment Augmented Generation
Lingrui Mei
Shenghua Liu
Yiwei Wang
Baolong Bi
Yuyao Ge
Jun Wan
Yurong Wu
Xueqi Cheng
LRM
17
0
0
20 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
24
0
0
17 Apr 2025
Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang
Zichen Ding
Chang Ma
Zijie Chen
Qiushi Sun
Zhenzhong Lan
Junxian He
34
0
0
14 Apr 2025
Vision as LoRA
Han Wang
Yongjie Ye
Bingru Li
Yuxiang Nie
Jinghui Lu
Jingqun Tang
Yanjie Wang
Can Huang
83
0
0
26 Mar 2025
The KoLMogorov Test: Compression by Code Generation
Ori Yoran
Kunhao Zheng
Fabian Gloeckle
Jonas Gehring
Gabriel Synnaeve
Taco Cohen
58
1
0
18 Mar 2025
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
Tao Wang
Changxu Cheng
Lingfeng Wang
Senda Chen
Wuyue Zhao
VLM
64
0
0
17 Mar 2025
LLM Agents for Education: Advances and Applications
Zhendong Chu
Shen Wang
Jian Xie
Tinghui Zhu
Yibo Yan
...
Aoxiao Zhong
Xuming Hu
Jing Liang
Philip S. Yu
Qingsong Wen
LLMAG
ELM
103
1
0
14 Mar 2025
From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics
Jaewook Lee
Jeongah Lee
Wanyong Feng
Andrew S. Lan
48
0
0
10 Mar 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
Qingpei Guo
Kaiyou Song
Zipeng Feng
Ziping Ma
Qinglong Zhang
...
Yunxiao Sun
Tai-Wei Chang
Jingdong Chen
Ming Yang
Jun Zhou
MLLM
VLM
67
3
0
26 Feb 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
Alon Albalak
Duy Phung
Nathan Lile
Rafael Rafailov
Kanishk Gandhi
...
Anikait Singh
Chase Blagden
Violet Xiang
Dakota Mahan
Nick Haber
OffRL
LRM
45
4
0
24 Feb 2025
Preference Optimization for Reasoning with Pseudo Feedback
Fangkai Jiao
Geyang Guo
Xingxing Zhang
Nancy F. Chen
Shafiq R. Joty
Furu Wei
LRM
95
8
0
17 Feb 2025
InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
C. Xie
Shuo Cai
Wenjun Wang
Pengxiang Li
Zhijie Sang
...
Xiaotian Han
Jianbo Yuan
Shengyu Zhang
Fei Wu
Hongxia Yang
LRM
41
1
0
17 Feb 2025
AnyEdit: Edit Any Knowledge Encoded in Language Models
Houcheng Jiang
Junfeng Fang
Ningyu Zhang
Guojun Ma
Mingyang Wan
X. Wang
Xiangnan He
Tat-Seng Chua
KELM
46
8
0
08 Feb 2025
Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment
Haoyu Wang
Zeyu Qin
Li Shen
Xueqian Wang
Minhao Cheng
Dacheng Tao
66
1
0
06 Feb 2025
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
Markus J. Buehler
AI4CE
35
1
0
04 Jan 2025
Distill Visual Chart Reasoning Ability from LLMs to MLLMs
Wei He
Zhiheng Xi
Wanxu Zhao
Xiaoran Fan
Yiwen Ding
Zifei Shan
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
51
5
0
24 Oct 2024
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch
Yuyang Ding
Xinyu Shi
Xiaobo Liang
Juntao Li
Qiaoming Zhu
Min Zhang
ELM
AIMat
SyDa
LRM
16
8
0
24 Oct 2024
Taipan: Efficient and Expressive State Space Language Models with Selective Attention
Chien Van Nguyen
Huy Huu Nguyen
Thang M. Pham
Ruiyi Zhang
Hanieh Deilamsalehy
...
Ryan A. Rossi
Trung Bui
Viet Dac Lai
Franck Dernoncourt
Thien Huu Nguyen
Mamba
RALM
21
1
0
24 Oct 2024
RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals
Yuyang Miao
Zehua Chen
C. Li
Danilo P. Mandic
DiffM
MedIm
15
0
0
06 Oct 2024
House of Cards: Massive Weights in LLMs
Jaehoon Oh
Seungjun Shin
Dokwan Oh
32
1
0
02 Oct 2024
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Haotian Zhang
Mingfei Gao
Zhe Gan
Philipp Dufter
Nina Wenzel
...
Haoxuan You
Zirui Wang
Afshin Dehghan
Peter Grasch
Yinfei Yang
VLM
MLLM
36
32
1
30 Sep 2024
The Perfect Blend: Redefining RLHF with Mixture of Judges
Tengyu Xu
Eryk Helenowski
Karthik Abinav Sankararaman
Di Jin
Kaiyan Peng
...
Gabriel Cohen
Yuandong Tian
Hao Ma
Sinong Wang
Han Fang
23
9
0
30 Sep 2024
Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning
Jiaxin Wen
Jian Guan
Hongning Wang
Wei Wu
Minlie Huang
ReLM
OffRL
LRM
18
7
0
19 Sep 2024
NVLM: Open Frontier-Class Multimodal LLMs
Wenliang Dai
Nayeon Lee
Boxin Wang
Zhuoling Yang
Zihan Liu
Jon Barker
Tuomas Rintamaki
M. Shoeybi
Bryan Catanzaro
Wei Ping
MLLM
VLM
LRM
24
50
0
17 Sep 2024
MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model
Zhen Yang
Jinhao Chen
Zhengxiao Du
Wenmeng Yu
Weihan Wang
Wenyi Hong
Zhihuan Jiang
Bin Xu
Yuxiao Dong
Jie Tang
VLM
LRM
27
8
0
10 Sep 2024
Building and better understanding vision-language models: insights and future directions
Hugo Laurençon
Andrés Marafioti
Victor Sanh
Léo Tronchon
VLM
29
60
0
22 Aug 2024
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning
Wenwen Zhuang
Xin Huang
Xiantao Zhang
Jin Zeng
LRM
16
1
0
16 Aug 2024
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm
Yiming Liang
Ge Zhang
Xingwei Qu
Tianyu Zheng
Jiawei Guo
...
Jiaheng Liu
Chenghua Lin
Lei Ma
Wenhao Huang
Jiajun Zhang
ALM
20
5
0
15 Aug 2024
Leveraging Web-Crawled Data for High-Quality Fine-Tuning
Jing Zhou
Chenglin Jiang
Wei Shen
Xiao Zhou
Xiaonan He
ALM
29
1
0
15 Aug 2024
Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems
Tamer Abuelsaad
Deepak Akkil
Prasenjit Dey
Ashish Jagmohan
Aditya Vempaty
Ravi Kokku
26
23
0
17 Jul 2024
Case2Code: Scalable Synthetic Data for Code Generation
Yunfan Shao
Linyang Li
Yichuan Ma
Peiji Li
Demin Song
...
Qipeng Guo
Hang Yan
Xipeng Qiu
Xuanjing Huang
Dahua Lin
LRM
13
2
0
17 Jul 2024
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo
Florian E. Dorner
Moritz Hardt
ELM
44
6
1
10 Jul 2024
AgentInstruct: Toward Generative Teaching with Agentic Flows
Arindam Mitra
Luciano Del Corro
Guoqing Zheng
Shweti Mahajan
Dany Rouhana
...
Corby Rosset
Fillipe Silva
Hamed Khanpour
Yash Lara
Ahmed Awadallah
SyDa
25
23
0
03 Jul 2024
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
Xin Lai
Zhuotao Tian
Yukang Chen
Senqiao Yang
Xiangru Peng
Jiaya Jia
LRM
41
89
0
26 Jun 2024
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Shengbang Tong
Ellis L Brown
Penghao Wu
Sanghyun Woo
Manoj Middepogu
...
Xichen Pan
Austin Wang
Rob Fergus
Yann LeCun
Saining Xie
3DV
MLLM
37
206
0
24 Jun 2024
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
Yuxuan Tong
Xiwen Zhang
Rui Wang
R. Wu
Junxian He
AIMat
LRM
30
30
0
18 Jun 2024
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Zhiheng Xi
Yiwen Ding
Wenxiang Chen
Boyang Hong
Honglin Guo
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yu-Gang Jiang
LLMAG
LM&Ro
25
28
0
06 Jun 2024
UltraMedical: Building Specialized Generalists in Biomedicine
Kaiyan Zhang
Sihang Zeng
Ermo Hua
Ning Ding
Zhang-Ren Chen
...
Xuekai Zhu
Xingtai Lv
Hu Jinfang
Zhiyuan Liu
Bowen Zhou
LM&MA
28
19
0
06 Jun 2024
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
Tengyang Xie
Dylan J. Foster
Akshay Krishnamurthy
Corby Rosset
Ahmed Hassan Awadallah
Alexander Rakhlin
36
29
0
31 May 2024
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
Kun Zhou
Beichen Zhang
Jiapeng Wang
Zhipeng Chen
Wayne Xin Zhao
Jing Sha
Zhichao Sheng
Shijin Wang
Ji-Rong Wen
SyDa
LRM
30
29
0
23 May 2024
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
Ermo Hua
Biqing Qi
Kaiyan Zhang
Yue Yu
Ning Ding
Xingtai Lv
Kai Tian
Bowen Zhou
25
3
0
20 May 2024
RLHF Workflow: From Reward Modeling to Online RLHF
Hanze Dong
Wei Xiong
Bo Pang
Haoxiang Wang
Han Zhao
Yingbo Zhou
Nan Jiang
Doyen Sahoo
Caiming Xiong
Tong Zhang
OffRL
21
92
0
13 May 2024
MAmmoTH2: Scaling Instructions from the Web
Xiang Yue
Tuney Zheng
Ge Zhang
Wenhu Chen
ALM
LRM
35
77
0
06 May 2024
What matters when building vision-language models?
Hugo Laurençon
Léo Tronchon
Matthieu Cord
Victor Sanh
VLM
30
155
0
03 May 2024
Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards
Hyeonbin Hwang
Doyoung Kim
Seungone Kim
Seonghyeon Ye
Minjoon Seo
LRM
ReLM
23
7
0
16 Apr 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset
Ching-An Cheng
Arindam Mitra
Michael Santacroce
Ahmed Hassan Awadallah
Tengyang Xie
141
113
0
04 Apr 2024
Advancing LLM Reasoning Generalists with Preference Trees
Lifan Yuan
Ganqu Cui
Hanbin Wang
Ning Ding
Xingyao Wang
...
Zhenghao Liu
Bowen Zhou
Hao Peng
Zhiyuan Liu
Maosong Sun
LRM
24
94
0
02 Apr 2024