ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.22375
  4. Cited By
Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition

Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition

28 May 2025
Hanting Chen
Yasheng Wang
Kai Han
Dong Li
Lin Li
Zhenni Bi
Jinpeng Li
Haoyu Wang
Fei Mi
Mingjian Zhu
Bin Wang
Kaikai Song
Yifei Fu
Xu He
Yu-Juan Luo
Chong Zhu
Quan He
Xueyu Wu
Wei He
Hailin Hu
Yehui Tang
Dacheng Tao
Xinghao Chen
Yunhe Wang
    LRM
ArXivPDFHTML

Papers citing "Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition"

42 / 42 papers shown
Title
AdaptThink: Reasoning Models Can Learn When to Think
AdaptThink: Reasoning Models Can Learn When to Think
Jing Zhang
Nianyi Lin
Lei Hou
Ling Feng
Juanzi Li
OffRL
LRM
50
3
0
19 May 2025
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL
Songjun Tu
Jiahao Lin
Qichao Zhang
Xiangyu Tian
Linjing Li
Xiangyuan Lan
Dongbin Zhao
OffRL
ReLM
LRM
66
2
0
16 May 2025
Qwen3 Technical Report
Qwen3 Technical Report
An Yang
A. Li
Baosong Yang
Beichen Zhang
Binyuan Hui
...
Zekun Wang
Zeyu Cui
Zhenru Zhang
Zhenhong Zhou
Zihan Qiu
LLMAG
OSLM
LRM
81
35
0
14 May 2025
Fast-Slow Thinking for Large Vision-Language Model Reasoning
Fast-Slow Thinking for Large Vision-Language Model Reasoning
W. L. Xiao
Leilei Gan
Weilong Dai
Wanggui He
Ziwei Huang
...
Fangxun Shu
Zhelun Yu
Peng Zhang
Hao Jiang
Leilei Gan
ReLM
LRM
AI4CE
391
7
0
25 Apr 2025
Efficient Reasoning Models: A Survey
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
289
10
0
15 Apr 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
120
85
0
24 Mar 2025
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu
Zheng Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
...
Ya Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
OffRL
LRM
129
131
0
18 Mar 2025
Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models
Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models
Huifeng Yin
Yu Zhao
Mingyang Wu
Xuanfan Ni
Bo Zeng
...
Liangying Shao
Chenyang Lyu
Longyue Wang
Weihua Luo
Kaifu Zhang
LRM
90
4
0
03 Mar 2025
Small Models Struggle to Learn from Strong Reasoners
Small Models Struggle to Learn from Strong Reasoners
Yuetai Li
Xiang Yue
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Bill Yuchen Lin
Bhaskar Ramasubramanian
Radha Poovendran
LRM
84
22
0
17 Feb 2025
LIMO: Less is More for Reasoning
LIMO: Less is More for Reasoning
Yixin Ye
Zhen Huang
Yang Xiao
Ethan Chern
Shijie Xia
Pengfei Liu
LRM
115
132
0
05 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
303
1,503
0
22 Jan 2025
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
Haotian Luo
Li Shen
Haiying He
Yun Wang
Shiwei Liu
Wei Li
Naiqiang Tan
Xiaochun Cao
Dacheng Tao
VLM
LRM
142
77
0
22 Jan 2025
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple
  Distillation, Big Progress or Bitter Lesson?
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?
Zhen Huang
Haoyang Zou
Xuefeng Li
Yixiu Liu
Yuxiang Zheng
Ethan Chern
Shijie Xia
Yiwei Qin
Weizhe Yuan
Pengfei Liu
VLM
82
47
0
25 Nov 2024
Fast Best-of-N Decoding via Speculative Rejection
Fast Best-of-N Decoding via Speculative Rejection
Hanshi Sun
Momin Haider
Ruiqi Zhang
Huitao Yang
Jiahao Qiu
Ming Yin
Mengdi Wang
Peter L. Bartlett
Andrea Zanette
BDL
66
39
0
26 Oct 2024
HybridFlow: A Flexible and Efficient RLHF Framework
HybridFlow: A Flexible and Efficient RLHF Framework
Guangming Sheng
Chi Zhang
Zilingfeng Ye
Xibin Wu
Wang Zhang
Ru Zhang
Size Zheng
Haibin Lin
Chuan Wu
AI4CE
94
171
0
28 Sep 2024
LLM Pruning and Distillation in Practice: The Minitron Approach
LLM Pruning and Distillation in Practice: The Minitron Approach
Sharath Turuvekere Sreenivas
Saurav Muralidharan
Raviraj Joshi
Marcin Chochowski
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
Jan Kautz
Pavlo Molchanov
51
31
0
21 Aug 2024
Entropy Law: The Story Behind Data Compression and LLM Performance
Entropy Law: The Story Behind Data Compression and LLM Performance
Mingjia Yin
Chuhan Wu
Yufei Wang
Hao Wang
Wei Guo
Yasheng Wang
Yong Liu
Ruiming Tang
Defu Lian
Enhong Chen
77
23
0
09 Jul 2024
Distilling System 2 into System 1
Distilling System 2 into System 1
Ping Yu
Jing Xu
Jason Weston
Ilia Kulikov
OffRL
LRM
93
76
0
08 Jul 2024
Survey on Knowledge Distillation for Large Language Models: Methods,
  Evaluation, and Application
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
Chuanpeng Yang
Wang Lu
Yao Zhu
Yidong Wang
Qian Chen
Chenlong Gao
Bingjie Yan
Yiqiang Chen
ALM
KELM
86
26
0
02 Jul 2024
Are We Done with MMLU?
Are We Done with MMLU?
Aryo Pradipta Gema
Joshua Ong Jun Leang
Giwon Hong
Alessio Devoto
Alberto Carlo Maria Mancino
...
R. McHardy
Joshua Harris
Jean Kaddour
Emile van Krieken
Pasquale Minervini
ELM
104
39
0
06 Jun 2024
LiveCodeBench: Holistic and Contamination Free Evaluation of Large
  Language Models for Code
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain
King Han
Alex Gu
Wen-Ding Li
Fanjia Yan
Tianjun Zhang
Sida I. Wang
Armando Solar-Lezama
Koushik Sen
Ion Stoica
ELM
79
365
0
12 Mar 2024
Learning or Self-aligning? Rethinking Instruction Fine-tuning
Learning or Self-aligning? Rethinking Instruction Fine-tuning
Mengjie Ren
Boxi Cao
Hongyu Lin
Liu Cao
Xianpei Han
Ke Zeng
Guanglu Wan
Xunliang Cai
Le Sun
48
28
0
28 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELM
VLM
100
113
0
20 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open
  Language Models
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
94
953
0
05 Feb 2024
PanGu-$π$: Enhancing Language Model Architectures via Nonlinearity
  Compensation
PanGu-πππ: Enhancing Language Model Architectures via Nonlinearity Compensation
Yunhe Wang
Hanting Chen
Yehui Tang
Tianyu Guo
Kai Han
...
Qinghua Xu
Qun Liu
Jun Yao
Chao Xu
Dacheng Tao
83
19
0
27 Dec 2023
What Makes Good Data for Alignment? A Comprehensive Study of Automatic
  Data Selection in Instruction Tuning
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
ALM
80
231
0
25 Dec 2023
MoDS: Model-oriented Data Selection for Instruction Tuning
MoDS: Model-oriented Data Selection for Instruction Tuning
Qianlong Du
Chengqing Zong
Jiajun Zhang
ALM
69
83
0
27 Nov 2023
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
David Rein
Betty Li Hou
Asa Cooper Stickland
Jackson Petty
Richard Yuanzhe Pang
Julien Dirani
Julian Michael
Samuel R. Bowman
AI4MH
ELM
72
627
0
20 Nov 2023
LawBench: Benchmarking Legal Knowledge of Large Language Models
LawBench: Benchmarking Legal Knowledge of Large Language Models
Zhiwei Fei
Xiaoyu Shen
D. Zhu
Fengzhe Zhou
Zhuo Han
Songyang Zhang
Kai-xiang Chen
Zongwen Shen
Jidong Ge
ELM
AILaw
71
42
0
28 Sep 2023
Efficient Memory Management for Large Language Model Serving with
  PagedAttention
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
136
2,049
0
12 Sep 2023
Code Llama: Open Foundation Models for Code
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
84
1,990
0
24 Aug 2023
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data
  Selection for Instruction Tuning
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning
Ming Li
Yong Zhang
Zhitao Li
Jiuhai Chen
Lichang Chen
Ning Cheng
Jianzong Wang
Dinesh Manocha
Jing Xiao
102
194
0
23 Aug 2023
A Survey on Large Language Model based Autonomous Agents
A Survey on Large Language Model based Autonomous Agents
Lei Wang
Chengbang Ma
Xueyang Feng
Zeyu Zhang
Hao-ran Yang
...
Xu Chen
Yankai Lin
Wayne Xin Zhao
Zhewei Wei
Ji-Rong Wen
LLMAG
AI4CE
LM&Ro
69
1,225
0
22 Aug 2023
Let's Verify Step by Step
Let's Verify Step by Step
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALM
OffRL
LRM
122
1,044
0
31 May 2023
LIMA: Less Is More for Alignment
LIMA: Less Is More for Alignment
Chunting Zhou
Pengfei Liu
Puxin Xu
Srini Iyer
Jiao Sun
...
Susan Zhang
Gargi Ghosh
M. Lewis
Luke Zettlemoyer
Omer Levy
ALM
73
819
0
18 May 2023
GPT-4 Technical Report
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
832
13,788
0
15 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
938
12,840
0
27 Feb 2023
Teaching Small Language Models to Reason
Teaching Small Language Models to Reason
Lucie Charlotte Magister
Jonathan Mallinson
Jakub Adamek
Eric Malmi
Aliaksei Severyn
LRM
AI4CE
ReLM
148
258
0
16 Dec 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning
  from Human Feedback
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
224
2,457
0
12 Apr 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
726
12,525
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
616
9,009
0
28 Jan 2022
Proximal Policy Optimization Algorithms
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
285
18,685
0
20 Jul 2017
1