2504.15777
Cited By
Tina: Tiny Reasoning Models via LoRA
22 April 2025
Shangshang Wang
Julian Asilis
Ömer Faruk Akgül
Enes Burak Bilgin
Ollie Liu
Willie Neiswanger
OffRL
LRM
HuggingFace (55 upvotes)
Papers citing
"Tina: Tiny Reasoning Models via LoRA"
15 / 15 papers shown
ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning
Zihao Feng
Xiaoxue Wang
Bowen Wu
Hailong Cao
Tiejun Zhao
Qun Yu
Baoxun Wang
OffRL
61
0
0
18 Sep 2025
Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models
Datta Nimmaturi
Vaishnavi Bhargava
Rajat Ghosh
Johnu George
Debojyoti Dutta
LRM
110
2
0
24 Jul 2025
The Impact of Language Mixing on Bilingual LLM Reasoning
Yihao Li
Jiayi Xin
Miranda Muqing Miao
Qi Long
Lyle Ungar
LRM
191
3
0
21 Jul 2025
Sharp Generalization Bounds for Foundation Models with Asymmetric Randomized Low-Rank Adapters
Anastasis Kratsios
Tin Sum Cheng
Aurelien Lucchi
Haitz Sáez de Ocáriz Borde
190
1
0
17 Jun 2025
Mitigating Spurious Correlations in LLMs via Causality-Aware Post-Training
Shurui Gui
Shuiwang Ji
LRM
226
2
0
11 Jun 2025
RECIPE-TKG: From Sparse History to Structured Reasoning for LLM-based Temporal Knowledge Graph Completion
Ömer Faruk Akgül
Feiyu Zhu
Yuxin Yang
Rajgopal Kannan
Viktor Prasanna
196
0
0
23 May 2025
Get Experience from Practice: LLM Agents with Record & Replay
Erhu Feng
Wenbo Zhou
Zibin Liu
Le Chen
Yunpeng Dong
...
Yisheng Zhao
Dong Du
Zhichao Hua
Yubin Xia
Haibo Chen
355
6
0
23 May 2025
The Hallucination Tax of Reinforcement Finetuning
Linxin Song
Taiwei Shi
Jieyu Zhao
HILM
LRM
250
11
0
20 May 2025
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
Sagnik Mukherjee
Lifan Yuan
Dilek Hakkani-Tur
Yuan Yao
227
13
0
16 May 2025
Crosslingual Reasoning through Test-Time Scaling
Zheng-Xin Yong
Muhammad Farid Adilazuarda
Jonibek Mansurov
Ruochen Zhang
Niklas Muennighoff
Carsten Eickhoff
Genta Indra Winata
Julia Kreutzer
Stephen H. Bach
Alham Fikri Aji
LRM
ELM
933
26
0
08 May 2025
Activated LoRA: Fine-tuned LLMs for Intrinsics
Kristjan Greenewald
Luis A. Lastras
Thomas Parnell
Vraj Shah
Lucian Popa
Giulio Zizzo
Chulaka Gunasekara
Ambrish Rawat
David D. Cox
453
0
0
16 Apr 2025
Understanding R1-Zero-Like Training: A Critical Perspective
Zichen Liu
Changyu Chen
Wenjun Li
Penghui Qi
Tianyu Pang
Chao Du
Wee Sun Lee
Jialin Li
OffRL
LRM
474
546
0
26 Mar 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
562
328
0
24 Mar 2025
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Quy-Anh Dang
Chris Ngo
OffRL
LRM
297
42
0
20 Mar 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
1.2K
5,215
0
22 Jan 2025