2402.14660
Cited By
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models
22 February 2024
Yanan Wu
Jie Liu
Xingyuan Bu
Jiaheng Liu
Zhanhui Zhou
Yuanxing Zhang
Chenchen Zhang
Zhiqi Bai
Haibin Chen
Tiezheng Ge
Wanli Ouyang
Wenbo Su
Bo Zheng
LRM
Papers citing
"ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models"
9 / 9 papers shown
Title
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
Yinghui Li
Jiayi Kuang
Haojing Huang
Zhikun Xu
Xinnian Liang
...
Xiaoyu Tan
C. Qu
Ying Shen
Hai-Tao Zheng
Philip S. Yu
LRM
41
3
0
12 Feb 2025
End-to-End Bangla AI for Solving Math Olympiad Problem Benchmark: Leveraging Large Language Model Using Integrated Approach
H.M. Shadman Tabib
Jaber Ahmed Deedar
LRM
24
0
0
08 Jan 2025
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang
Tianhao Cheng
J.K. Liu
Jiaran Hao
L. Song
...
Ge Zhang
Zili Wang
Yuan Qi
Yinghui Xu
Wei Chu
ALM
59
16
0
07 Nov 2024
DDK: Distilling Domain Knowledge for Efficient Large Language Models
Jiaheng Liu
Chenchen Zhang
Jinyang Guo
Yuanxing Zhang
Haoran Que
...
Congnan Liu
Wenbo Su
Jiamang Wang
Lin Qu
Bo Zheng
36
3
0
23 Jul 2024
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
Zhanhui Zhou
Zhixuan Liu
Jie Liu
Zhichen Dong
Chao Yang
Yu Qiao
ALM
30
20
0
29 May 2024
Theoretical Analysis of Meta Reinforcement Learning: Generalization Bounds and Convergence Guarantees
Cangqing Wang
Mingxiu Sui
Dan Sun
Zecheng Zhang
Yan Zhou
23
26
0
22 May 2024
OWL: A Large Language Model for IT Operations
Hongcheng Guo
Jian Yang
Jiaheng Liu
Liqun Yang
Linzheng Chai
...
Tieqiao Zheng
Liangfan Zheng
Bo-Wen Zhang
Ke Xu
Zhoujun Li
VLM
52
40
0
17 Sep 2023
LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation
Hongcheng Guo
Jiaheng Liu
Haoyang Huang
Jian Yang
Zhoujun Li
Dongdong Zhang
Zheng Cui
Furu Wei
27
22
0
19 Oct 2022
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018