arXiv:2210.09261
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Schärli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
Papers citing
"Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"
50 / 788 papers shown
Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
Stanislas Laborde
Martin Cousseau
Antoun Yaacoub
Lionel Prevost
MQ
9
0
0
12 May 2025
Measuring General Intelligence with Generated Games
Vivek Verma
David Huang
William Chen
Dan Klein
Nicholas Tomlin
ReLM
ELM
LM&MA
LRM
21
0
0
12 May 2025
Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models
Rei Higuchi
Taiji Suzuki
17
0
0
12 May 2025
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection
Kai Hua
Steven Wu
Ge Zhang
Ke Shen
LRM
9
0
0
12 May 2025
xGen-small Technical Report
Erik Nijkamp
Bo Pang
Egor Pakhomov
Akash Gokul
Jin Qu
Silvio Savarese
Yingbo Zhou
Caiming Xiong
LLMAG
36
0
0
10 May 2025
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation
Stefan Vasilev
Christian Herold
Baohao Liao
Seyyed Hadi Hashemi
Shahram Khadivi
Christof Monz
MU
38
0
0
09 May 2025
Stability in Single-Peaked Strategic Resource Selection Games
Henri Zeiler
17
0
0
09 May 2025
LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Language Models against Quantization-induced Risks and Vulnerabilities
Kalyan Nakka
Jimmy Dani
Ausmit Mondal
Nitesh Saxena
AAML
13
0
0
08 May 2025
ICon: In-Context Contribution for Automatic Data Selection
Yixin Yang
Qingxiu Dong
Linli Yao
Fangwei Zhu
Zhifang Sui
38
0
0
08 May 2025
Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data
Y. Wang
Z. Fu
Jie Cai
Peijun Tang
Hongya Lyu
...
Jie Zhou
Guoyang Zeng
Chaojun Xiao
Xu Han
Zhiyuan Liu
39
0
0
08 May 2025
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
Kazuki Fujii
Yukito Tajima
Sakae Mizuki
Hinari Shimada
Taihei Shiotani
...
Kakeru Hattori
Youmi Ma
Hiroya Takamura
Rio Yokota
Naoaki Okazaki
SyDa
45
0
0
05 May 2025
Understanding LLM Scientific Reasoning through Promptings and Model's Explanation on the Answers
Alice Rueda
Mohammed S. Hassan
Argyrios Perivolaris
Bazen G. Teferra
Reza Samavi
...
Y. Wu
Y. Zhang
Bo Cao
Divya Sharma
Sridhar Krishnan Venkat Bhat
ELM
LRM
43
0
0
02 May 2025
TRAVELER: A Benchmark for Evaluating Temporal Reasoning across Vague, Implicit and Explicit References
Svenja Kenneweg
J. Deigmöller
Philipp Cimiano
Julian Eggert
40
0
0
02 May 2025
Thoughts without Thinking: Reconsidering the Explanatory Value of Chain-of-Thought Reasoning in LLMs through Agentic Pipelines
R. Manuvinakurike
Emanuel Moss
E. A. Watkins
Saurav Sahay
G. Raffa
L. Nachman
LRM
19
0
0
01 May 2025
ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
Jingyang Yi
Jiazheng Wang
ReLM
OODD
LRM
58
0
0
30 Apr 2025
Local Prompt Optimization
Yash Jain
Vishal Chowdhary
41
0
0
29 Apr 2025
Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection
Ziqing Fan
Siyuan Du
Shengchao Hu
Pingjie Wang
Li Shen
Y. Zhang
Dacheng Tao
Y. Wang
41
1
0
29 Apr 2025
SAS-Prompt: Large Language Models as Numerical Optimizers for Robot Self-Improvement
H. B. Amor
L. Graesser
Atil Iscen
David B. D'Ambrosio
Saminda Abeyruwan
Alex Bewley
Yifan Zhou
Kamalesh Kalirathinam
Swaroop Mishra
Pannag R. Sanketi
LLMAG
LM&Ro
LRM
94
0
0
29 Apr 2025
Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
Takuya Tamura
Taro Yano
Masafumi Enomoto
M. Oyamada
34
0
0
28 Apr 2025
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges
Y. Li
Qizhi Pei
Mengyuan Sun
Honglin Lin
Chenlin Ming
Xin Gao
Jiang Wu
C. He
Lijun Wu
ELM
LRM
37
0
0
27 Apr 2025
An Empirical Study on Prompt Compression for Large Language Models
Z. Zhang
Jinyi Li
Yihuai Lan
X. Wang
Hao Wang
MQ
37
0
0
24 Apr 2025
MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores
Fengwei Zhou
Jiafei Song
Wenjin Jason Li
Gengjian Xue
Zhikang Zhao
Yichao Lu
Bailin Na
12
0
0
23 Apr 2025
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
Aviv Bick
Eric P. Xing
Albert Gu
RALM
81
0
0
22 Apr 2025
Trillion 7B Technical Report
Sungjun Han
Juyoung Suk
Suyeong An
Hyungguk Kim
Kyuseok Kim
Wonsuk Yang
Seungtaek Choi
Jamin Shin
25
0
0
21 Apr 2025
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
Tong Chen
Faeze Brahman
Jiacheng Liu
Niloofar Mireshghallah
Weijia Shi
Pang Wei Koh
Luke Zettlemoyer
Hannaneh Hajishirzi
28
0
0
20 Apr 2025
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
Yicheng Chen
Yining Li
Kai Hu
Zerun Ma
Haochen Ye
Kai Chen
22
0
0
18 Apr 2025
Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models
Zhouhao Sun
Xiao Ding
Li Du
Yunpeng Xu
Yixuan Ma
Yang Zhao
Bing Qin
Ting Liu
17
0
0
17 Apr 2025
Dynamic Compressing Prompts for Efficient Inference of Large Language Models
Jinwu Hu
W. Zhang
Yufeng Wang
Yu Hu
Bin Xiao
Mingkui Tan
Qing Du
14
0
0
15 Apr 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Z. Liu
Shenglong Ye
...
D. Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
W. Wang
MLLM
VLM
63
6
1
14 Apr 2025
VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
Ryota Tanaka
Taichi Iki
Taku Hasegawa
Kyosuke Nishida
Kuniko Saito
Jun Suzuki
VLM
40
0
0
14 Apr 2025
Can the capability of Large Language Models be described by human ability? A Meta Study
Mingrui Zan
Yunquan Zhang
Boyang Zhang
Fangming Liu
Daning Cheng
ELM
LM&MA
50
0
0
13 Apr 2025
Short-Path Prompting in LLMs: Analyzing Reasoning Instability and Solutions for Robust Performance
Zuoli Tang
Junjie Ou
Kaiqin Hu
Chunwei Wu
Zhaoxin Huan
Chilin Fu
Xiaolu Zhang
Jun Zhou
Chenliang Li
ReLM
LRM
30
0
0
13 Apr 2025
Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance
Ram Mohan Rao Kadiyala
Siddartha Pullakhandam
Siddhant Gupta
Drishti Sharma
Jebish Purbey
Kanwal Mehreen
Muhammad Arham
Hamza Farooq
27
0
0
13 Apr 2025
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
Xin Gao
Qizhi Pei
Zinan Tang
Y. Li
Honglin Lin
Jiang Wu
C. He
Lijun Wu
SyDa
28
0
0
11 Apr 2025
Playpen: An Environment for Exploring Learning Through Conversational Interaction
Nicola Horst
Davide Mazzaccara
Antonia Schmidt
Michael Sullivan
Filippo Momentè
...
Alexander Koller
Oliver Lemon
David Schlangen
Mario Giulianelli
Alessandro Suglia
OffRL
29
0
0
11 Apr 2025
Do Reasoning Models Show Better Verbalized Calibration?
Qingcheng Zeng
Weihao Xuan
Leyang Cui
Rob Voigt
LRM
18
0
0
09 Apr 2025
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning
Atharva Pandey
Kshitij Dubey
Rahul Sharma
Amit Sharma
ReLM
ELM
LRM
49
0
0
09 Apr 2025
SEA-LION: Southeast Asian Languages in One Network
Raymond Ng
Thanh Ngan Nguyen
Yuli Huang
Ngee Chia Tai
Wai Yi Leong
...
David Ong Tat-Wee
B. Liu
William-Chandra Tjhi
Erik Cambria
Leslie Teo
28
11
0
08 Apr 2025
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
Gleb Rodionov
Roman Garipov
Alina Shutova
George Yakushev
Vage Egiazarian
Anton Sinitsin
Denis Kuznedelev
Dan Alistarh
LRM
27
1
0
08 Apr 2025
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang
Fedor Moiseev
Joshua Ainslie
Paul Suganthan
Min Ma
Surya Bhupatiraju
Fede Lebron
Orhan Firat
Armand Joulin
Zhe Dong
AI4CE
21
0
0
08 Apr 2025
GREATERPROMPT: A Unified, Customizable, and High-Performing Open-Source Toolkit for Prompt Optimization
Wenliang Zheng
Sarkar Snigdha Sarathi Das
Yusen Zhang
Rui Zhang
23
0
0
04 Apr 2025
Universal Collection of Euclidean Invariants between Pairs of Position-Orientations
Gijs Bellaard
B. Smets
R. Duits
51
0
0
04 Apr 2025
When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
Nan Zhang
Yusen Zhang
Prasenjit Mitra
Rui Zhang
MQ
LRM
44
2
0
02 Apr 2025
Representation Bending for Large Language Model Safety
Ashkan Yousefpour
Taeheon Kim
Ryan S. Kwon
Seungbeen Lee
Wonje Jeung
Seungju Han
Alvin Wan
Harrison Ngan
Youngjae Yu
Jonghyun Choi
AAML
ALM
KELM
52
0
0
02 Apr 2025
TransforMerger: Transformer-based Voice-Gesture Fusion for Robust Human-Robot Communication
Petr Vanc
Karla Stepanova
30
0
0
02 Apr 2025
AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems
Y. Yang
Huacan Chai
Shuai Shao
Y. Song
Siyuan Qi
Renting Rui
Weinan Zhang
AIFin
38
0
0
01 Apr 2025
Zero-shot Benchmarking: A Framework for Flexible and Scalable Automatic Evaluation of Language Models
José P. Pombal
Nuno M. Guerreiro
Ricardo Rei
André F. T. Martins
ALM
61
0
0
01 Apr 2025
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
Kai Yan
Yufei Xu
Zhengyin Du
Xuesong Yao
Z. Wang
Xiaowen Guo
Jiecao Chen
ReLM
ELM
LRM
87
3
0
01 Apr 2025
Efficient Inference for Large Reasoning Models: A Survey
Y. Liu
Jiaying Wu
Yufei He
Hongcheng Gao
Hongyu Chen
Baolong Bi
Jiaheng Zhang
Zhiqi Huang
Bryan Hooi
LLMAG
LRM
58
7
0
29 Mar 2025
SWI: Speaking with Intent in Large Language Models
Yuwei Yin
EunJeong Hwang
Giuseppe Carenini
LRM
44
0
0
27 Mar 2025