Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

Annual Meeting of the Association for Computational Linguistics (ACL), 2022
17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM, ELM, LRM, ReLM

Papers citing "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"

50 / 1,103 papers shown
UniAPO: Unified Multimodal Automated Prompt Optimization
Qipeng Zhu
Yanzhe Chen
Huasong Zhong
Yan Li
Jie Chen
Zhixin Zhang
Junping Zhang
Zhenheng Yang
LLMAG
143
1
0
25 Aug 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Weiyun Wang
Zhangwei Gao
Lixin Gu
Hengjun Pu
Long Cui
...
Bowen Zhou
Kai Chen
Yu Qiao
Wenhai Wang
Gen Luo
MLLM, LRM
305
279
0
25 Aug 2025
LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions
Maojia Song
Tej Deep Pala
Weisheng Jin
Amir Zadeh
Chuan Li
Dorien Herremans
Soujanya Poria
LLMAG
172
3
0
24 Aug 2025
CYCLE-INSTRUCT: Fully Seed-Free Instruction Tuning via Dual Self-Training and Cycle Consistency
Zhanming Shen
Hao Chen
Yulei Tang
Shaolin Zhu
Wentao Ye
Xiaomeng Hu
Haobo Wang
Gang Chen
Junbo Zhao
SyDa, ALM
109
1
0
22 Aug 2025
Systematic Characterization of LLM Quantization: A Performance, Energy, and Quality Perspective
Tianyao Shi
Yi Ding
MQ
133
3
0
22 Aug 2025
Dream 7B: Diffusion Large Language Models
Jiacheng Ye
Zhihui Xie
Lin Zheng
Lei Li
Zirui Wu
Xin Jiang
Zhenguo Li
Lingpeng Kong
DiffM, VLM
1.0K
110
0
21 Aug 2025
Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis
Yufeng Zhao
Junnan Liu
Hongwei Liu
D. Zhu
Yuan Shen
Songyang Zhang
Kai Chen
LRM
133
0
0
21 Aug 2025
In-Context Iterative Policy Improvement for Dynamic Manipulation
Mark Van der Merwe
Devesh Jha
LM&Ro, OffRL, LRM
131
0
0
20 Aug 2025
ZigzagAttention: Efficient Long-Context Inference with Exclusive Retrieval and Streaming Heads
Zhuorui Liu
Chen Zhang
Dawei Song
62
2
0
17 Aug 2025
ReaLM: Reflection-Enhanced Autonomous Reasoning with Small Language Models
Yuanfeng Xu
Zehui Dai
Jian Liang
Jiapeng Guan
Guangrun Wang
Liang Lin
Xiaohui Lv
LLMAG, LRM
140
0
0
17 Aug 2025
Hard Examples Are All You Need: Maximizing GRPO Post-Training Under Annotation Budgets
Benjamin Pikus
Pratyush Ranjan Tiwari
Burton Ye
307
5
0
15 Aug 2025
Slow Tuning and Low-Entropy Masking for Safe Chain-of-Thought Distillation
Ziyang Ma
Qingyue Yuan
Linhai Zhang
Deyu Zhou
LRM
123
2
0
13 Aug 2025
mSCoRe: a $M$ultilingual and Scalable Benchmark for $S$kill-based $Co$mmonsense $Re$asoning
Nghia Trung Ngo
Franck Dernoncourt
T. Nguyen
LRM
190
0
0
13 Aug 2025
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments
Junjie Ye
C. Jiang
Zhengyin Du
Yufei Xu
Xuesong Yao
...
Xiaoran Fan
Qi Zhang
Tao Gui
Xuanjing Huang
Jiecao Chen
KELM, OffRL
176
4
0
12 Aug 2025
GreenTEA: Gradient Descent with Topic-modeling and Evolutionary Auto-prompting
Zheng Dong
Luming Shang
Gabriela Olinto
107
0
0
12 Aug 2025
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
Haoyuan Wu
Haoxing Chen
Xiaodong Chen
Zhanchao Zhou
Tieyuan Chen
...
Junbo Zhao
Lin Liu
Zhenzhong Lan
Bei Yu
Jianguo Li
MoE
137
4
0
11 Aug 2025
DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment
S. Kwon
Seong Hoon Seo
Jae W. Lee
Yeonhong Park
MQ
262
1
0
08 Aug 2025
TASE: Token Awareness and Structured Evaluation for Multilingual Language Models
Chenzhuo Zhao
Xinda Wang
Yue Huang
Junting Lu
Ziqian Liu
LRM
115
1
0
07 Aug 2025
Align, Don't Divide: Revisiting the LoRA Architecture in Multi-Task Learning
Jinda Liu
Bo Cheng
Yi-Ju Chang
Yuan Wu
MoMe
83
0
0
07 Aug 2025
Bench-2-CoP: Can We Trust Benchmarking for EU AI Compliance?
Matteo Prandi
Vincenzo Suriani
Federico Pierucci
Marcello Galisai
Daniele Nardi
Piercosma Bisconti
ELM
114
0
0
07 Aug 2025
R-Zero: Self-Evolving Reasoning LLM from Zero Data
Chengsong Huang
Wenhao Yu
Xiaoyang Wang
H. Zhang
Zongxia Li
Ruosen Li
J. Huang
Haitao Mi
Dong Yu
ReLM, SyDa, LRM
233
47
0
07 Aug 2025
IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards
Xu Guo
Tianyi Liang
Tong Jian
Xiaogui Yang
Ling-I Wu
Chenhui Li
Z. Lu
Qipeng Guo
Kai Chen
276
2
0
06 Aug 2025
Tensorized Clustered LoRA Merging for Multi-Task Interference
Zhan Su
Fengran Mo
G. Liang
Jinghan Zhang
Bingbing Wen
Prayag Tiwari
Jian-Yun Nie
MoMe
182
0
0
06 Aug 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
Shudong Liu
Hongwei Liu
Junnan Liu
Linchen Xiao
Songyang Gao
...
Yuzhe Gu
Wenwei Zhang
Yang Li
Songyang Zhang
Kai Chen
151
17
0
05 Aug 2025
ProCut: LLM Prompt Compression via Attribution Estimation
Zhentao Xu
Fengyi Li
Albert Chen
Xiaofeng Wang
179
1
0
04 Aug 2025
SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference
Yi Zhao
Yajuan Peng
Cam-Tu Nguyen
Zuchao Li
Xiaoliang Wang
Hai Zhao
Xiaoming Fu
215
2
0
03 Aug 2025
LinkQA: Synthesizing Diverse QA from Multiple Seeds Strongly Linked by Knowledge Points
Xuemiao Zhang
Can Ren
Chengying Tu
Rongxiang Weng
Hongfei Yan
Jingang Wang
Xunliang Cai
212
2
0
02 Aug 2025
Large-Scale Diverse Synthesis for Mid-Training
Xuemiao Zhang
Chengying Tu
Can Ren
Rongxiang Weng
Hongfei Yan
Jingang Wang
Xunliang Cai
SyDa
151
3
0
02 Aug 2025
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
Sajana Weerawardhena
Paul Kassianik
Blaine Nelson
Baturay Saglam
Anu Vellore
...
Dhruv Kedia
Kojin Oshiba
Zhouran Yang
Yaron Singer
Amin Karbasi
ALM, ELM
185
4
0
01 Aug 2025
Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models
Jinsong Li
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Jiaqi Wang
Dahua Lin
DiffM
170
13
0
01 Aug 2025
Learning Like Humans: Resource-Efficient Federated Fine-Tuning through Cognitive Developmental Stages
Yebo Wu
Jingguang Li
Zhijiang Guo
Li Li
184
4
0
31 Jul 2025
DynaSwarm: Dynamically Graph Structure Selection for LLM-based Multi-agent System
Hui Yi Leong
Yuqing Wu
169
0
0
31 Jul 2025
Doctor Sun: A Bilingual Multimodal Large Language Model for Biomedical AI
Dong Xue
Ziyao Shao
Zhaoyang Duan
Fangzhou Liu
Bing Li
Zhongheng Zhang
LM&MA
328
0
0
30 Jul 2025
Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead
Tom Sühr
Florian E. Dorner
Olawale Salaudeen
Augustin Kelava
Samira Samadi
ALM, ELM
169
2
0
30 Jul 2025
Kimi K2: Open Agentic Intelligence
Kimi Team
Yifan Bai
Yiping Bao
Guanduo Chen
Jiahao Chen
...
Qifeng Teng
Chensi Wang
Dinglu Wang
Feng Wang
Haiming Wang
MoE, VLM, LRM
179
81
0
28 Jul 2025
Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
Shengyuan Wang
J. Feng
Tianhui Liu
Dan Pei
Yong Li
HILM
171
0
0
25 Jul 2025
Towards Effective Human-in-the-Loop Assistive AI Agents
Filippos Bellos
Yayuan Li
Cary Shu
Ruey Day
J. Siskind
Jason J. Corso
169
2
0
24 Jul 2025
Technical Report of TeleChat2, TeleChat2.5 and T1
Zihan Wang
Xinzhang Liu
Yitong Yao
Chao Wang
Yu Zhao
...
Bingkai Yang
Shuangyong Song
Yongxiang Li
Zhongjiang He
Xuelong Li
AI4TS, LRM
428
6
0
24 Jul 2025
Innovator: Scientific Continued Pretraining with Fine-grained MoE Upcycling
Ning Liao
Xiaoxing Wang
Peng Liu
Weiyang Guo
Feng Hong
...
Junchi Yan
Zhiyu Li
Feiyu Xiong
Yanfeng Wang
Linfeng Zhang
CLL
243
1
0
24 Jul 2025
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models
Changxin Tian
Kunlong Chen
Jia-Ling Liu
Ziqi Liu
Zhiqiang Zhang
Jun Zhou
MoE
386
12
0
23 Jul 2025
WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training
Changxin Tian
Jiapeng Wang
Qian Zhao
Kunlong Chen
Jia-Ling Liu
Ziqi Liu
Jiaxin Mao
Wayne Xin Zhao
Zhiqiang Zhang
Jun Zhou
MoMe, CLL
259
6
0
23 Jul 2025
Are LLM Belief Updates Consistent with Bayes' Theorem?
Sohaib Imran
Ihor Kendiukhov
Matthew Broerman
Aditya Thomas
Riccardo Campanella
Rob Lamb
Peter M. Atkinson
168
3
0
23 Jul 2025
Towards Compute-Optimal Many-Shot In-Context Learning
Shahriar Golchin
Yanfei Chen
Rujun Han
Manan Gandhi
Tianli Yu
Swaroop Mishra
Mihai Surdeanu
Rishabh Agarwal
Chen-Yu Lee
Tomas Pfister
213
0
0
22 Jul 2025
A Unifying Scheme for Extractive Content Selection Tasks
Shmuel Amar
Ori Shapira
Aviv Slobodkin
Ido Dagan
144
0
0
22 Jul 2025
Metric assessment protocol in the context of answer fluctuation on MCQ tasks
Ekaterina Goliakova
X. Renard
Marie-Jeanne Lesot
Thibault Laugel
Christophe Marsala
Marcin Detyniecki
129
0
0
21 Jul 2025
Quantum Machine Learning in Multi-Qubit Phase-Space Part I: Foundations
Timothy Heightman
Edward Jiang
Ruth Mora-Soto
Maciej Lewenstein
Marcin Płodzień
315
4
0
16 Jul 2025
A Survey of Deep Learning for Geometry Problem Solving
Jianzhe Ma
Wenxuan Wang
Qin Jin
444
2
0
16 Jul 2025
Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning
Zijun Chen
Wenbo Hu
Richang Hong
LRM
157
0
0
14 Jul 2025
RedOne: Revealing Domain-specific LLM Post-Training in Social Networking Services
Fei Zhao
Chonggang Lu
Yue Wang
Zheyong Xie
Ziyan Liu
...
Jun Fan
Xiaolong Jiang
Weiting Liu
Boyang Wang
Shaosheng Cao
ALM
219
0
0
13 Jul 2025
DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models
Cathy Jiao
Yijun Pan
Emily Xiao
Daisy Sheng
Niket Jain
H. C. Zhao
Ishita Dasgupta
Jiaqi W. Ma
Chenyan Xiong
216
0
0
12 Jul 2025