arXiv: 2210.09261
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

Annual Meeting of the Association for Computational Linguistics (ACL), 2022
17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Schärli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
    ALM, ELM, LRM, ReLM

Papers citing "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"

50 / 1,089 papers shown
MeTA-LoRA: Data-Efficient Multi-Task Fine-Tuning for Large Language Models
Bo Cheng
Xu Wang
Jinda Liu
Yi-Ju Chang
Yuan Wu
MoE, ALM
124
0
0
13 Oct 2025
APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport
Z. Li
Yuege Feng
Dandan Guo
Jinpeng Hu
Anningzhe Gao
Xiang Wan
76
0
0
13 Oct 2025
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
A. Zebaze
Rachel Bawden
Benoît Sagot
LRM
80
1
0
13 Oct 2025
Active Model Selection for Large Language Models
Yavuz Durmazkeser
Patrik Okanovic
Andreas Kirsch
Torsten Hoefler
Nezihe Merve Gürel
92
0
0
10 Oct 2025
ProxRouter: Proximity-Weighted LLM Query Routing for Improved Robustness to Outliers
Shivam Patel
Neharika Jali
Ankur Mallick
Gauri Joshi
84
0
0
10 Oct 2025
Don't Throw Away Your Pretrained Model
Shangbin Feng
Wenhao Yu
Yike Wang
Hongming Zhang
Yulia Tsvetkov
Dong Yu
MoMe
158
0
0
10 Oct 2025
How Reliable is Language Model Micro-Benchmarking?
Gregory Yauney
Shahzaib Saqib Warraich
Swabha Swayamdipta
ALM
112
0
0
09 Oct 2025
Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models
Hyeonseok Moon
Seongtae Hong
Jaehyung Seo
Heuiseok Lim
ALM
108
0
0
09 Oct 2025
Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts
Yeskendir Koishekenov
Aldo Lipani
Nicola Cancedda
LRM
86
2
0
08 Oct 2025
Auto-Stega: An Agent-Driven System for Lifelong Strategy Evolution in LLM-Based Text Steganography
Jiuan Zhou
Yu Cheng
Yuan Xie
Z. Yin
86
2
0
08 Oct 2025
MASA: Rethinking the Representational Bottleneck in LoRA with Multi-A Shared Adaptation
Qin Dong
Yuntian Tang
Heming Jia
Yunhang Shen
Bohan Jia
Wenxuan Huang
Lianyue Zhang
Jiao Xie
Shaohui Lin
MoE
68
0
0
07 Oct 2025
Sample Smart, Not Hard: Correctness-First Decoding for Better Reasoning in LLMs
Xueyan Li
Guinan Su
Mrinmaya Sachan
Jonas Geiping
LRM
61
0
0
07 Oct 2025
ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization
Lawrence Liu
Alexander Liu
Mengdi Wang
T. Zhao
Lin F. Yang
96
0
0
07 Oct 2025
TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA
Chanjoo Jung
Jaehyung Kim
116
0
0
06 Oct 2025
Mitigating Diffusion Model Hallucinations with Dynamic Guidance
Kostas Triaridis
Alexandros Graikos
Aggelina Chatziagapi
Grigorios G. Chrysos
Dimitris Samaras
DiffM
78
0
0
06 Oct 2025
GRACE: Generative Representation Learning via Contrastive Policy Optimization
Jiashuo Sun
Shixuan Liu
Zhaochen Su
Xianrui Zhong
Pengcheng Jiang
Sara Szymkuć
Peiran Li
Weijia Shi
Jiawei Han
82
0
0
06 Oct 2025
Finish First, Perfect Later: Test-Time Token-Level Cross-Validation for Diffusion Large Language Models
Runchu Tian
Junxia Cui
Xueqiang Xu
Feng Yao
Jingbo Shang
113
1
0
06 Oct 2025
The End of Transformers? On Challenging Attention and the Rise of Sub-Quadratic Architectures
Alexander Fichtl
Jeremias Bohn
Josefin Kelber
Edoardo Mosca
Georg Groh
88
0
0
06 Oct 2025
Increasing LLM response trustworthiness using voting ensembles
Aparna Nair-Kanneganti
Trevor J. Chan
Shir Goldfinger
Emily Mackay
Brian Anthony
Alison M. Pouch
99
0
0
05 Oct 2025
Self-Anchor: Large Language Model Reasoning via Step-by-step Attention Alignment
Hongxiang Zhang
Yuan Tian
Tianyi Zhang
LRM
74
1
0
03 Oct 2025
Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance
Ahmed Alajrami
Xingwei Tan
Nikolaos Aletras
110
0
0
03 Oct 2025
Think Right: Learning to Mitigate Under-Over Thinking via Adaptive, Attentive Compression
Joykirat Singh
Justin Chih-Yao Chen
Archiki Prasad
Elias Stengel-Eskin
A. Nambi
Mohit Bansal
OffRL, LRM
100
0
0
02 Oct 2025
A-VERT: Agnostic Verification with Embedding Ranking Targets
Nicolás Aguirre
Ramiro Caso
Ramiro Rodríguez Colmeiro
Mauro Santelli
Joaquín Toranzo Calderón
60
0
0
01 Oct 2025
Safety Instincts: LLMs Learn to Trust Their Internal Compass for Self-Defense
Guobin Shen
Dongcheng Zhao
Haibo Tong
Jindong Li
Feifei Zhao
Yi Zeng
80
0
0
01 Oct 2025
Uncovering the Computational Ingredients of Human-Like Representations in LLMs
Zach Studdiford
Timothy T. Rogers
Kushin Mukherjee
Siddharth Suresh
140
0
0
01 Oct 2025
MADS: Multi-Agent Dialogue Simulation for Diverse Persuasion Data Generation
Mingjin Li
Yu Liu
Huayi Liu
Xiang Ye
Chao Jiang
Hongguang Zhang
Yu Ruan
168
2
0
30 Sep 2025
Collaborative Compression for Large-Scale MoE Deployment on Edge
Yixiao Chen
Yanyue Xie
Ruining Yang
Wei Jiang
Wei Wang
Yong He
Yue Chen
Pu Zhao
Y. Wang
MQ
60
0
0
30 Sep 2025
AIMCoT: Active Information-driven Multimodal Chain-of-Thought for Vision-Language Reasoning
Xiping Li
Jianghong Ma
LRM
65
0
0
30 Sep 2025
The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks
Arda Uzunoglu
Tianjian Li
Daniel Khashabi
116
0
0
30 Sep 2025
Nudging the Boundaries of LLM Reasoning
Justin Chih-Yao Chen
Becky Xiangyu Peng
Prafulla Kumar Choubey
Kung-Hsiang Huang
Jiaxin Zhang
Mohit Bansal
Chien-Sheng Wu
LRM
84
0
0
30 Sep 2025
InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation
Weilin Zhao
Z. Zhou
Zhou Su
Chaojun Xiao
Yuxuan Li
...
Ruoyao Xiao
Yuxiang Huang
Ao Sun
Xu Han
Zhiyuan Liu
VLM
139
4
0
29 Sep 2025
SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models
Jun Rao
Yunjie Liao
Xuebo Liu
Zepeng Lin
Lian Lian
Dong Jin
Shengjun Cheng
Jun-chen Yu
Min Zhang
104
0
0
29 Sep 2025
AdaThink-Med: Medical Adaptive Thinking with Uncertainty-Guided Length Calibration
Shaohao Rui
Kaitao Chen
Weijie Ma
Xiaosong Wang
MedIm, LRM
72
0
0
29 Sep 2025
LLaDA-MoE: A Sparse MoE Diffusion Language Model
Fengqi Zhu
Zebin You
Yipeng Xing
Zenan Huang
Lin Liu
...
Junbo Zhao
Da Zheng
Chongxuan Li
Jianguo Li
J. Wen
MoE
176
8
0
29 Sep 2025
FedPOB: Sample-Efficient Federated Prompt Optimization via Bandits
Pingchen Lu
Zhi Hong
Zhiwei Shang
Zhiyong Wang
Yikun Ban
Yao Shu
Min Zhang
Shuang Qiu
Zhongxiang Dai
FedML
80
0
0
29 Sep 2025
Beyond Magic Words: Sharpness-Aware Prompt Evolving for Robust Large Language Models with TARE
Guancheng Wan
Lucheng Fu
Haoxin Liu
Yiqiao Jin
Hui Yi Leong
...
Yunpu Ma
Xiangru Tang
B. A. Prakash
Yizhou Sun
Wei Wang
KELM
71
0
0
28 Sep 2025
Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions
Yoonah Park
Haesung Pyun
Yohan Jo
KELM
176
0
0
28 Sep 2025
No Loss, No Gain: Gated Refinement and Adaptive Compression for Prompt Optimization
Wenhang Shi
Yiren Chen
Shuqing Bian
Xinyi Zhang
Kai Tang
Pengfei Hu
Zhe Zhao
Wei Lu
Xiaoyong Du
72
0
0
27 Sep 2025
Mapping Overlaps in Benchmarks through Perplexity in the Wild
Siyang Wu
Honglin Bao
Sida Li
Ari Holtzman
James A. Evans
211
0
0
27 Sep 2025
Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
Tianao Zhang
Zhiteng Li
Xianglong Yan
Haotong Qin
Yong Guo
Yulun Zhang
MQ
77
0
0
27 Sep 2025
PATCH: Learnable Tile-level Hybrid Sparsity for LLMs
Younes Hourri
Mohammad Mozaffari
M. Dehnavi
132
0
0
27 Sep 2025
Representing LLMs in Prompt Semantic Task Space
Idan Kashani
A. Mendelson
Yaniv Nemcovsky
72
0
0
26 Sep 2025
What Matters More For In-Context Learning under Matched Compute Budgets: Pretraining on Natural Text or Incorporating Targeted Synthetic Examples?
Mohammed Sabry
Anya Belz
71
0
0
26 Sep 2025
CLUE: Conflict-guided Localization for LLM Unlearning Framework
Hang Chen
Jiaying Zhu
Xinyu Yang
Wenya Wang
MU
128
0
0
25 Sep 2025
Mixture of Thoughts: Learning to Aggregate What Experts Think, Not Just What They Say
Jacob Fein-Ashley
Dhruv Parikh
Rajgopal Kannan
Viktor Prasanna
MoE, MoMe, LRM
136
1
0
25 Sep 2025
Distilling Many-Shot In-Context Learning into a Cheat Sheet
Ukyo Honda
Soichiro Murakami
Peinan Zhang
72
1
0
25 Sep 2025
Expanding Reasoning Potential in Foundation Model by Learning Diverse Chains of Thought Patterns
Xuemiao Zhang
Can Ren
Chengying Tu
Rongxiang Weng
Shuo Wang
Hongfei Yan
Jingang Wang
Xunliang Cai
LRM, AI4CE
133
1
0
25 Sep 2025
When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models
Yingming Zheng
Hanqi Li
Kai Yu
Lu Chen
193
0
0
23 Sep 2025
What Does Your Benchmark Really Measure? A Framework for Robust Inference of AI Capabilities
Nathanael Jo
Ashia Wilson
ELM
110
0
0
23 Sep 2025
AD-VF: LLM-Automatic Differentiation Enables Fine-Tuning-Free Robot Planning from Formal Methods Feedback
Yunhao Yang
Junyuan Hong
Gabriel Jacob Perin
Zhiwen Fan
L. Yin
Zinan Lin
Ufuk Topcu
80
1
0
22 Sep 2025