ResearchTrend.AI
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
    ALM
    ELM
    LRM
    ReLM

Papers citing "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"

50 / 788 papers shown
Compound-QA: A Benchmark for Evaluating LLMs on Compound Questions
Yutao Hou
Yajing Luo
Zhiwen Ruan
H. Wang
Weifeng Ge
Y. Chen
Guanhua Chen
ELM
38
0
0
15 Nov 2024
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Weiyun Wang
Zhe Chen
Wenhai Wang
Yue Cao
Yangzhou Liu
...
Jinguo Zhu
X. Zhu
Lewei Lu
Yu Qiao
Jifeng Dai
LRM
49
45
1
15 Nov 2024
SetLexSem Challenge: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models
Bardiya Akhbari
Manish Gawali
Nicholas A. Dronen
AAML
22
0
0
11 Nov 2024
Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
Yeming Wen
Swarat Chaudhuri
19
0
0
11 Nov 2024
Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs
Shan Zhong
Jiahao Zeng
Yongxin Yu
Bohong Lin
29
1
0
09 Nov 2024
Evaluation data contamination in LLMs: how do we measure it and (when) does it matter?
Aaditya K. Singh
Muhammed Yusuf Kocyigit
Andrew Poulton
David Esiobu
Maria Lomeli
Gergely Szilvasy
Dieuwke Hupkes
20
0
0
06 Nov 2024
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
X. Sun
Yanfeng Chen
Y. Huang
Ruobing Xie
Jiaqi Zhu
...
Zhanhui Kang
Yong Yang
Yuhong Liu
Di Wang
Jie Jiang
MoE
ALM
ELM
65
24
0
04 Nov 2024
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo
Chenyang Song
Xu Han
Y. Chen
Chaojun Xiao
Zhiyuan Liu
Maosong Sun
44
3
0
04 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
49
4
0
31 Oct 2024
From Babble to Words: Pre-Training Language Models on Continuous Streams of Phonemes
Zébulon Goriely
Richard Diehl Martinez
Andrew Caines
Lisa Beinborn
P. Buttery
CLL
24
5
0
30 Oct 2024
Belief in the Machine: Investigating Epistemological Blind Spots of Language Models
Mirac Suzgun
Tayfun Gur
Federico Bianchi
Daniel E. Ho
Thomas F. Icard
Dan Jurafsky
James Zou
24
1
0
28 Oct 2024
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse
Ryan Liu
Jiayi Geng
Addison J. Wu
Ilia Sucholutsky
Tania Lombrozo
Thomas L. Griffiths
ReLM
LRM
55
19
0
27 Oct 2024
Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation
Yifang Chen
David Zhu
SyDa
25
0
0
27 Oct 2024
DAWN-ICL: Strategic Planning of Problem-solving Trajectories for Zero-Shot In-Context Learning
Xinyu Tang
Xiaolei Wang
Wayne Xin Zhao
Ji-Rong Wen
35
3
0
26 Oct 2024
Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models
Zheng Zhao
Yftah Ziser
Shay B. Cohen
17
0
0
25 Oct 2024
Cooperative Strategic Planning Enhances Reasoning Capabilities in Large Language Models
Danqing Wang
Zhuorui Ye
Fei Fang
Lei Li
LLMAG
LRM
13
0
0
25 Oct 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
L. Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
44
3
0
24 Oct 2024
MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning
Jingfan Zhang
Yi Zhao
Dan Chen
Xing Tian
Huanran Zheng
Wei Zhu
MoE
26
12
0
23 Oct 2024
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning
Yifan Peng
Krishna C. Puvvada
Zhehuai Chen
Piotr Żelasko
He Huang
Kunal Dhawan
Ke Hu
Shinji Watanabe
Jagadeesh Balam
Boris Ginsburg
41
2
0
23 Oct 2024
Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models
Yuheng Lu
Bingshuo Qian
Caixia Yuan
Huixing Jiang
Xiaojie Wang
CLL
20
0
0
22 Oct 2024
Rulebreakers Challenge: Revealing a Blind Spot in Large Language Models' Reasoning with Formal Logic
Jason Chan
Robert Gaizauskas
Zhixue Zhao
ELM
AAML
LRM
15
0
0
21 Oct 2024
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
Yantao Liu
Zijun Yao
Rui Min
Yixin Cao
Lei Hou
Juanzi Li
OffRL
ALM
16
23
0
21 Oct 2024
Compute-Constrained Data Selection
Junjie Oscar Yin
Alexander M. Rush
35
0
0
21 Oct 2024
MorphAgent: Empowering Agents through Self-Evolving Profiles and Decentralized Collaboration
Siyuan Lu
Jiaqi Shao
B. Luo
Tao Lin
LM&Ro
LLMAG
AI4CE
24
2
0
19 Oct 2024
SPRIG: Improving Large Language Model Performance by System Prompt Optimization
Lechen Zhang
Tolga Ergen
Lajanugen Logeswaran
Moontae Lee
David Jurgens
LRM
37
7
0
18 Oct 2024
MoDification: Mixture of Depths Made Easy
C. Zhang
M. Zhong
Qimeng Wang
Xuantao Lu
Zheyu Ye
...
Yan Gao
Yao Hu
Kehai Chen
Min Zhang
Dawei Song
VLM
MoE
22
2
0
18 Oct 2024
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu
Xuezhe Ma
LRM
29
3
0
18 Oct 2024
Accounting for Sycophancy in Language Model Uncertainty Estimation
Anthony Sicilia
Mert Inan
Malihe Alikhani
19
1
0
17 Oct 2024
BenTo: Benchmark Task Reduction with In-Context Transferability
Hongyu Zhao
Ming Li
Lichao Sun
Tianyi Zhou
23
0
0
17 Oct 2024
Improving Multi-modal Large Language Model through Boosting Vision Capabilities
Yanpeng Sun
H. Zhang
Qiang Chen
Xinyu Zhang
Nong Sang
Gang Zhang
Jingdong Wang
Zechao Li
21
0
0
17 Oct 2024
IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection
Jielin Song
Siyu Liu
Bin Zhu
Yanghui Rao
20
2
0
17 Oct 2024
Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings
Krishno Dey
Prerona Tarannum
Md. Arid Hasan
Imran Razzak
Usman Naseem
19
0
0
17 Oct 2024
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
H. Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
35
6
0
17 Oct 2024
LLM-Human Pipeline for Cultural Context Grounding of Conversations
Rajkumar Pujari
Dan Goldwasser
16
1
0
17 Oct 2024
LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks
Akshara Prabhakar
Yuanzhi Li
Karthik Narasimhan
Sham Kakade
Eran Malach
Samy Jelassi
MoMe
21
1
0
16 Oct 2024
"Let's Argue Both Sides": Argument Generation Can Force Small Models to Utilize Previously Inaccessible Reasoning Capabilities
Kaveh Eskandari Miandoab
Vasanth Sarathy
LRM
ReLM
13
0
0
16 Oct 2024
Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging
Jacob Morrison
Noah A. Smith
Hannaneh Hajishirzi
Pang Wei Koh
Jesse Dodge
Pradeep Dasigi
KELM
MoMe
CLL
28
1
0
16 Oct 2024
Conformity in Large Language Models
Xiaochen Zhu
Caiqi Zhang
Tom Stafford
Nigel Collier
Andreas Vlachos
30
0
0
16 Oct 2024
Neuron-based Personality Trait Induction in Large Language Models
Jia Deng
Tianyi Tang
Yanbin Yin
Wenhao Yang
Wayne Xin Zhao
Ji-Rong Wen
24
1
0
16 Oct 2024
Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up
Jiahao Yuan
Dehui Du
Hao Zhang
Zixiang Di
Usman Naseem
LRM
24
1
0
16 Oct 2024
JudgeBench: A Benchmark for Evaluating LLM-based Judges
Sijun Tan
Siyuan Zhuang
Kyle Montgomery
William Y. Tang
Alejandro Cuadron
Chenguang Wang
Raluca A. Popa
Ion Stoica
ELM
ALM
47
35
0
16 Oct 2024
Agent Skill Acquisition for Large Language Models via CycleQD
So Kuroki
Taishi Nakamura
Takuya Akiba
Yujin Tang
MoMe
24
0
0
16 Oct 2024
Causal Reasoning in Large Language Models: A Knowledge Graph Approach
Yejin Kim
Eojin Kang
Juae Kim
H. H. Huang
ReLM
LRM
19
0
0
15 Oct 2024
TSDS: Data Selection for Task-Specific Model Finetuning
Zifan Liu
Amin Karbasi
Theodoros Rekatsinas
19
2
0
15 Oct 2024
FLARE: Faithful Logic-Aided Reasoning and Exploration
Erik Arakelyan
Pasquale Minervini
Pat Verga
Patrick Lewis
Isabelle Augenstein
ReLM
LRM
57
2
0
14 Oct 2024
A Counterexample in Image Registration
Serap A. Savari
21
2
0
14 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
J. Liu
MoE
18
0
0
14 Oct 2024
Locking Down the Finetuned LLMs Safety
Minjun Zhu
Linyi Yang
Yifan Wei
Ningyu Zhang
Yue Zhang
34
8
0
14 Oct 2024
Reverse Modeling in Large Language Models
S. Yu
Yuanchen Xu
Cunxiao Du
Yanying Zhou
Minghui Qiu
Q. Sun
Hao Zhang
Jiawei Wu
26
1
0
13 Oct 2024
Survival of the Safest: Towards Secure Prompt Optimization through Interleaved Multi-Objective Evolution
Ankita Sinha
Wendi Cui
Kamalika Das
Jiaxin Zhang
AAML
15
2
0
12 Oct 2024