Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Schärli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
    ALM
    ELM
    LRM
    ReLM
arXiv: 2210.09261

Papers citing "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"

50 / 788 papers shown
Title
Scope of Large Language Models for Mining Emerging Opinions in Online Health Discourse
Joseph Gatto
Madhusudan Basak
Yash Srivastava
Philip Bohlman
S. Preum
38
1
0
05 Mar 2024
Exploring the Limitations of Large Language Models in Compositional Relation Reasoning
Jinman Zhao
Xueyan Zhang
BDL
LRM
17
4
0
05 Mar 2024
Eliciting Better Multilingual Structured Reasoning from LLMs through Code
Bryan Li
Tamer Alkhouli
Daniele Bonadiman
Nikolaos Pappas
Saab Mansour
LRM
20
7
0
05 Mar 2024
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
Changyu Chen
Xiting Wang
Ting-En Lin
Ang Lv
Yuchuan Wu
Xin Gao
Ji-Rong Wen
Rui Yan
Yongbin Li
ReLM
LRM
21
9
0
04 Mar 2024
SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis
Hengxing Cai
Xiaochen Cai
Junhan Chang
Sihang Li
Lin Yao
...
Changhong Chen
Zheng Cheng
Zifeng Zhao
Linfeng Zhang
Guolin Ke
ELM
21
22
0
04 Mar 2024
Formulation Comparison for Timeline Construction using LLMs
Kimihiro Hasegawa
Nikhil Kandukuri
Susan Holm
Yukari Yamakawa
Teruko Mitamura
28
0
0
01 Mar 2024
Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning
Weijieying Ren
Xinlong Li
Lei Wang
Tianxiang Zhao
Wei Qin
CLL
KELM
32
30
0
29 Feb 2024
KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark
Seongbo Jang
Seonghyeon Lee
Hwanjo Yu
ELM
27
0
0
27 Feb 2024
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning
Debrup Das
Debopriyo Banerjee
Somak Aditya
Ashish Kulkarni
ReLM
LRM
24
10
0
27 Feb 2024
Nemotron-4 15B Technical Report
Jupinder Parmar
Shrimai Prabhumoye
Joseph Jennings
M. Patwary
Sandeep Subramanian
...
Ashwath Aithal
Oleksii Kuchaiev
M. Shoeybi
Jonathan Cohen
Bryan Catanzaro
31
22
0
26 Feb 2024
SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
Liangxin Liu
Xuebo Liu
Derek F. Wong
Dongfang Li
Ziyi Wang
Baotian Hu
Min Zhang
45
16
0
26 Feb 2024
PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA
Sheng Wang
Boyang Xue
Jiacheng Ye
Jiyue Jiang
Liheng Chen
Lingpeng Kong
Chuan Wu
23
13
0
24 Feb 2024
Unintended Impacts of LLM Alignment on Global Representation
Michael Joseph Ryan
William B. Held
Diyi Yang
35
39
0
22 Feb 2024
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
Zicheng Lin
Zhibin Gou
Tian Liang
Ruilin Luo
Haowei Liu
Yujiu Yang
LRM
40
43
0
22 Feb 2024
Balanced Data Sampling for Language Model Training with Clustering
Yunfan Shao
Linyang Li
Zhaoye Fei
Hang Yan
Dahua Lin
Xipeng Qiu
29
8
0
22 Feb 2024
BIRCO: A Benchmark of Information Retrieval Tasks with Complex Objectives
Xiaoyue Wang
Jianyou Wang
Weili Cao
Kaicheng Wang
R. Paturi
Leon Bergen
37
6
0
21 Feb 2024
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning
Debjit Paul
Robert West
Antoine Bosselut
Boi Faltings
ReLM
LRM
33
20
0
21 Feb 2024
Dynamic Evaluation of Large Language Models by Meta Probing Agents
Kaijie Zhu
Jindong Wang
Qinlin Zhao
Ruochen Xu
Xing Xie
40
30
0
21 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
48
24
0
21 Feb 2024
Structure Guided Prompt: Instructing Large Language Model in Multi-Step Reasoning by Exploring Graph Structure of the Text
Kewei Cheng
Nesreen K. Ahmed
Theodore L. Willke
Yizhou Sun
LRM
38
4
0
20 Feb 2024
TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning
Xiang Li
Yunshi Lan
Chao Yang
ELM
38
7
0
20 Feb 2024
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Haoran Li
Qingxiu Dong
Zhengyang Tang
Chaojun Wang
Xingxing Zhang
...
Wei Lu
Zhifang Sui
Benyou Wang
Wai Lam
Furu Wei
SyDa
56
50
0
20 Feb 2024
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Zhiyuan Li
Hong Liu
Denny Zhou
Tengyu Ma
LRM
AI4CE
20
94
0
20 Feb 2024
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
MoE
22
9
0
20 Feb 2024
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies
Xiao Ye
Andrew Wang
Jacob Choi
Yining Lu
Shreya Sharma
Lingfeng Shen
Vijay Tiyyala
Nicholas Andrews
Daniel Khashabi
ELM
31
8
0
19 Feb 2024
Reformatted Alignment
Run-Ze Fan
Xuefeng Li
Haoyang Zou
Junlong Li
Shwai He
Ethan Chern
Jiewen Hu
Pengfei Liu
57
8
0
19 Feb 2024
Revisiting Knowledge Distillation for Autoregressive Language Models
Qihuang Zhong
Liang Ding
Li Shen
Juhua Liu
Bo Du
Dacheng Tao
KELM
39
15
0
19 Feb 2024
FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
Junru Lu
Siyu An
Min Zhang
Yulan He
Di Yin
Xing Sun
32
1
0
19 Feb 2024
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
S. Hayati
Taehee Jung
Tristan Bodding-Long
Sudipta Kar
A. Sethy
Joo-Kyung Kim
Dongyeop Kang
ALM
LRM
30
6
0
18 Feb 2024
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
Xunjian Yin
Xu Zhang
Jie Ruan
Xiaojun Wan
ELM
20
17
0
18 Feb 2024
PhaseEvo: Towards Unified In-Context Prompt Optimization for Large Language Models
Wendi Cui
Jiaxin Zhang
Zhuohang Li
Hao Sun
Damien Lopez
Kamalika Das
Bradley Malin
Kumar Sricharan
14
6
0
17 Feb 2024
Chain-of-Thought Reasoning Without Prompting
Xuezhi Wang
Denny Zhou
ReLM
LRM
144
97
0
15 Feb 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
James Liu
Guangxuan Xiao
Kai Li
Jason D. Lee
Song Han
Tri Dao
Tianle Cai
31
20
0
15 Feb 2024
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence
Weixiang Zhao
Zhuojun Li
Shilong Wang
Yang Wang
Yulin Hu
Yanyan Zhao
Chen Wei
Bing Qin
12
4
0
15 Feb 2024
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models
Shengrui Li
Junzhe Chen
Xueting Han
Jing Bai
17
6
0
15 Feb 2024
Efficient Prompt Optimization Through the Lens of Best Arm Identification
Chengshuai Shi
Kun Yang
Zihan Chen
Jundong Li
Jing Yang
Cong Shen
40
5
0
15 Feb 2024
AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability
Siwei Yang
Bingchen Zhao
Cihang Xie
LRM
6
6
0
14 Feb 2024
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment
Jianing Wang
Junda Wu
Yupeng Hou
Yao Liu
Ming Gao
Julian McAuley
20
32
0
13 Feb 2024
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Mikail Khona
Maya Okawa
Jan Hula
Rahul Ramesh
Kento Nishi
Robert P. Dick
Ekdeep Singh Lubana
Hidenori Tanaka
33
5
0
12 Feb 2024
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
Jiacheng Ye
Shansan Gong
Liheng Chen
Lin Zheng
Jiahui Gao
...
Chuan Wu
Xin Jiang
Zhenguo Li
Wei Bi
Lingpeng Kong
DiffM
LRM
AI4CE
38
12
0
12 Feb 2024
Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping
Haoyu Wang
Guozheng Ma
Ziqiao Meng
Zeyu Qin
Li Shen
...
Liu Liu
Yatao Bian
Tingyang Xu
Xueqian Wang
Peilin Zhao
55
12
0
12 Feb 2024
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Lichang Chen
Chen Zhu
Davit Soselia
Jiuhai Chen
Tianyi Zhou
Tom Goldstein
Heng-Chiao Huang
M. Shoeybi
Bryan Catanzaro
AAML
42
51
0
11 Feb 2024
OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning
Rui Ye
Wenhao Wang
Jingyi Chai
Dihan Li
Zexi Li
Yinda Xu
Yaxin Du
Yanfeng Wang
Siheng Chen
ALM
FedML
AIFin
6
76
0
10 Feb 2024
CultureLLM: Incorporating Cultural Differences into Large Language Models
Cheng-rong Li
Mengzhou Chen
Jindong Wang
Sunayana Sitaram
Xing Xie
VLM
49
17
0
09 Feb 2024
Rethinking Data Selection for Supervised Fine-Tuning
Ming Shen
23
16
0
08 Feb 2024
Learning to Route Among Specialized Experts for Zero-Shot Generalization
Mohammed Muqeeth
Haokun Liu
Yufan Liu
Colin Raffel
MoMe
32
33
0
08 Feb 2024
In-Context Principle Learning from Mistakes
Tianjun Zhang
Aman Madaan
Luyu Gao
Steven Zheng
Swaroop Mishra
Yiming Yang
Niket Tandon
Uri Alon
KELM
ReLM
25
23
0
08 Feb 2024
Noise Contrastive Alignment of Language Models with Explicit Rewards
Huayu Chen
Guande He
Lifan Yuan
Ganqu Cui
Hang Su
Jun Zhu
52
37
0
08 Feb 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
80
180
0
06 Feb 2024
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Pei Zhou
Jay Pujara
Xiang Ren
Xinyun Chen
Heng-Tze Cheng
Quoc V. Le
Ed H. Chi
Denny Zhou
Swaroop Mishra
Huaixiu Steven Zheng
LRM
ReLM
27
48
0
06 Feb 2024