Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them


17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Schärli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
    ALM
    ELM
    LRM
    ReLM

Papers citing "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"

50 / 788 papers shown
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
Megh Thakkar
Quentin Fournier
Matthew D Riemer
Pin-Yu Chen
Amal Zouaq
Payel Das
Sarath Chandar
ALM
31
8
0
07 Jun 2024
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Ling Yang
Zhaochen Yu
Tianjun Zhang
Shiyi Cao
Minkai Xu
Wentao Zhang
Joseph E. Gonzalez
Bin Cui
LLMAG
LM&Ro
LRM
KELM
26
34
0
06 Jun 2024
Uncovering Limitations of Large Language Models in Information Seeking from Tables
Chaoxu Pang
Yixuan Cao
Chunhao Yang
Ping Luo
RALM
LMTD
36
3
0
06 Jun 2024
Evaluating the World Model Implicit in a Generative Model
Keyon Vafa
Justin Y. Chen
Jon M. Kleinberg
S. Mullainathan
Ashesh Rambachan
86
25
0
06 Jun 2024
Xmodel-LM Technical Report
Yichuan Wang
Yang Liu
Yu Yan
Qun Wang
Xucheng Huang
Ling Jiang
OSLM
ALM
16
1
0
05 Jun 2024
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Yubo Wang
Xueguang Ma
Ge Zhang
Yuansheng Ni
Abhranil Chandra
...
Kai Wang
Alex Zhuang
Rongqi Fan
Xiang Yue
Wenhu Chen
LRM
ELM
43
283
0
03 Jun 2024
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM
Quandong Wang
Yuxuan Yuan
Xiaoyu Yang
Ruike Zhang
Kang Zhao
Wei Liu
Jian Luan
Daniel Povey
Bin Wang
41
0
0
03 Jun 2024
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
Keyon Vafa
Ashesh Rambachan
S. Mullainathan
ELM
ALM
13
11
0
03 Jun 2024
EffiQA: Efficient Question-Answering with Strategic Multi-Model Collaboration on Knowledge Graphs
Zixuan Dong
Baoyun Peng
Yufei Wang
Jia Fu
Xiaodong Wang
Yongxue Shan
Xin Zhou
42
1
0
03 Jun 2024
Demonstration Augmentation for Zero-shot In-context Learning
Yi Su
Yunpeng Tai
Yixin Ji
Juntao Li
Bowen Yan
Min Zhang
RALM
33
6
0
03 Jun 2024
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures
Jinjie Ni
Fuzhao Xue
Xiang Yue
Yuntian Deng
Mahir Shah
Kabir Jain
Graham Neubig
Yang You
ELM
30
35
0
03 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng
Sijia Cheng
Egil Diau
Yung-Yu Shih
Po-Heng Chen
Yen-Ting Lin
Yun-Nung Chen
LLMAG
ELM
24
12
0
03 Jun 2024
Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction
Xiaoyuan Li
Wenjie Wang
Moxin Li
Junrong Guo
Yang Zhang
Fuli Feng
ELM
LRM
33
15
0
02 Jun 2024
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
Varun Magesh
Faiz Surani
Matthew Dahl
Mirac Suzgun
Christopher D. Manning
Daniel E. Ho
HILM
ELM
AILaw
27
63
0
30 May 2024
TAIA: Large Language Models are Out-of-Distribution Data Learners
Shuyang Jiang
Yusheng Liao
Ya-Qin Zhang
Yu Wang
Yanfeng Wang
27
3
0
30 May 2024
Improve Student's Reasoning Generalizability through Cascading Decomposed CoTs Distillation
Chengwei Dai
Kun Li
Wei Zhou
Song Hu
LRM
39
3
0
30 May 2024
Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation
Chengwei Dai
Kun Li
Wei Zhou
Song Hu
LRM
36
5
0
30 May 2024
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
Zifan Song
Yudong Wang
Wenwei Zhang
Kuikun Liu
Chengqi Lyu
...
Qipeng Guo
Hang Yan
Dahua Lin
Kai-xiang Chen
Cairong Zhao
SyDa
41
2
0
29 May 2024
Towards Dialogues for Joint Human-AI Reasoning and Value Alignment
Elfia Bezou-Vrakatseli
O. Cocarascu
Sanjay Modgil
30
0
0
28 May 2024
Self-Guiding Exploration for Combinatorial Problems
Zangir Iklassov
Yali Du
Farkhad Akimov
Martin Takáč
LRM
21
2
0
28 May 2024
Efficient multi-prompt evaluation of LLMs
Felipe Maia Polo
Ronald Xu
Lucas Weber
Mírian Silva
Onkar Bhardwaj
Leshem Choshen
Allysson Flavio Melo de Oliveira
Yuekai Sun
Mikhail Yurochkin
37
17
0
27 May 2024
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation
Chengxing Jia
Pengyuan Wang
Ziniu Li
Yi-Chen Li
Zhilong Zhang
Nan Tang
Yang Yu
OffRL
25
1
0
27 May 2024
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Nikola Zubić
Federico Soldá
Aurelio Sulser
Davide Scaramuzza
LRM
BDL
45
5
0
26 May 2024
STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making
Chuanhao Li
Runhan Yang
Tiankai Li
Milad Bafarassat
Kourosh Sharifi
Dirk Bergemann
Zhuoran Yang
LLMAG
22
5
0
25 May 2024
Learning to Reason via Program Generation, Emulation, and Search
Nathaniel Weir
Muhammad Khalifa
Linlu Qiu
Orion Weller
Peter Clark
SyDa
ReLM
LRM
49
5
0
25 May 2024
CulturePark: Boosting Cross-cultural Understanding in Large Language Models
Cheng-rong Li
Damien Teney
Linyi Yang
Qingsong Wen
Xing Xie
Jindong Wang
46
4
0
24 May 2024
Instruction Tuning With Loss Over Instructions
Zhengyan Shi
Adam X. Yang
Bin Wu
Laurence Aitchison
Emine Yilmaz
Aldo Lipani
ALM
19
19
0
23 May 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu
Deng Cai
Lemao Liu
Shuming Shi
Rui Yan
MoMe
45
13
0
22 May 2024
360Zhinao Technical Report
360Zhinao Team
32
0
0
22 May 2024
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models
Zhangyue Yin
Qiushi Sun
Qipeng Guo
Zhiyuan Zeng
Xiaonan Li
...
Qinyuan Cheng
Ding Wang
Xiaofeng Mou
Xipeng Qiu
XuanJing Huang
LRM
41
3
0
21 May 2024
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
E. Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Page-Caccia
Alessandro Sordoni
MoMe
32
30
0
18 May 2024
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities
Hao Zhou
Chengming Hu
Ye Yuan
Yufei Cui
Yili Jin
...
Di Wu
Xue Liu
Charlie Zhang
Xianbin Wang
Jiangchuan Liu
30
55
0
17 May 2024
METAREFLECTION: Learning Instructions for Language Agents using Past Reflections
Priyanshu Gupta
Shashank Kirtania
Ananya Singha
Sumit Gulwani
Arjun Radhakrishna
Sherry Shi
Gustavo Soares
LLMAG
27
4
0
13 May 2024
COBias and Debias: Balancing Class Accuracies for Language Models in Inference Time via Nonlinear Integer Programming
Ruixi Lin
Yang You
27
1
0
13 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
32
0
0
09 May 2024
ADELIE: Aligning Large Language Models on Information Extraction
Y. Qi
Hao Peng
Xiaozhi Wang
Bin Xu
Lei Hou
Juanzi Li
26
7
0
08 May 2024
Chain of Thoughtlessness? An Analysis of CoT in Planning
Kaya Stechly
Karthik Valmeekam
Subbarao Kambhampati
LRM
LM&Ro
54
37
0
08 May 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI
Aixin Liu
Bei Feng
Bin Wang
Bingxuan Wang
...
Zhuoshu Li
Zihan Wang
Zihui Gu
Zilin Li
Ziwei Xie
MoE
39
372
0
07 May 2024
Long Context Alignment with Short Instructions and Synthesized Positions
Wenhao Wu
Yizhong Wang
Yao Fu
Xiang Yue
Dawei Zhu
Sujian Li
SyDa
35
18
0
07 May 2024
MAmmoTH2: Scaling Instructions from the Web
Xiang Yue
Tuney Zheng
Ge Zhang
Wenhu Chen
ALM
LRM
38
84
0
06 May 2024
Inherent Trade-Offs between Diversity and Stability in Multi-Task Benchmarks
Guanhua Zhang
Moritz Hardt
32
7
0
02 May 2024
DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token Sampling
Shikhar Tuli
Chi-Heng Lin
Yen-Chang Hsu
N. Jha
Yilin Shen
Hongxia Jin
AI4CE
23
0
0
01 May 2024
Truth-value judgment in language models: belief directions are context sensitive
Stefan F. Schouten
Peter Bloem
Ilia Markov
Piek Vossen
KELM
68
0
0
29 Apr 2024
HFT: Half Fine-Tuning for Large Language Models
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Weiran Xu
Yu Sun
Hua-Hong Wu
CLL
37
4
0
29 Apr 2024
Building a Large Japanese Web Corpus for Large Language Models
Naoaki Okazaki
Kakeru Hattori
Hirai Shota
Hiroki Iida
Masanari Ohi
Kazuki Fujii
Taishi Nakamura
Mengsay Loem
Rio Yokota
Sakae Mizuki
47
6
0
27 Apr 2024
Tele-FLM Technical Report
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Chao Wang
...
Yequan Wang
Zhongjiang He
Zhongyuan Wang
Xuelong Li
Tiejun Huang
30
3
0
25 Apr 2024
Let's Think Dot by Dot: Hidden Computation in Transformer Language Models
Jacob Pfau
William Merrill
Samuel R. Bowman
LRM
23
59
0
24 Apr 2024
Think-Program-reCtify: 3D Situated Reasoning with Large Language Models
Qingrong He
Kejun Lin
Shizhe Chen
Anwen Hu
Qin Jin
LRM
37
1
0
23 Apr 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Marah Abdin
Sam Ade Jacobs
A. A. Awan
J. Aneja
Ahmed Hassan Awadallah
...
Li Lyna Zhang
Yi Zhang
Yue Zhang
Yunan Zhang
Xiren Zhou
LRM
ALM
50
995
0
22 Apr 2024
EPI-SQL: Enhancing Text-to-SQL Translation with Error-Prevention Instructions
X. Liu
Zhao Tan
22
4
0
21 Apr 2024