Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

17 October 2022
Mirac Suzgun
Nathan Scales
Nathanael Schärli
Sebastian Gehrmann
Yi Tay
Hyung Won Chung
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
    ALM
    ELM
    LRM
    ReLM

Papers citing "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them"

50 / 788 papers shown
Rethinking Data Selection at Scale: Random Selection is Almost All You Need
Tingyu Xia
Bowen Yu
K. Dang
An Yang
Yuan Wu
Yuan Tian
Yi-Ju Chang
Junyang Lin
ALM
49
3
0
12 Oct 2024
ELICIT: LLM Augmentation via External In-Context Capability
Futing Wang
Jianhao Yan
Yue Zhang
Tao Lin
35
0
0
12 Oct 2024
Enterprise Benchmarks for Large Language Model Evaluation
Bing Zhang
Mikio Takeuchi
Ryo Kawahara
Shubhi Asthana
Md. Maruf Hossain
Guang-Jie Ren
Kate Soule
Yada Zhu
ELM
24
2
0
11 Oct 2024
StraGo: Harnessing Strategic Guidance for Prompt Optimization
Yurong Wu
Yan Gao
Bin Benjamin Zhu
Zineng Zhou
Xiaodi Sun
Sheng Yang
Jian-Guang Lou
Zhiming Ding
Linjun Yang
LLMAG
35
2
0
11 Oct 2024
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Jingyu Zhang
Ahmed Elgohary
Ahmed Magooda
Daniel Khashabi
Benjamin Van Durme
40
2
0
11 Oct 2024
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
Rushang Karia
Daniel Bramblett
D. Dobhal
Siddharth Srivastava
ELM
LRM
25
0
0
11 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
43
7
0
11 Oct 2024
Packing Analysis: Packing Is More Appropriate for Large Models or Datasets in Supervised Fine-tuning
Shuhe Wang
Guoyin Wang
Y. Wang
Jiwei Li
Eduard H. Hovy
Chen Guo
32
1
0
10 Oct 2024
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
Shenao Zhang
Zhihan Liu
Boyi Liu
Y. Zhang
Yingxiang Yang
Y. Liu
Liyu Chen
Tao Sun
Z. Wang
87
2
0
10 Oct 2024
TPO: Aligning Large Language Models with Multi-branch & Multi-step Preference Trees
Weibin Liao
Xu Chu
Yasha Wang
LRM
36
6
0
10 Oct 2024
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
Han Shen
Pin-Yu Chen
Payel Das
Tianyi Chen
ALM
26
11
0
09 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Mohit Bansal
Tianlong Chen
MoMe
MoE
25
1
0
09 Oct 2024
Tree of Problems: Improving structured problem solving with compositionality
A. Zebaze
Benoît Sagot
Rachel Bawden
LRM
19
2
0
09 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
Jianguo Li
Weiyao Lin
VLM
30
1
0
09 Oct 2024
MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders
Cheng-rong Li
May Fung
Qingyun Wang
Chi Han
Manling Li
Jindong Wang
Heng Ji
AI4MH
43
0
0
09 Oct 2024
Auto-Evolve: Enhancing Large Language Model's Performance via Self-Reasoning Framework
Krishna Aswani
Huilin Lu
Pranav Patankar
Priya Dhalwani
Iris Tan
Jayant Ganeshmohan
Simon Lacasse
ReLM
LLMAG
LRM
22
0
0
08 Oct 2024
QERA: an Analytical Framework for Quantization Error Reconstruction
Cheng Zhang
Jeffrey T. H. Wong
Can Xiao
G. Constantinides
Yiren Zhao
MQ
35
0
0
08 Oct 2024
Active Evaluation Acquisition for Efficient LLM Benchmarking
Yang Li
Jie Ma
Miguel Ballesteros
Yassine Benajiba
Graham Horwood
ELM
14
1
0
08 Oct 2024
ACPBench: Reasoning about Action, Change, and Planning
Harsha Kokel
Michael Katz
Kavitha Srinivas
Shirin Sohrabi
ReLM
LRM
29
0
0
08 Oct 2024
fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models
Weijia Xu
Nebojsa Jojic
Nicolas Le Roux
11
0
0
07 Oct 2024
Falcon Mamba: The First Competitive Attention-free 7B Language Model
Jingwei Zuo
Maksim Velikanov
Dhia Eddine Rhaiem
Ilyas Chahed
Younes Belkada
Guillaume Kunsch
Hakim Hacid
ALM
52
12
0
07 Oct 2024
An evaluation of LLM code generation capabilities through graded exercises
Álvaro Barbero Jiménez
ELM
20
0
0
06 Oct 2024
Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
Alex Cloud
Jacob Goldman-Wetzler
Evžen Wybitul
Joseph Miller
Alexander Matt Turner
14
1
0
06 Oct 2024
A Learning Rate Path Switching Training Paradigm for Version Updates of Large Language Models
Zhihao Wang
Shiyu Liu
Jianheng Huang
Zheng Wang
Yixuan Liao
Xiaoxin Chen
Junfeng Yao
Jinsong Su
18
0
0
05 Oct 2024
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Murong Yue
Wenlin Yao
Haitao Mi
Dian Yu
Ziyu Yao
Dong Yu
LRM
28
4
0
04 Oct 2024
StoryNavi: On-Demand Narrative-Driven Reconstruction of Video Play With Generative AI
Alston Lantian Xu
Tianwei Ma
Tianmeng Liu
Can Liu
Alvaro Cassinelli
VGen
24
0
0
04 Oct 2024
ProcBench: Benchmark for Multi-Step Reasoning and Following Procedure
Ippei Fujisawa
Sensho Nobe
Hiroki Seto
Rina Onda
Yoshiaki Uchida
Hiroki Ikoma
Pei-Chun Chien
Ryota Kanai
LRM
34
3
0
04 Oct 2024
Residual Policy Learning for Perceptive Quadruped Control Using Differentiable Simulation
Jing Yuan Luo
Yunlong Song
Victor Klemm
Fan Shi
Davide Scaramuzza
Marco Hutter
26
1
0
04 Oct 2024
Steering Large Language Models between Code Execution and Textual Reasoning
Yongchao Chen
Harsh Jhamtani
Srinagesh Sharma
Chuchu Fan
Chi Wang
LLMAG
LRM
31
6
0
04 Oct 2024
POSIX: A Prompt Sensitivity Index For Large Language Models
Anwoy Chatterjee
H. S. V. N. S. K. Renduchintala
S. Bhatia
Tanmoy Chakraborty
AAML
11
6
0
03 Oct 2024
Position: LLM Unlearning Benchmarks are Weak Measures of Progress
Pratiksha Thaker
Shengyuan Hu
Neil Kale
Yash Maurya
Zhiwei Steven Wu
Virginia Smith
MU
39
10
0
03 Oct 2024
ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
Xiangyu Peng
Congying Xia
Xinyi Yang
Caiming Xiong
Chien-Sheng Wu
Chen Xing
LRM
38
2
0
03 Oct 2024
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yuxuan Yao
Han Wu
Mingyang Liu
Sichun Luo
Xiongwei Han
Jie Liu
Zhijiang Guo
Linqi Song
47
4
0
03 Oct 2024
Quantifying Generalization Complexity for Large Language Models
Zhenting Qi
Hongyin Luo
Xuliang Huang
Zhuokai Zhao
Yibo Jiang
Xiangjun Fan
Himabindu Lakkaraju
James Glass
LRM
ELM
26
5
0
02 Oct 2024
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Yu Sun
Hua-Hong Wu
Sen Su
MoE
11
0
0
02 Oct 2024
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
Suyu Ge
Xihui Lin
Yunan Zhang
Jiawei Han
Hao Peng
31
4
0
02 Oct 2024
Mitigating Copy Bias in In-Context Learning through Neuron Pruning
Ameen Ali
Lior Wolf
Ivan Titov
24
2
0
02 Oct 2024
TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking
Danqing Wang
Jianxin Ma
Fei Fang
Lei Li
LLMAG
LRM
50
0
0
02 Oct 2024
MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
Sheng Wang
Liheng Chen
Pengan Chen
Jingwei Dong
Boyang Xue
Jiyue Jiang
Lingpeng Kong
Chuan Wu
MoE
16
7
0
01 Oct 2024
DynEx: Dynamic Code Synthesis with Structured Design Exploration for Accelerated Exploratory Programming
Jenny Ma
Karthik Sreedhar
Vivian Liu
Sitong Wang
Pedro Alejandro Perez
Riya Sahni
Lydia B. Chilton
37
1
0
01 Oct 2024
Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG
Chenhao Fang
Derek Larson
Shitong Zhu
Sophie Zeng
Wendy Summer
...
Rajeev Rao
Gabriel Forgues
Arya Pudota
Alex Goncalves
Hervé Robert
VLM
15
0
0
30 Sep 2024
Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization
Mucong Ding
Chenghao Deng
Jocelyn Choo
Zichu Wu
Aakriti Agrawal
...
Tianyi Zhou
Tom Goldstein
John Langford
Anima Anandkumar
Furong Huang
42
5
0
27 Sep 2024
PEDRO: Parameter-Efficient Fine-tuning with Prompt DEpenDent Representation MOdification
Tianfang Xie
Tianjing Li
Wei Zhu
Wei Han
Yi Zhao
19
5
0
26 Sep 2024
BeanCounter: A low-toxicity, large-scale, and open dataset of business-oriented text
Siyan Wang
Bradford Levy
18
2
0
26 Sep 2024
HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows
Wenlin Yao
Haitao Mi
Dong Yu
LRM
AI4CE
44
6
0
25 Sep 2024
Post-hoc Reward Calibration: A Case Study on Length Bias
Zeyu Huang
Zihan Qiu
Zili Wang
Edoardo M. Ponti
Ivan Titov
36
5
0
25 Sep 2024
Task-oriented Prompt Enhancement via Script Generation
Chung-Yu Wang
Alireza DaghighFarsoodeh
Hung Viet Pham
LRM
37
0
0
24 Sep 2024
Learning from Contrastive Prompts: Automated Optimization and Adaptation
Mingqi Li
Karan Aggarwal
Yong Xie
Aitzaz Ahmad
Stephen Lau
28
2
0
23 Sep 2024
Enhancing LLM-based Autonomous Driving Agents to Mitigate Perception Attacks
Ruoyu Song
Muslum Ozgur Ozmen
Hyungsub Kim
Antonio Bianchi
Z. Berkay Celik
AAML
24
5
0
22 Sep 2024
Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses
Hung-Ting Su
Ya-Ching Hsu
Xudong Lin
Xiang Qian Shi
Yulei Niu
Han-Yuan Hsu
Hung-yi Lee
Winston H. Hsu
LRM
26
0
0
22 Sep 2024