ResearchTrend.AI
Measuring and Narrowing the Compositionality Gap in Language Models

7 October 2022
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
    ReLM
    KELM
    LRM

Papers citing "Measuring and Narrowing the Compositionality Gap in Language Models"

50 / 419 papers shown
BlendFilter: Advancing Retrieval-Augmented Large Language Models via Query Generation Blending and Knowledge Filtering
Haoyu Wang
Ruirui Li
Haoming Jiang
Jinjin Tian
Zhengyang Wang
Chen Luo
Xianfeng Tang
Monica Cheng
Tuo Zhao
Jing Gao
RALM
KELM
38
16
0
16 Feb 2024
Chain of Logic: Rule-Based Reasoning with Large Language Models
Sergio Servantez
Joe Barrow
Kristian J. Hammond
R. Jain
ReLM
ELM
AILaw
LRM
AI4CE
27
0
0
16 Feb 2024
HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation
Yihao Fang
Stephen W. Thomas
Xiaodan Zhu
RALM
11
2
0
14 Feb 2024
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Mikail Khona
Maya Okawa
Jan Hula
Rahul Ramesh
Kento Nishi
Robert P. Dick
Ekdeep Singh Lubana
Hidenori Tanaka
24
5
0
12 Feb 2024
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
Hainiu Xu
Runcong Zhao
Lixing Zhu
Jinhua Du
Yulan He
66
18
0
08 Feb 2024
Limits of Transformer Language Models on Learning to Compose Algorithms
Jonathan Thomm
Aleksandar Terzić
Giacomo Camposampiero
Michael Hersche
Bernhard Schölkopf
Abbas Rahimi
26
3
0
08 Feb 2024
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
Jongho Park
Jaeseung Park
Zheyang Xiong
Nayoung Lee
Jaewoong Cho
Samet Oymak
Kangwook Lee
Dimitris Papailiopoulos
13
31
0
06 Feb 2024
Empowering Language Models with Active Inquiry for Deeper Understanding
Jing-Cheng Pang
Heng-Bo Fan
Pengyuan Wang
Jia-Hao Xiao
Nan Tang
Si-Hang Yang
Chengxing Jia
Sheng-Jun Huang
Yang Yu
11
5
0
06 Feb 2024
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
Fangru Lin
Emanuele La Malfa
Valentin Hofmann
Elle Michelle Yang
Anthony Cohn
J. Pierrehumbert
LRM
48
16
0
05 Feb 2024
Building Guardrails for Large Language Models
Yizhen Dong
Ronghui Mu
Gao Jin
Yi Qi
Jinwei Hu
Xingyu Zhao
Jie Meng
Wenjie Ruan
Xiaowei Huang
OffRL
57
23
0
02 Feb 2024
AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback
Jian-Yu Guan
Wei Yu Wu
Zujie Wen
Peng Xu
Hongning Wang
Minlie Huang
LRM
14
16
0
02 Feb 2024
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
Alon Jacovi
Yonatan Bitton
Bernd Bohnet
Jonathan Herzig
Or Honovich
Michael Tseng
Michael Collins
Roee Aharoni
Mor Geva
LRM
24
18
0
01 Feb 2024
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Vidhisha Balachandran
Yulia Tsvetkov
18
22
0
01 Feb 2024
Rethinking Interpretability in the Era of Large Language Models
Chandan Singh
J. Inala
Michel Galley
Rich Caruana
Jianfeng Gao
LRM
AI4CE
75
59
0
30 Jan 2024
K-QA: A Real-World Medical Q&A Benchmark
Itay Manes
Naama Ronn
David Cohen
Ran Ilan Ber
Zehavi Horowitz-Kugler
Gabriel Stanovsky
LM&MA
HILM
AI4MH
20
10
0
25 Jan 2024
Towards Goal-oriented Prompt Engineering for Large Language Models: A Survey
Haochen Li
Jonathan Leung
Zhiqi Shen
LM&MA
LLMAG
LRM
10
0
0
25 Jan 2024
Demystifying Chains, Trees, and Graphs of Thoughts
Maciej Besta
Florim Memedi
Zhenyu Zhang
Robert Gerstenberger
Guangyuan Piao
...
Aleš Kubíček
H. Niewiadomski
Aidan O'Mahony
Onur Mutlu
Torsten Hoefler
AI4CE
LRM
50
25
0
25 Jan 2024
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Mirac Suzgun
Adam Tauman Kalai
KELM
LRM
LLMAG
ReLM
33
63
0
23 Jan 2024
Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data
Leonardo Castro-Gonzalez
Yi-Ling Chung
Hannah Rose Kirk
John Francis
Angus R. Williams
Pica Johansson
Jonathan Bright
24
1
0
22 Jan 2024
The Curious Case of Nonverbal Abstract Reasoning with Multi-Modal Large Language Models
Kian Ahrabian
Zhivar Sourati
Kexuan Sun
Jiarui Zhang
Yifan Jiang
Fred Morstatter
Jay Pujara
LRM
18
9
0
22 Jan 2024
DeepEdit: Knowledge Editing as Decoding with Constraints
Yiwei Wang
Muhao Chen
Nanyun Peng
Kai-Wei Chang
KELM
16
26
0
19 Jan 2024
Gradable ChatGPT Translation Evaluation
Hui Jiao
Bei Peng
Lu Zong
Xiaojun Zhang
Xinwei Li
28
1
0
18 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
50
22
0
11 Jan 2024
How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation
Yang Xiao
Yi Cheng
Jinlan Fu
Jiashuo Wang
Wenjie Li
Pengfei Liu
LLMAG
41
4
0
28 Dec 2023
LARP: Language-Agent Role Play for Open-World Games
Ming Yan
Ruihao Li
Hao Zhang
Hao Wang
Zhilan Yang
Ji Yan
LLMAG
LM&Ro
AI4CE
22
16
0
24 Dec 2023
PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
Hengrui Gu
Kaixiong Zhou
Xiaotian Han
Ninghao Liu
Ruobing Wang
Xin Wang
LRM
KELM
52
22
0
23 Dec 2023
SimLM: Can Language Models Infer Parameters of Physical Systems?
Sean Memery
Mirella Lapata
Kartic Subr
LRM
AI4CE
8
2
0
21 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
E. Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM
LRM
AI4CE
19
74
0
17 Dec 2023
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Renat Aksitov
Sobhan Miryoosefi
Zong-xiao Li
Daliang Li
Sheila Babayan
...
Sushant Prakash
Pranesh Srinivasan
Manzil Zaheer
Felix X. Yu
Sanjiv Kumar
LRM
ReLM
LLMAG
KELM
15
26
0
15 Dec 2023
Training-free Zero-shot Composed Image Retrieval with Local Concept Reranking
Shitong Sun
Fanghua Ye
Shaogang Gong
11
13
0
14 Dec 2023
AI capabilities can be significantly improved without expensive retraining
Tom Davidson
Jean-Stanislas Denain
Pablo Villalobos
Guillem Bas
OffRL
VLM
21
26
0
12 Dec 2023
An LLM Compiler for Parallel Function Calling
Sehoon Kim
Suhong Moon
Ryan Tabrizi
Nicholas Lee
Michael W. Mahoney
Kurt Keutzer
A. Gholami
LRM
11
58
0
07 Dec 2023
Recursive Visual Programming
Jiaxin Ge
Sanjay Subramanian
Baifeng Shi
Roei Herzig
Trevor Darrell
16
4
0
04 Dec 2023
Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses
Xiao Ma
Swaroop Mishra
Ariel Liu
S. Su
Jilin Chen
Chinmay Kulkarni
Heng-Tze Cheng
Quoc V. Le
Ed H. Chi
LM&Ro
17
36
0
01 Dec 2023
A Survey on Prompting Techniques in LLMs
Prabin Bhandari
16
6
0
28 Nov 2023
Algorithm Evolution Using Large Language Model
Fei Liu
Xialiang Tong
Mingxuan Yuan
Qingfu Zhang
15
39
0
26 Nov 2023
Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions
S. Cao
Jiajie Zhang
Jiaxin Shi
Xin Lv
Zijun Yao
Qingwen Tian
Juanzi Li
Lei Hou
LRM
23
13
0
23 Nov 2023
Drilling Down into the Discourse Structure with LLMs for Long Document Question Answering
Inderjeet Nair
Shwetha Somasundaram
Apoorv Saxena
Koustava Goswami
RALM
29
7
0
22 Nov 2023
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
David Rein
Betty Li Hou
Asa Cooper Stickland
Jackson Petty
Richard Yuanzhe Pang
Julien Dirani
Julian Michael
Samuel R. Bowman
AI4MH
ELM
16
433
0
20 Nov 2023
System 2 Attention (is something you might need too)
Jason Weston
Sainbayar Sukhbaatar
RALM
OffRL
LRM
11
57
0
20 Nov 2023
Graph Elicitation for Guiding Multi-Step Reasoning in Large Language Models
Jinyoung Park
Ameen Patel
Omar Zia Khan
Hyunwoo J. Kim
Jooyeon Kim
KELM
LRM
ReLM
23
4
0
16 Nov 2023
OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning
Fei Yu
Anningzhe Gao
Benyou Wang
OffRL
LRM
15
39
0
16 Nov 2023
On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering
Linyong Nan
Ellen Zhang
Weijin Zou
Yilun Zhao
Wenfei Zhou
Arman Cohan
LLMAG
33
13
0
16 Nov 2023
What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception
Chaitanya Malaviya
Subin Lee
Dan Roth
Mark Yatskar
16
1
0
16 Nov 2023
Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs
Michael J.Q. Zhang
Eunsol Choi
32
26
0
16 Nov 2023
Contrastive Chain-of-Thought Prompting
Yew Ken Chia
Guizhen Chen
Anh Tuan Luu
Soujanya Poria
Lidong Bing
LRM
AI4CE
50
30
0
15 Nov 2023
How Well Do Large Language Models Truly Ground?
Hyunji Lee
Se June Joo
Chaeeun Kim
Joel Jang
Doyoung Kim
Kyoung-Woon On
Minjoon Seo
HILM
8
6
0
15 Nov 2023
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving
Chang Gao
Haiyun Jiang
Deng Cai
Shuming Shi
Wai Lam
LRM
23
3
0
15 Nov 2023
Attribute Diversity Determines the Systematicity Gap in VQA
Ian Berlot-Attwell
Kumar Krishna Agrawal
A. M. Carrell
Yash Sharma
Naomi Saphra
13
1
0
15 Nov 2023
Asking More Informative Questions for Grounded Retrieval
Sedrick Scott Keh
Justin T. Chiu
Daniel Fried
17
3
0
14 Nov 2023