Measuring and Narrowing the Compositionality Gap in Language Models

7 October 2022
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
    ReLM
    KELM
    LRM

Papers citing "Measuring and Narrowing the Compositionality Gap in Language Models"

50 / 419 papers shown
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories
Ben Bogin
Kejuan Yang
Shashank Gupta
Kyle Richardson
Erin Bransom
Peter Clark
Ashish Sabharwal
Tushar Khot
ELM
LRM
40
9
0
11 Sep 2024
Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models
Jinyang Wu
Feihu Che
Chuyuan Zhang
Jianhua Tao
Shuai Zhang
Pengpeng Shao
23
2
0
24 Aug 2024
RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation
Xuanwang Zhang
Yunze Song
Yidong Wang
Shuyun Tang
Xinfeng Li
...
Wenyuan Xu
Yue Zhang
Xinyu Dai
Shikun Zhang
Qingsong Wen
36
1
0
21 Aug 2024
Hierarchical Retrieval-Augmented Generation Model with Rethink for Multi-hop Question Answering
Xiaoming Zhang
Ming Wang
Xiaocui Yang
Daling Wang
Shi Feng
Yifei Zhang
RALM
17
1
0
20 Aug 2024
Analysis of Plan-based Retrieval for Grounded Text Generation
Ameya Godbole
Nicholas Monath
Seungyeon Kim
A. S. Rawat
Andrew McCallum
Manzil Zaheer
RALM
25
2
0
20 Aug 2024
Chain of Condition: Construct, Verify and Solve Conditions for Conditional Question Answering
Jiuheng Lin
Yuxuan Lai
Yansong Feng
LRM
16
0
0
10 Aug 2024
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
Ziyuan Zhuang
Zhiyang Zhang
Sitao Cheng
Fangkai Yang
Jia Liu
Shujian Huang
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
RALM
30
6
0
08 Aug 2024
Automated Theorem Provers Help Improve Large Language Model Reasoning
Lachlan McGinness
Peter Baumgartner
LRM
20
4
0
07 Aug 2024
FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only
He Zhu
Junyou Su
Tianle Lun
Yicheng Tao
Wenjia Zhang
Zipei Fan
Guanhua Chen
ALM
16
2
0
02 Aug 2024
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Zehui Chen
Kuikun Liu
Qiuchen Wang
Jiangning Liu
Wenwei Zhang
Kai Chen
Feng Zhao
LLMAG
58
18
0
29 Jul 2024
Improving Retrieval Augmented Language Model with Self-Reasoning
Yuan Xia
Jingbo Zhou
Zhenhui Shi
Jun Chen
Hai-ting Huang
AIFin
LRM
ReLM
KELM
25
8
0
29 Jul 2024
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
Ori Yoran
S. Amouyal
Chaitanya Malaviya
Ben Bogin
Ofir Press
Jonathan Berant
LLMAG
35
30
0
22 Jul 2024
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together
Dilara Soylu
Christopher Potts
Omar Khattab
19
10
0
15 Jul 2024
Cross-Lingual Multi-Hop Knowledge Editing
Aditi Khandelwal
Harman Singh
Hengrui Gu
Tianlong Chen
Kaixiong Zhou
KELM
26
0
0
14 Jul 2024
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors
Nico Daheim
Jakub Macina
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
LRM
18
5
0
12 Jul 2024
IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents
Shrestha Mohanty
Negar Arabzadeh
Andrea Tupini
Yuxuan Sun
Alexey Skrynnik
Artem Zholus
Marc-Alexandre Côté
Julia Kiseleva
33
0
0
12 Jul 2024
CiteME: Can Language Models Accurately Cite Scientific Claims?
Ori Press
Andreas Hochlehnert
Ameya Prabhu
Vishaal Udandarao
Ofir Press
Matthias Bethge
24
12
0
10 Jul 2024
Distilling System 2 into System 1
Ping Yu
Jing Xu
Jason Weston
Ilia Kulikov
OffRL
LRM
38
55
0
08 Jul 2024
AI Safety in Generative AI Large Language Models: A Survey
Jaymari Chua
Yun Yvonna Li
Shiyi Yang
Chen Wang
Lina Yao
LM&MA
34
12
0
06 Jul 2024
DSLR: Document Refinement with Sentence-Level Re-ranking and Reconstruction to Enhance Retrieval-Augmented Generation
Taeho Hwang
Soyeong Jeong
Sukmin Cho
SeungYoon Han
Jong C. Park
RALM
22
1
0
04 Jul 2024
WTU-EVAL: A Whether-or-Not Tool Usage Evaluation Benchmark for Large Language Models
Kangyun Ning
Yisong Su
Xueqiang Lv
Yuanzhe Zhang
Jian Liu
Kang Liu
Jinan Xu
ELM
LLMAG
26
2
0
02 Jul 2024
Why does in-context learning fail sometimes? Evaluating in-context learning on open and closed questions
Xiang Li
Haoran Tang
Siyu Chen
Ziwei Wang
Ryan Chen
Marcin Abram
LRM
29
1
0
02 Jul 2024
SADL: An Effective In-Context Learning Method for Compositional Visual QA
Long Hoang Dang
T. Le
Vuong Le
Tu Minh Phuong
Truyen Tran
ReLM
CoGe
33
2
0
02 Jul 2024
Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction
Chenlong Deng
Kelong Mao
Yuyao Zhang
Zhicheng Dou
ELM
AILaw
20
1
0
02 Jul 2024
Searching for Best Practices in Retrieval-Augmented Generation
Xiaohua Wang
Zhenghua Wang
Xuan Gao
Feiran Zhang
Yixin Wu
...
Qi Qian
Ruicheng Yin
Changze Lv
Xiaoqing Zheng
Xuanjing Huang
43
39
0
01 Jul 2024
YuLan: An Open-source Large Language Model
Yutao Zhu
Kun Zhou
Kelong Mao
Wentong Chen
Yiding Sun
...
Wenbing Huang
Ze-Feng Gao
Yueguo Chen
Weizheng Lu
Ji-Rong Wen
ALM
ELM
29
1
0
28 Jun 2024
BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering
Zheng Chu
Jingchang Chen
Qianglong Chen
Haotian Wang
Kun Zhu
Xiyuan Du
Weijiang Yu
Ming Liu
Bing Qin
LRM
25
4
0
28 Jun 2024
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation
Guanting Dong
Yutao Zhu
Chenghao Zhang
Zechen Wang
Zhicheng Dou
Ji-Rong Wen
RALM
42
3
0
26 Jun 2024
DEXTER: A Benchmark for open-domain Complex Question Answering using LLMs
Venktesh V.
Deepali Prabhu
Avishek Anand
RALM
CoGe
20
0
0
24 Jun 2024
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models
Sean Welleck
Amanda Bertsch
Matthew Finlayson
Hailey Schoelkopf
Alex Xie
Graham Neubig
Ilia Kulikov
Zaid Harchaoui
33
45
0
24 Jun 2024
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch
Hasan Hammoud
Umberto Michieli
Fabio Pizzati
Philip H. S. Torr
Adel Bibi
Bernard Ghanem
Mete Ozay
MoMe
31
14
0
20 Jun 2024
Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs
Junjie Wang
Mingyang Chen
Binbin Hu
Dan Yang
Ziqi Liu
...
Jinjie Gu
Jun Zhou
Jeff Z. Pan
Wen Zhang
Huajun Chen
RALM
22
12
0
20 Jun 2024
Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning
Yuval Shalev
Amir Feder
Ariel Goldstein
LRM
24
4
0
19 Jun 2024
MoreHopQA: More Than Multi-hop Reasoning
Julian Schnitzler
Xanh Ho
Jiahao Huang
Florian Boudin
Saku Sugawara
Akiko Aizawa
LRM
31
2
0
19 Jun 2024
Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries
Eden Biran
Daniela Gottesman
Sohee Yang
Mor Geva
Amir Globerson
LRM
31
21
0
18 Jun 2024
A Personalised Learning Tool for Physics Undergraduate Students Built On a Large Language Model for Symbolic Regression
Yufan Zhu
Zi-Yu Khoo
Jonathan Sze Choong Low
Stephane Bressan
AI4Ed
25
1
0
17 Jun 2024
HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies
William Watson
Nicole Cho
T. Balch
Manuela Veloso
LMTD
18
0
0
16 Jun 2024
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
Xuan Zhang
Chao Du
Tianyu Pang
Qian Liu
Wei Gao
Min-Bin Lin
LRM
AI4CE
44
34
0
13 Jun 2024
TextGrad: Automatic "Differentiation" via Text
Mert Yuksekgonul
Federico Bianchi
Joseph Boen
Sheng Liu
Zhi Huang
Carlos Guestrin
James Zou
LLMAG
OOD
AI4CE
31
31
0
11 Jun 2024
DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented Generation for Question-Answering
Zijian Hei
Weiling Liu
Wenjie Ou
Juyi Qiao
Junming Jiao
Guowen Song
Ting Tian
Yi Lin
RALM
31
5
0
11 Jun 2024
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
Jihwan Bang
Juntae Lee
Kyuhong Shim
Seunghan Yang
Simyung Chang
18
5
0
11 Jun 2024
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Joongwon Kim
Bhargavi Paranjape
Tushar Khot
Hannaneh Hajishirzi
LM&Ro
ELM
LLMAG
LRM
29
8
0
10 Jun 2024
Interpretability of Language Models via Task Spaces
Lucas Weber
Jaap Jumelet
Elia Bruni
Dieuwke Hupkes
22
3
0
10 Jun 2024
Attention as a Hypernetwork
Simon Schug
Seijin Kobayashi
Yassir Akram
João Sacramento
Razvan Pascanu
GNN
20
3
0
09 Jun 2024
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Ling Yang
Zhaochen Yu
Tianjun Zhang
Shiyi Cao
Minkai Xu
Wentao Zhang
Joseph E. Gonzalez
Bin Cui
LLMAG
LM&Ro
LRM
KELM
20
34
0
06 Jun 2024
Bi-Chainer: Automated Large Language Models Reasoning with Bidirectional Chaining
Shuqi Liu
Bowei He
Linqi Song
LRM
32
1
0
05 Jun 2024
Break the Chain: Large Language Models Can be Shortcut Reasoners
Mengru Ding
Hanmeng Liu
Zhizhang Fu
Jian Song
Wenbo Xie
Yue Zhang
KELM
LRM
27
7
0
04 Jun 2024
Graph Neural Network Enhanced Retrieval for Question Answering of LLMs
Zijian Li
Qingyan Guo
Jiawei Shao
Lei Song
Jiang Bian
Jun Zhang
Rui Wang
RALM
27
11
0
03 Jun 2024
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu
Hao Zhang
Zhijiang Guo
Kuicai Dong
Xiangyang Li
Yi Quan Lee
Cong Zhang
Yong-jin Liu
3DV
23
6
0
29 May 2024
Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action
Zhenyu Pan
Haozheng Luo
Manling Li
Han Liu
LRM
35
10
0
28 May 2024