2310.05157
Cited By
MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models
8 October 2023
Yifan Wei
Yisong Su
Huanhuan Ma
Xiaoyan Yu
Fangyu Lei
Yuanzhe Zhang
Jun Zhao
Kang Liu
LRM
Papers citing
"MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models"
11 / 11 papers shown
TRAVELER: A Benchmark for Evaluating Temporal Reasoning across Vague, Implicit and Explicit References
Svenja Kenneweg
J. Deigmöller
Philipp Cimiano
Julian Eggert
44
0
0
02 May 2025
LLMs as Repositories of Factual Knowledge: Limitations and Solutions
Seyed Mahed Mousavi
Simone Alghisi
Giuseppe Riccardi
KELM
47
0
0
22 Jan 2025
Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models
Yifan Wei
Xiaoyan Yu
Yixuan Weng
Huanhuan Ma
Yuanzhe Zhang
Jun Zhao
Kang Liu
KELM
51
4
0
01 Sep 2024
ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
Raphael Gruber
Abdelrahman Abdallah
Michael Färber
Adam Jatowt
30
4
0
07 Jun 2024
Relational Prompt-based Pre-trained Language Models for Social Event Detection
Pu Li
Xiaoyan Yu
Hao Peng
Yantuan Xian
Linqin Wang
Li Sun
Jingyun Zhang
Philip S. Yu
38
4
0
12 Apr 2024
EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification
Huanhuan Ma
Weizhi Xu
Yifan Wei
Liuji Chen
Liang Wang
Qiang Liu
Shu Wu
Liang Wang
11
14
0
15 Oct 2023
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
233
2,470
0
06 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
245
1,071
0
05 Oct 2022
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Abulhair Saparov
He He
ELM
LRM
ReLM
116
274
0
03 Oct 2022
StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models
Adam Liška
Tomáš Kočiský
E. Gribovskaya
Tayfun Terzi
Eren Sezener
...
Susannah Young
Ellen Gilsenan-McMahon
Sophia Austin
Phil Blunsom
Angeliki Lazaridou
KELM
232
89
0
23 May 2022
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
251
2,009
0
28 Jul 2020