1909.03065
Cited By
"Going on a vacation" takes longer than "Going for a walk": A Study of Temporal Commonsense Understanding
6 September 2019
Ben Zhou
Daniel Khashabi
Qiang Ning
Dan Roth
AIMat
Papers citing
"'Going on a vacation' takes longer than 'Going for a walk': A Study of Temporal Commonsense Understanding"
50 / 127 papers shown
Title
Chaining Event Spans for Temporal Relation Grounding
Jongho Kim
Dohyeon Lee
Minsoo Kim
Seung-won Hwang
40
0
0
17 Jun 2025
From What to Respond to When to Respond: Timely Response Generation for Open-domain Dialogue Agents
Seongbo Jang
Minjin Jeon
Jaehoon Lee
Seonghyeon Lee
Dongha Lee
Hwanjo Yu
39
0
0
17 Jun 2025
LexTime: A Benchmark for Temporal Ordering of Legal Events
Claire Barale
Leslie Barrett
Vikram Sunil Bajaj
Michael Rovatsos
AILaw
121
0
0
04 Jun 2025
Around the World in 24 Hours: Probing LLM Knowledge of Time and Place
Carolin Holtermann
Paul Röttger
Anne Lauscher
LRM
81
0
0
04 Jun 2025
CrossICL: Cross-Task In-Context Learning via Unsupervised Demonstration Transfer
Jinglong Gao
Xiao Ding
Lingxiao Zou
Bing Qin
Ting Liu
37
0
0
30 May 2025
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack
Silvia Cappelletti
Tobia Poppi
Samuele Poppi
Zheng-Xin Yong
Diego Garcia-Olano
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
KELM
AAML
66
0
0
21 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
261
7
0
26 Apr 2025
Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
Adrián Bazaga
Rexhina Blloshmi
Bill Byrne
Adria de Gispert
ReLM
LRM
103
1
0
07 Apr 2025
WinoWhat: A Parallel Corpus of Paraphrased WinoGrande Sentences with Common Sense Categorization
I. Gevers
Victor De Marez
Luna De Bruyne
Walter Daelemans
69
0
0
31 Mar 2025
Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes
Sharan Maiya
Yinhong Liu
Ramit Debnath
Anna Korhonen
79
0
0
22 Mar 2025
A Study into Investigating Temporal Robustness of LLMs
Jonas Wallat
Abdelrahman Abdallah
Adam Jatowt
Avishek Anand
76
3
0
21 Mar 2025
Are Sparse Autoencoders Useful? A Case Study in Sparse Probing
Subhash Kantamneni
Joshua Engels
Senthooran Rajamanoharan
Max Tegmark
Neel Nanda
149
17
0
23 Feb 2025
Counterfactual-Consistency Prompting for Relative Temporal Understanding in Large Language Models
Jongho Kim
Seung-won Hwang
LRM
AI4CE
173
1
0
17 Feb 2025
Weak-to-Strong Generalization Through the Data-Centric Lens
Changho Shin
John Cooper
Frederic Sala
192
9
0
05 Dec 2024
ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions
Shailaja Keyur Sampat
Yezhou Yang
Chitta Baral
LM&Ro
85
0
0
17 Oct 2024
MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models
Haoxuan Li
Zhengmao Yang
Yunshan Ma
Yi Bin
Yang Yang
Tat-Seng Chua
107
1
0
08 Aug 2024
A Comprehensive Evaluation of Large Language Models on Temporal Event Forecasting
He Chang
Chenchen Ye
Zhulin Tao
Jie Wu
Zhengmao Yang
Yunshan Ma
Xianglin Huang
Tat-Seng Chua
AI4TS
98
2
0
16 Jul 2024
UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization
Md Nayem Uddin
Amir Saeidi
Divij Handa
Agastya Seth
Tran Cao Son
Eduardo Blanco
Steven Corman
Chitta Baral
142
4
0
03 Jul 2024
Timo: Towards Better Temporal Reasoning for Language Models
Zhaochen Su
Jun Zhang
Tong Zhu
Xiaoye Qu
Juntao Li
Min Zhang
Yu Cheng
LRM
98
23
0
20 Jun 2024
Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
Yijun Liu
Yuan Meng
Fang Wu
Shenhao Peng
Hang Yao
Chaoyu Guan
Chen Tang
Xinzhu Ma
Zhi Wang
Wenwu Zhu
MQ
115
8
0
15 Jun 2024
Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?
Zhaochen Su
Juntao Li
Jun Zhang
Tong Zhu
Xiaoye Qu
Pan Zhou
Yan Bowen
Yu Cheng
Min Zhang
LRM
136
25
0
13 Jun 2024
Scaling and evaluating sparse autoencoders
Leo Gao
Tom Dupré la Tour
Henk Tillman
Gabriel Goh
Rajan Troll
Alec Radford
Ilya Sutskever
Jan Leike
Jeffrey Wu
102
163
0
06 Jun 2024
A Comprehensive Evaluation on Event Reasoning of Large Language Models
Zhengwei Tao
Zhi Jin
Yifan Zhang
Xiancai Chen
Xiaoying Bai
Yue Fang
Haiyan Zhao
Jia Li
Chongyang Tao
LRM
73
4
0
26 Apr 2024
Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi
Zihao Xu
Hengyi Wang
Weiyi Qin
Wenyuan Wang
Yibin Wang
Zifeng Wang
Sayna Ebrahimi
Hao Wang
CLL
KELM
LRM
167
88
0
25 Apr 2024
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Mihir Parmar
Nisarg Patel
Neeraj Varshney
Mutsumi Nakamura
Man Luo
Santosh Mashetty
Arindam Mitra
Chitta Baral
LRM
ReLM
ELM
215
31
0
23 Apr 2024
EVIT: Event-Oriented Instruction Tuning for Event Reasoning
Zhengwei Tao
Xiancai Chen
Zhi Jin
Xiaoying Bai
Haiyan Zhao
Yiwei Lou
114
3
0
18 Apr 2024
AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA
Felix Giovanni Virgo
Fei Cheng
L. Pereira
Masayuki Asahara
Ichiro Kobayashi
Sadao Kurohashi
41
0
0
27 Mar 2024
Formulation Comparison for Timeline Construction using LLMs
Kimihiro Hasegawa
Nikhil Kandukuri
Susan Holm
Yukari Yamakawa
Teruko Mitamura
90
0
0
01 Mar 2024
When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models
Yinghui Li
Qingyu Zhou
Yuanzhen Luo
Shirong Ma
Yangning Li
Hai-Tao Zheng
Xuming Hu
Philip S. Yu
LRM
123
14
0
16 Feb 2024
Large Language Models Can Learn Temporal Reasoning
Siheng Xiong
Ali Payani
Ramana Rao Kompella
Faramarz Fekri
LRM
127
97
0
12 Jan 2024
Temporal Validity Change Prediction
Georg Wenzel
Adam Jatowt
94
0
0
01 Jan 2024
CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models
Dan Shi
Chaobin You
Jian-Tao Huang
Taihao Li
Deyi Xiong
LRM
60
1
0
20 Dec 2023
Catwalk: A Unified Language Model Evaluation Framework for Many Datasets
Dirk Groeneveld
Anas Awadalla
Iz Beltagy
Akshita Bhagia
Ian H. Magnusson
Hao Peng
Oyvind Tafjord
Pete Walsh
Kyle Richardson
Jesse Dodge
150
1
0
15 Dec 2023
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Collin Burns
Pavel Izmailov
Jan Hendrik Kirchner
Bowen Baker
Leo Gao
...
Adrien Ecoffet
Manas Joglekar
Jan Leike
Ilya Sutskever
Jeff Wu
ELM
143
299
0
14 Dec 2023
TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Haotian Wang
Ming Liu
Bing Qin
LRM
ELM
128
15
0
29 Nov 2023
Towards Robust Temporal Reasoning of Large Language Models via a Multi-Hop QA Dataset and Pseudo-Instruction Tuning
Qingyu Tan
Hwee Tou Ng
Lidong Bing
107
11
0
16 Nov 2023
Are Large Language Models Temporally Grounded?
Yifu Qiu
Zheng Zhao
Yftah Ziser
Anna Korhonen
Edoardo Ponti
Shay B. Cohen
LRM
85
11
0
14 Nov 2023
MTGER: Multi-view Temporal Graph Enhanced Temporal Reasoning over Time-Involved Document
Zheng Chu
Zekun Wang
Jiafeng Liang
Ming Liu
Bing Qin
77
2
0
08 Nov 2023
Mind the Gap Between Conversations for Improved Long-Term Dialogue Generation
Qiang Zhang
Jason Naradowsky
Yusuke Miyao
59
10
0
24 Oct 2023
CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks
Mete Ismayilzada
Debjit Paul
Syrielle Montariol
Mor Geva
Antoine Bosselut
LRM
96
5
0
23 Oct 2023
How Much Consistency Is Your Accuracy Worth?
Jacob K. Johnson
Ana Marasović
58
1
0
20 Oct 2023
Instructive Dialogue Summarization with Query Aggregations
Bin Wang
Zhengyuan Liu
Nancy F. Chen
97
3
0
17 Oct 2023
TRAM: Benchmarking Temporal Reasoning for Large Language Models
Yuqing Wang
Yun Zhao
LRM
111
14
0
02 Oct 2023
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Tao He
Haotian Wang
Weihua Peng
Ming-Yuan Liu
Bing Qin
Ting Liu
LRM
AI4CE
131
175
0
27 Sep 2023
Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?
Ayushi Agarwal
Nisarg Patel
Neeraj Varshney
Mihir Parmar
Pavan Mallina
Aryan Bhavin Shah
Srihari Sangaraju
Tirth Patel
Nihar Thakkar
Chitta Baral
ELM
68
4
0
08 Sep 2023
An Overview Of Temporal Commonsense Reasoning and Acquisition
Georg Wenzel
Adam Jatowt
ReLM
LRM
138
9
0
28 Jul 2023
SituatedGen: Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning
Yunxiang Zhang
Xiaojun Wan
AILaw
LRM
89
7
0
21 Jun 2023
FERMAT: An Alternative to Accuracy for Numerical Reasoning
Jasivan Sivakumar
N. Moosavi
ReLM
LRM
93
4
0
27 May 2023
Mitigating Temporal Misalignment by Discarding Outdated Facts
Michael J.Q. Zhang
Eunsol Choi
KELM
HILM
103
20
0
24 May 2023
Few-shot Unified Question Answering: Tuning Models or Prompts?
Srijan Bansal
Semih Yavuz
Bo Pang
Meghana Moorthy Bhat
Yingbo Zhou
108
2
0
23 May 2023