Mind the Gap: Assessing Temporal Generalization in Neural Language Models

3 February 2021
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
Tayfun Terzi
Mai Giménez
Cyprien de Masson d'Autume
Tomáš Kočiský
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
    VLM

Papers citing "Mind the Gap: Assessing Temporal Generalization in Neural Language Models"

42 / 42 papers shown
A Reasoning-Focused Legal Retrieval Benchmark
Lucia Zheng
Neel Guha
Javokhir Arifov
Sarah Zhang
Michal Skreta
Christopher D. Manning
Peter Henderson
Daniel E. Ho
AILaw
RALM
ELM
94
2
0
06 May 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
59
0
0
02 Apr 2025
Measuring temporal effects of agent knowledge by date-controlled tool use
R. Xian
Qiming Cui
Stefan Bauer
Reza Abbasi-Asl
KELM
54
0
0
06 Mar 2025
Reinforced Lifelong Editing for Language Models
Zherui Li
Houcheng Jiang
Hao Chen
Baolong Bi
Z. Zhou
Fei Sun
Junfeng Fang
X. Wang
KELM
51
5
0
09 Feb 2025
Evolution and The Knightian Blindspot of Machine Learning
Joel Lehman
Elliot Meyerson
Tarek El-Gaaly
Kenneth O. Stanley
Tarin Ziyaee
84
1
0
22 Jan 2025
Gradient Localization Improves Lifelong Pretraining of Language Models
Jared Fernandez
Yonatan Bisk
Emma Strubell
KELM
31
1
0
07 Nov 2024
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Sitao Cheng
Liangming Pan
Xunjian Yin
Xinyi Wang
William Yang Wang
KELM
37
4
0
10 Oct 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
75
5
0
11 Sep 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Orevaoghene Ahia
Shuyue Stella Li
Vidhisha Balachandran
Sunayana Sitaram
Yulia Tsvetkov
65
4
0
22 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
Tim Franzmeyer
Aleksandar Shtedritski
Samuel Albanie
Philip H. S. Torr
João F. Henriques
Jakob N. Foerster
27
1
0
05 Jun 2024
SAVA: Scalable Learning-Agnostic Data Valuation
Samuel Kessler
Tam Le
Vu Nguyen
TDI
51
0
0
03 Jun 2024
Stable Neural Stochastic Differential Equations in Analyzing Irregular Time Series Data
YongKyung Oh
Dongyoung Lim
Sungil Kim
AI4TS
35
11
0
22 Feb 2024
Temporal Blind Spots in Large Language Models
Jonas Wallat
Adam Jatowt
Avishek Anand
36
3
0
22 Jan 2024
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study
Maike Zufle
Verna Dankers
Ivan Titov
25
0
0
16 Nov 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Roi Cohen
Eden Biran
Ori Yoran
Amir Globerson
Mor Geva
KELM
33
155
0
24 Jul 2023
Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features
Ester Hlavnova
Sebastian Ruder
30
5
0
11 Jul 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future
Linyi Yang
Y. Song
Xuan Ren
Chenyang Lyu
Yidong Wang
Lingqiao Liu
Jindong Wang
Jennifer Foster
Yue Zhang
OOD
20
2
0
23 May 2023
Revisiting Entropy Rate Constancy in Text
Vivek Verma
Nicholas Tomlin
Dan Klein
14
4
0
20 May 2023
Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings
Taichi Aida
Danushka Bollegala
18
8
0
15 May 2023
SwissBERT: The Multilingual Language Model for Switzerland
Jannis Vamvas
Johannes Graen
Rico Sennrich
25
6
0
23 Mar 2023
Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset
Thanh-Dung Le
P. Jouvet
R. Noumeir
MoE
MedIm
67
5
0
22 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
17
41
0
10 Mar 2023
Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy
Blake E. Woodworth
Konstantin Mishchenko
Francis R. Bach
26
6
0
07 Feb 2023
TempEL: Linking Dynamically Evolving and Newly Emerging Entities
Klim Zaporojets
Lucie-Aimée Kaffee
Johannes Deleu
Thomas Demeester
Chris Develder
Isabelle Augenstein
KELM
21
15
0
05 Feb 2023
Addressing Distribution Shift at Test Time in Pre-trained Language Models
Ayush Singh
J. Ortega
VLM
6
4
0
05 Dec 2022
Time-Aware Datasets are Adaptive Knowledgebases for the New Normal
Abhijit Suprem
Sanjyot Vaidya
J. Ferreira
C. Pu
24
2
0
22 Nov 2022
Large Language Models with Controllable Working Memory
Daliang Li
A. S. Rawat
Manzil Zaheer
Xin Wang
Michal Lukasik
Andreas Veit
Felix X. Yu
Surinder Kumar
KELM
34
151
0
09 Nov 2022
Time-aware Prompting for Text Generation
Shuyang Cao
Lu Wang
16
11
0
03 Nov 2022
Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change
Zhao-yu Su
Zecheng Tang
Xinyan Guan
Juntao Li
Lijun Wu
M. Zhang
CLL
AI4CE
18
22
0
31 Oct 2022
Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts
Asahi Ushio
Leonardo Neves
Vítor Silva
Francesco Barbieri
Jose Camacho-Collados
23
26
0
07 Oct 2022
Env-Aware Anomaly Detection: Ignore Style Changes, Stay True to Content!
Stefan Smeu
Elena Burceanu
Andrei Liviu Nicolicioiu
Emanuela Haller
21
4
0
06 Oct 2022
Memory-Based Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Christopher D. Manning
Chelsea Finn
KELM
16
318
0
13 Jun 2022
Building for Tomorrow: Assessing the Temporal Persistence of Text Classifiers
Rabab Alkhalifa
E. Kochkina
A. Zubiaga
19
25
0
11 May 2022
Entity Cloze By Date: What LMs Know About Unseen Entities
Yasumasa Onoe
Michael J.Q. Zhang
Eunsol Choi
Greg Durrett
KELM
19
49
0
05 May 2022
Temporal Attention for Language Models
Guy D. Rosin
Kira Radinsky
VLM
22
33
0
04 Feb 2022
Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs
Peter Hase
Mona T. Diab
Asli Celikyilmaz
Xian Li
Zornitsa Kozareva
Veselin Stoyanov
Mohit Bansal
Srini Iyer
KELM
LRM
17
79
0
26 Nov 2021
Temporal Effects on Pre-trained Models for Language Processing Tasks
Oshin Agarwal
A. Nenkova
VLM
14
52
0
24 Nov 2021
Time-Aware Language Models as Temporal Knowledge Bases
Bhuwan Dhingra
Jeremy R. Cole
Julian Martin Eisenschlos
D. Gillick
Jacob Eisenstein
William W. Cohen
KELM
25
264
0
29 Jun 2021
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
239
643
0
21 Apr 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
248
1,986
0
31 Dec 2020
DynaSent: A Dynamic Benchmark for Sentiment Analysis
Christopher Potts
Zhengxuan Wu
Atticus Geiger
Douwe Kiela
230
77
0
30 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,453
0
23 Jan 2020