2407.04620
Cited By
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
5 July 2024
Yu Sun
Xinhao Li
Karan Dalal
Jiarui Xu
Arjun Vikram
Genghan Zhang
Yann Dubois
Xinlei Chen
Xiaolong Wang
Sanmi Koyejo
Tatsunori Hashimoto
Carlos Guestrin
Papers citing "Learning to (Learn at Test Time): RNNs with Expressive Hidden States"
50 / 66 papers shown
Title
Putting It All into Context: Simplifying Agents with LCLMs
Mingjian Jiang
Yangjun Ruan
Luis A. Lastras
Pavan Kapanipathi
Tatsunori Hashimoto
LLMAG
16
0
0
12 May 2025
Overflow Prevention Enhances Long-Context Recurrent LLMs
Assaf Ben-Kish
Itamar Zimerman
M. Jehanzeb Mirza
James R. Glass
Leonid Karlinsky
Raja Giryes
LRM
12
0
0
12 May 2025
Focus on the Likely: Test-time Instance-based Uncertainty Removal
Johannes Schneider
20
0
0
02 May 2025
TTTFusion: A Test-Time Training-Based Strategy for Multimodal Medical Image Fusion in Surgical Robots
Qinhua Xie
Hao Tang
43
0
0
29 Apr 2025
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo
Kaiyan Zhang
Shang Qu
Li Sheng
Xuekai Zhu
Biqing Qi
Youbang Sun
Ganqu Cui
Ning Ding
Bowen Zhou
OffRL
41
1
0
22 Apr 2025
Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction
Dubing Chen
Huan Zheng
Jin Fang
Xingping Dong
Xianfei Li
Wenlong Liao
Tao He
Pai Peng
Jianbing Shen
27
0
0
17 Apr 2025
It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Ali Behrouz
Meisam Razaviyayn
Peilin Zhong
Vahab Mirrokni
36
0
0
17 Apr 2025
Open Problems and a Hypothetical Path Forward in LLM Knowledge Paradigms
Xiaotian Ye
M. Zhang
Shu Wu
KELM
ELM
39
0
0
09 Apr 2025
State Tuning: State-based Test-Time Scaling on RWKV-7
Liu Xiao
Li Zhiyuan
Lin Yueyu
28
0
0
07 Apr 2025
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
91
3
0
07 Apr 2025
Learning from Streaming Video with Orthogonal Gradients
Tengda Han
Dilara Gokay
Joseph Heyward
Chuhan Zhang
Daniel Zoran
Viorica Patraucean
João Carreira
Dima Damen
Andrew Zisserman
40
0
0
02 Apr 2025
AU-TTT: Vision Test-Time Training model for Facial Action Unit Detection
Bohao Xing
Kaishen Yuan
Zitong Yu
X. Liu
H. Kalviainen
ViT
37
0
0
30 Mar 2025
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
X. Wang
Linrui Ma
Jerry Huang
Peng Lu
Prasanna Parthasarathi
Xiao-Wen Chang
Boxing Chen
Yufei Cui
KELM
39
1
0
28 Mar 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM
OffRL
LRM
80
12
0
27 Mar 2025
Scaled Supervision is an Implicit Lipschitz Regularizer
Z. Ouyang
Chunhui Zhang
Yaning Jia
Soroush Vosoughi
BDL
OffRL
72
0
0
19 Mar 2025
MambaIC: State Space Models for High-Performance Learned Image Compression
Fanhu Zeng
Hao Tang
Yihua Shao
Siyu Chen
Ling Shao
Yan Wang
Mamba
66
0
0
16 Mar 2025
Test-Time Training Provably Improves Transformers as In-context Learners
Halil Alperen Gozeten
M. E. Ildiz
Xuechen Zhang
Mahdi Soltanolkotabi
Marco Mondelli
Samet Oymak
44
1
0
14 Mar 2025
Centaur: Robust End-to-End Autonomous Driving with Test-Time Training
Chonghao Sima
Kashyap Chitta
Zhiding Yu
Shiyi Lan
Ping Luo
Andreas Geiger
H. Li
Jose M. Alvarez
56
1
0
14 Mar 2025
Transformers without Normalization
Jiachen Zhu
Xinlei Chen
Kaiming He
Yann LeCun
Zhuang Liu
ViT
OffRL
48
7
0
13 Mar 2025
BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling
Li Weile
Liu Xiao
53
1
0
08 Mar 2025
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun
Disen Lan
Tong Zhu
Xiaoye Qu
Yu-Xi Cheng
MoE
61
1
0
07 Mar 2025
L²M: Mutual Information Scaling Law for Long-Context Language Modeling
Zhuo Chen
Oriol Mayné i Comas
Zhuotao Jin
Di Luo
Marin Soljacic
62
0
0
06 Mar 2025
Delta-WKV: A Novel Meta-in-Context Learner for MRI Super-Resolution
Rongchang Lu
Bingcheng Liao
Haowen Hou
Jiahang Lv
Xin Hai
42
0
0
28 Feb 2025
Sliding Window Attention Training for Efficient Large Language Models
Zichuan Fu
Wentao Song
Y. Wang
X. Wu
Yefeng Zheng
Yingying Zhang
Derong Xu
Xuetao Wei
Tong Bill Xu
Xiangyu Zhao
76
1
0
26 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu-Xi Cheng
KELM
75
3
0
19 Feb 2025
Uni-Retrieval: A Multi-Style Retrieval Framework for STEM's Education
Yanhao Jia
Xinyi Wu
Hao Li
Qinglin Zhang
Yuxiao Hu
Shuai Zhao
Wenqi Fan
38
2
0
09 Feb 2025
Adaptive Self-improvement LLM Agentic System for ML Library Development
Genghan Zhang
Weixin Liang
Olivia Hsu
K. Olukotun
68
0
0
04 Feb 2025
Explaining Context Length Scaling and Bounds for Language Models
Jingzhe Shi
Qinwei Ma
Hongyi Liu
Hang Zhao
Jenq-Neng Hwang
Serge Belongie
Lei Li
LRM
70
2
0
03 Feb 2025
Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation
Yang Cao
Zhao-quan Song
Chiwun Yang
VGen
44
2
0
01 Feb 2025
LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning
Yansheng Mao
Jiaqi Li
Fanxu Meng
Jing Xiong
Zilong Zheng
Muhan Zhang
LLMAG
RALM
90
1
0
18 Dec 2024
MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance
Wenjun Huang
Jianguo Hu
79
0
0
14 Dec 2024
Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan
Zhuang Wang
Zhen Jia
Can Karakus
Luca Zancato
Tri Dao
Ravi Netravali
Yida Wang
87
4
0
28 Nov 2024
MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Haoyang He
J. Zhang
Yuxuan Cai
Hongxu Chen
Xiaobin Hu
Zhenye Gan
Y. Wang
Chengjie Wang
Yunsheng Wu
Lei Xie
Mamba
77
3
0
24 Nov 2024
Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
Ali Behrouz
Ali Parviz
Mahdi Karami
Clayton Sanford
Bryan Perozzi
Vahab Mirrokni
79
2
0
23 Nov 2024
DATTA: Domain-Adversarial Test-Time Adaptation for Cross-Domain WiFi-Based Human Activity Recognition
Julian Strohmayer
Rafael Sterzinger
Matthias Wödlinger
Martin Kampel
TTA
87
0
0
20 Nov 2024
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi
Julien N. Siems
Jörg K.H. Franke
Arber Zela
Frank Hutter
Massimiliano Pontil
84
10
0
19 Nov 2024
StreamAdapter: Efficient Test Time Adaptation from Contextual Streams
Dilxat Muhtar
Yelong Shen
Y. Yang
Xiaodong Liu
Yadong Lu
...
Feng Sun
Xueliang Zhang
Jianfeng Gao
Weizhu Chen
Qi Zhang
TTA
62
0
0
14 Nov 2024
Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs
Shan Zhong
Jiahao Zeng
Yongxin Yu
Bohong Lin
34
1
0
09 Nov 2024
Wave Network: An Ultra-Small Language Model
Xin Zhang
Victor S. Sheng
39
1
0
04 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
72
2
0
01 Nov 2024
A prescriptive theory for brain-like inference
Hadi Vafaii
Dekel Galor
Jacob L. Yates
DRL
30
0
0
25 Oct 2024
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Boxing Chen
Sarath Chandar
48
0
0
22 Oct 2024
scFusionTTT: Single-cell transcriptomics and proteomics fusion with Test-Time Training layers
Dian Meng
Bohao Xing
Xinlei Huang
Yanran Liu
Yijun Zhou
Yongjun Xiao
Zitong Yu
Xubin Zheng
26
1
0
17 Oct 2024
How much do contextualized representations encode long-range context?
Simeng Sun
Cheng-Ping Hsieh
39
0
0
16 Oct 2024
State-space models can learn in-context by gradient descent
Neeraj Mohan Sushma
Yudou Tian
Harshvardhan Mestha
Nicolo Colombo
David Kappel
Anand Subramoney
35
3
0
15 Oct 2024
MatMamba: A Matryoshka State Space Model
Abhinav Shukla
Sai H. Vemprala
Aditya Kusupati
Ashish Kapoor
Mamba
28
0
0
09 Oct 2024
On Efficient Variants of Segment Anything Model: A Survey
Xiaorui Sun
J. Liu
H. Shen
Xiaofeng Zhu
Ping Hu
VLM
43
4
0
07 Oct 2024
Med-TTT: Vision Test-Time Training model for Medical Image Segmentation
Jiashu Xu
ViT
LM&MA
MedIm
21
0
0
03 Oct 2024
The Role of Deductive and Inductive Reasoning in Large Language Models
Chengkun Cai
Xu Zhao
Haoliang Liu
Zhongyu Jiang
Tianfang Zhang
Zongkai Wu
Jenq-Neng Hwang
Serge Belongie
Lei Li
LRM
37
2
0
03 Oct 2024
A Survey for Deep Reinforcement Learning Based Network Intrusion Detection
Wanrong Yang
Alberto Acuto
Yihang Zhou
Dominik Wojtczak
OffRL
31
2
0
25 Sep 2024