RWKV: Reinventing RNNs for the Transformer Era


22 May 2023
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
Stella Biderman
Huanqi Cao
Xin Cheng
Michael Chung
Matteo Grella
G. Kranthikiran
Xuming He
Haowen Hou
Jiaju Lin
Przemyslaw Kazienko
Jan Kocoń
Jiaming Kong
Bartlomiej Koptyra
Hayden Lau
Krishna Sri Ipsit Mantri
Ferdinand Mom
Atsushi Saito
Guangyu Song
Xiangru Tang
Bolun Wang
J. S. Wind
Stanislaw Wozniak
Ruichong Zhang
Zhenyuan Zhang
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu

Papers citing "RWKV: Reinventing RNNs for the Transformer Era"

50 / 388 papers shown
Title
Multi-View Learning with Context-Guided Receptance for Image Denoising
Binghong Chen
Tingting Chai
Wei Jiang
Yuanrong Xu
Guanglu Zhou
Xiangqian Wu
32
0
0
05 May 2025
RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization
Chen Xu
Yuxuan Yue
Zukang Xu
Xing Hu
Jiangyong Yu
Zhixuan Chen
Sifan Zhou
Zhihang Yuan
Dawei Yang
MQ
17
0
0
02 May 2025
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
Zhenyu (Allen) Zhang
Zechun Liu
Yuandong Tian
Harshit Khaitan
Z. Wang
Steven Li
54
0
0
28 Apr 2025
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
Xu Ma
Peize Sun
Haoyu Ma
Hao Tang
Chih-Yao Ma
...
Matt Feiszli
Peizhao Zhang
Peter Vajda
Sam S. Tsai
Y. Fu
65
1
0
24 Apr 2025
Random Long-Context Access for Mamba via Hardware-aligned Hierarchical Sparse Attention
Xiang Hu
Jiaqi Leng
Jun Zhao
Kewei Tu
Wei Wu
Mamba
45
0
0
23 Apr 2025
Empirical Evaluation of Knowledge Distillation from Transformers to Subquadratic Language Models
Patrick Haller
Jonas Golde
Alan Akbik
19
0
0
19 Apr 2025
Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction
Dubing Chen
Huan Zheng
Jin Fang
Xingping Dong
Xianfei Li
Wenlong Liao
Tao He
Pai Peng
Jianbing Shen
22
0
0
17 Apr 2025
It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Ali Behrouz
Meisam Razaviyayn
Peilin Zhong
Vahab Mirrokni
31
0
0
17 Apr 2025
ACMamba: Fast Unsupervised Anomaly Detection via An Asymmetrical Consensus State Space Model
Guanchun Wang
Xiangrong Zhang
Yifei Zhang
Zelin Peng
Tianyang Zhang
Xu Tang
Licheng Jiao
Mamba
34
0
0
16 Apr 2025
Explicit and Implicit Representations in AI-based 3D Reconstruction for Radiology: A systematic literature review
Yuezhe Yang
Boyu Yang
Yaqian Wang
Yang He
Xingbo Dong
Zhe Jin
33
0
0
15 Apr 2025
RGB-Event based Pedestrian Attribute Recognition: A Benchmark Dataset and An Asymmetric RWKV Fusion Framework
X. Wang
Haiyang Wang
Shiao Wang
Qiang Chen
Jiandong Jin
Haoyu Song
Bo Jiang
Chenglong Li
26
0
0
14 Apr 2025
Bidirectional Linear Recurrent Models for Sequence-Level Multisource Fusion
Qisai Liu
Zhanhong Jiang
Joshua R. Waite
Chao Liu
Aditya Balu
S. Sarkar
AI4TS
24
0
0
11 Apr 2025
SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling
Krishna C. Puvvada
Faisal Ladhak
Santiago Akle Serrano
Cheng-Ping Hsieh
Shantanu Acharya
...
Fei Jia
Samuel Kriman
Simeng Sun
Dima Rekesh
Boris Ginsburg
RALM
52
0
0
11 Apr 2025
Open Problems and a Hypothetical Path Forward in LLM Knowledge Paradigms
Xiaotian Ye
M. Zhang
Shu Wu
KELM
ELM
36
0
0
09 Apr 2025
Compound and Parallel Modes of Tropical Convolutional Neural Networks
Mingbo Li
Liying Liu
Ye Luo
30
0
0
09 Apr 2025
Gating is Weighting: Understanding Gated Linear Attention through In-context Learning
Yingcong Li
Davoud Ataee Tarzanagh
A. S. Rawat
Maryam Fazel
Samet Oymak
23
0
0
06 Apr 2025
Reasoning on Multiple Needles In A Haystack
Yidong Wang
LRM
26
0
0
05 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra
Mohammad Mahdi Khalili
AI4CE
23
0
0
05 Apr 2025
Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation
Xingguang Zhang
Nicholas Chimitt
Xijun Wang
Yu Yuan
Stanley H. Chan
34
0
0
03 Apr 2025
Cognitive Memory in Large Language Models
Lianlei Shan
Shixian Luo
Zezhou Zhu
Yu Yuan
Yong Wu
LLMAG
KELM
60
1
0
03 Apr 2025
ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models
Guoyizhe Wei
Rama Chellappa
31
0
0
30 Mar 2025
Efficient Inference for Large Reasoning Models: A Survey
Y. Liu
Jiaying Wu
Yufei He
Hongcheng Gao
Hongyu Chen
Baolong Bi
Jiaheng Zhang
Zhiqi Huang
Bryan Hooi
LLMAG
LRM
58
7
0
29 Mar 2025
EdgeInfinite: A Memory-Efficient Infinite-Context Transformer for Edge Devices
Jiyu Chen
Shuang Peng
Daxiong Luo
Fan Yang
Renshou Wu
Fangyuan Li
Xiaoxin Chen
39
0
0
28 Mar 2025
DREMnet: An Interpretable Denoising Framework for Semi-Airborne Transient Electromagnetic Signal
Shuang Wang
Ming Guo
X. Wang
Fei Deng
Lifeng Mao
Bin Wang
Wenlong Gao
44
0
0
28 Mar 2025
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
X. Wang
Linrui Ma
Jerry Huang
Peng Lu
Prasanna Parthasarathi
Xiao-Wen Chang
Boxing Chen
Yufei Cui
KELM
39
1
0
28 Mar 2025
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap
Tong Nie
Jian-jun Sun
Wei Ma
58
1
0
27 Mar 2025
LOCORE: Image Re-ranking with Long-Context Sequence Modeling
Zilin Xiao
Pavel Suma
Ayush Sachdeva
Hao-Jen Wang
Giorgos Kordopatis-Zilos
Giorgos Tolias
Vicente Ordonez
49
0
0
27 Mar 2025
RSRWKV: A Linear-Complexity 2D Attention Mechanism for Efficient Remote Sensing Vision Task
Chunshan Li
Rong Wang
Xiaofei Yang
Dianhui Chu
72
0
0
26 Mar 2025
GIViC: Generative Implicit Video Compression
Ge Gao
Siyue Teng
Tianhao Peng
Fan Zhang
David Bull
DiffM
VGen
33
0
0
25 Mar 2025
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi
Md Monzurul Islam
Mahmuda Sultana Mimi
Sazzad Bin Bashar Polock
Gaurab Chhetri
Subasish Das
Mamba
AI4TS
37
0
0
22 Mar 2025
SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs
Shibo Jie
Yehui Tang
Kai Han
Zhi-Hong Deng
Jing Han
87
0
0
20 Mar 2025
iFlame: Interleaving Full and Linear Attention for Efficient Mesh Generation
Hanxiao Wang
Biao Zhang
Weize Quan
Dong-ming Yan
Peter Wonka
46
0
0
20 Mar 2025
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference
M. Beck
Korbinian Poppel
Phillip Lippe
Richard Kurle
P. Blies
G. Klambauer
Sebastian Böck
Sepp Hochreiter
LRM
40
0
0
17 Mar 2025
MambaIC: State Space Models for High-Performance Learned Image Compression
Fanhu Zeng
Hao Tang
Yihua Shao
Siyu Chen
Ling Shao
Yan Wang
Mamba
55
0
0
16 Mar 2025
Key, Value, Compress: A Systematic Exploration of KV Cache Compression Techniques
Neusha Javidnia
B. Rouhani
F. Koushanfar
47
0
0
14 Mar 2025
Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer
Yury Belousov
S. Voloshynovskiy
AAML
37
0
0
13 Mar 2025
Attention Reveals More Than Tokens: Training-Free Long-Context Reasoning with Attention-guided Retrieval
Yuwei Zhang
Jayanth Srinivasa
Gaowen Liu
Jingbo Shang
LRM
LLMAG
RALM
78
1
0
12 Mar 2025
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models
Xun Liang
Hanyu Wang
Huayi Lai
Simin Niu
Shichao Song
Jiawei Yang
Jihao Zhao
Feiyu Xiong
Bo Tang
Z. Li
VLM
35
0
0
10 Mar 2025
Can Small Language Models Reliably Resist Jailbreak Attacks? A Comprehensive Evaluation
Wenhui Zhang
Huiyu Xu
Zhibo Wang
Zeqing He
Ziqi Zhu
Kui Ren
AAML
PILM
67
0
0
09 Mar 2025
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun
Disen Lan
Tong Zhu
Xiaoye Qu
Yu-Xi Cheng
MoE
55
1
0
07 Mar 2025
L²M: Mutual Information Scaling Law for Long-Context Language Modeling
Zhuo Chen
Oriol Mayné i Comas
Zhuotao Jin
Di Luo
Marin Soljacic
57
0
0
06 Mar 2025
Liger: Linearizing Large Language Models to Gated Recurrent Structures
Disen Lan
Weigao Sun
Jiaxi Hu
Jusen Du
Yu-Xi Cheng
64
0
0
03 Mar 2025
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai
Jianqiao Lu
Yao Luo
Yiyuan Ma
Xun Zhou
63
5
0
28 Feb 2025
ProDapt: Proprioceptive Adaptation using Long-term Memory Diffusion
Federico Pizarro Bejarano
Bryson Jones
Daniel Pastor Moreno
J. Bowkett
Paul Backes
Angela P. Schoellig
31
0
0
28 Feb 2025
Delta-WKV: A Novel Meta-in-Context Learner for MRI Super-Resolution
Rongchang Lu
Bingcheng Liao
Haowen Hou
Jiahang Lv
Xin Hai
37
0
0
28 Feb 2025
Sliding Window Attention Training for Efficient Large Language Models
Zichuan Fu
Wentao Song
Y. Wang
X. Wu
Yefeng Zheng
Yingying Zhang
Derong Xu
Xuetao Wei
Tong Bill Xu
Xiangyu Zhao
71
1
0
26 Feb 2025
TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba
Xiuwei Chen
Sihao Lin
Xiao Dong
Z. Chen
Meng Cao
J. Han
Hang Xu
Xiaodan Liang
Mamba
54
0
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
44
36
0
24 Feb 2025
Enhancing RWKV-based Language Models for Long-Sequence Text Generation
Xinghan Pan
44
0
0
21 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
46
2
0
21 Feb 2025