RWKV: Reinventing RNNs for the Transformer Era

22 May 2023
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
Stella Biderman
Huanqi Cao
Xin Cheng
Michael Chung
Matteo Grella
G. Kranthikiran
Xuming He
Haowen Hou
Jiaju Lin
Przemysław Kazienko
Jan Kocoń
Jiaming Kong
Bartłomiej Koptyra
Hayden Lau
Krishna Sri Ipsit Mantri
Ferdinand Mom
Atsushi Saito
Guangyu Song
Xiangru Tang
Bolun Wang
J. S. Wind
Stanisław Woźniak
Ruichong Zhang
Zhenyuan Zhang
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu

Papers citing "RWKV: Reinventing RNNs for the Transformer Era"

50 / 388 papers shown
RWKV-CLIP: A Robust Vision-Language Representation Learner
Tiancheng Gu
Kaicheng Yang
Xiang An
Ziyong Feng
Dongnan Liu
Weidong Cai
Jiankang Deng
VLM
CLIP
32
13
0
11 Jun 2024
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM
Chensen Huang
Guibo Zhu
Xuepeng Wang
Yifei Luo
Guojing Ge
Haoran Chen
Dong Yi
Jinqiao Wang
40
1
0
10 Jun 2024
What Can We Learn from State Space Models for Machine Learning on Graphs?
Yinan Huang
Siqi Miao
Pan Li
39
7
0
09 Jun 2024
Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications
Zhou Zhou
Guohang He
Zheng Zhang
Luziwei Leng
Qinghai Guo
Jianxing Liao
Xuan Song
Ran Cheng
34
2
0
08 Jun 2024
C-Mamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting
Chaolv Zeng
Zhanyu Liu
Guanjie Zheng
Linghe Kong
Mamba
31
2
0
08 Jun 2024
Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis
Théodor Lemerle
Nicolas Obin
Axel Roebel
21
6
0
06 Jun 2024
Learning 1D Causal Visual Representation with De-focus Attention Networks
Chenxin Tao
Xizhou Zhu
Shiqian Su
Lewei Lu
Changyao Tian
...
Gao Huang
Hongsheng Li
Yu Qiao
Jie Zhou
Jifeng Dai
52
1
0
06 Jun 2024
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation
Jiaming Liu
Mengzhen Liu
Zhenyu Wang
Lily Lee
Kaichen Zhou
Pengju An
Senqiao Yang
Renrui Zhang
Yandong Guo
Shanghang Zhang
LM&Ro
LRM
Mamba
27
5
0
06 Jun 2024
U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation
Chenxin Li
Xinyu Liu
W. J. Li
Cheng Wang
Hengyu Liu
Yifan Liu
Zhen Chen
Yixuan Yuan
MedIm
DiffM
SSeg
46
72
0
05 Jun 2024
Scalable MatMul-free Language Modeling
Rui-Jie Zhu
Yu Zhang
Ethan Sifferman
Tyler Sheaves
Yiqiao Wang
Dustin Richmond
P. Zhou
Jason Eshraghian
21
15
0
04 Jun 2024
Iteration Head: A Mechanistic Study of Chain-of-Thought
Vivien A. Cabannes
Charles Arnal
Wassim Bouaziz
Alice Yang
Francois Charton
Julia Kempe
LRM
14
7
0
04 Jun 2024
Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning
Jiahang Cao
Qiang Zhang
Ziqing Wang
Jiaxu Wang
Hao Cheng
Yecheng Shao
Wen Zhao
Gang Han
Yijie Guo
Renjing Xu
Mamba
37
2
0
04 Jun 2024
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM
Quandong Wang
Yuxuan Yuan
Xiaoyu Yang
Ruike Zhang
Kang Zhao
Wei Liu
Jian Luan
Daniel Povey
Bin Wang
20
0
0
03 Jun 2024
Pretrained Hybrids with MAD Skills
Nicholas Roberts
Samuel Guo
Zhiqi Gao
Satya Sai Srinath Namburi
Sonia Cromp
Chengjun Wu
Chengyu Duan
Frederic Sala
Mamba
35
0
0
02 Jun 2024
You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet
Zhen Qin
Yuxin Mao
Xuyang Shen
Dong Li
Jing Zhang
Yuchao Dai
Yiran Zhong
44
1
0
31 May 2024
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
Sijin Chen
Xin Chen
Anqi Pang
Xianfang Zeng
Wei Cheng
...
C. Zhang
Jingyi Yu
Gang Yu
Bin-Bin Fu
Tao Chen
AI4CE
47
35
0
31 May 2024
Language Models Need Inductive Biases to Count Inductively
Yingshan Chang
Yonatan Bisk
LRM
32
5
0
30 May 2024
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
Hengkai Tan
Songming Liu
Kai Ma
Chengyang Ying
Xingxing Zhang
Hang Su
Jun Zhu
23
2
0
30 May 2024
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
Lianghui Zhu
Zilong Huang
Bencheng Liao
Jun Hao Liew
Hanshu Yan
Jiashi Feng
Xinggang Wang
60
12
0
28 May 2024
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao
Xinggang Wang
Lianghui Zhu
Qian Zhang
Chang Huang
45
3
0
28 May 2024
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
28
9
0
27 May 2024
Zamba: A Compact 7B SSM Hybrid Model
Paolo Glorioso
Quentin G. Anthony
Yury Tokpanov
James Whittington
Jonathan Pilault
Adam Ibrahim
Beren Millidge
17
7
0
26 May 2024
MambaTS: Improved Selective State Space Models for Long-term Time Series Forecasting
Xiuding Cai
Yaoyao Zhu
Xueyao Wang
Yu Yao
Mamba
AI4TS
16
7
0
26 May 2024
Building Vision Models upon Heat Conduction
Zhaozhi Wang
Yue Liu
Yunfan Liu
Hongtian Yu
Yaowei Wang
QiXiang Ye
ViT
VLM
44
0
0
26 May 2024
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber
Carmen Amo Alonso
A. Didier
M. Zeilinger
Antonio Orvieto
AAML
39
7
0
24 May 2024
PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning
Qingdong He
Jiangning Zhang
Jinlong Peng
Haoyang He
Yabiao Wang
Chengjie Wang
3DPC
35
12
0
24 May 2024
AstroPT: Scaling Large Observation Models for Astronomy
Michael J. Smith
Ryan J. Roberts
E. Angeloudi
M. Huertas-Company
30
1
0
23 May 2024
Mamba-R: Vision Mamba ALSO Needs Registers
Feng Wang
Jiahao Wang
Sucheng Ren
Guoyizhe Wei
Jieru Mei
Wei Shao
Yuyin Zhou
Alan L. Yuille
Cihang Xie
Mamba
20
19
0
23 May 2024
Lessons from the Trenches on Reproducible Evaluation of Language Models
Stella Biderman
Hailey Schoelkopf
Lintang Sutawika
Leo Gao
J. Tow
...
Xiangru Tang
Kevin A. Wang
Genta Indra Winata
François Yvon
Andy Zou
ELM
ALM
115
16
3
23 May 2024
Base of RoPE Bounds Context Length
Xin Men
Mingyu Xu
Bingning Wang
Qingyu Zhang
Hongyu Lin
Xianpei Han
Weipeng Chen
26
18
0
23 May 2024
Attention as an RNN
Leo Feng
Frederick Tung
Hossein Hajimirsadeghi
Mohamed Osama Ahmed
Yoshua Bengio
Greg Mori
GNN
AI4TS
25
0
0
22 May 2024
Audio Mamba: Pretrained Audio State Space Model For Audio Tagging
Jiaju Lin
Haoxuan Hu
Mamba
31
7
0
22 May 2024
Score-CDM: Score-Weighted Convolutional Diffusion Model for Multivariate Time Series Imputation
Shunyang Zhang
Senzhang Wang
Hao Miao
Hao Chen
Changjun Fan
Jian Zhang
22
2
0
21 May 2024
Efficient Multimodal Large Language Models: A Survey
Yizhang Jin
Jian Li
Yexin Liu
Tianjun Gu
Kai Wu
...
Xin Tan
Zhenye Gan
Yabiao Wang
Chengjie Wang
Lizhuang Ma
LRM
39
44
0
17 May 2024
Improving Transformers with Dynamically Composable Multi-Head Attention
Da Xiao
Qingye Meng
Shengping Li
Xingyuan Yuan
18
0
0
14 May 2024
Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study
Qinfeng Zhu
Yuan-Sheng Fang
Yuanzhi Cai
Cheng Chen
Lei Fan
Mamba
17
16
0
14 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
31
46
0
13 May 2024
Linearizing Large Language Models
Jean-Pierre Mercat
Igor Vasiljevic
Sedrick Scott Keh
Kushal Arora
Achal Dave
Adrien Gaidon
Thomas Kollar
24
2
0
10 May 2024
Memory Mosaics
Jianyu Zhang
Niklas Nolte
Ranajoy Sadhukhan
Beidi Chen
Léon Bottou
VLM
30
3
0
10 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
26
0
0
09 May 2024
Weight Sparsity Complements Activity Sparsity in Neuromorphic Language Models
Rishav Mukherji
Mark Schöne
Khaleelulla Khan Nazeer
Christian Mayr
David Kappel
Anand Subramoney
27
0
0
01 May 2024
Revenge of the Fallen? Recurrent Models Match Transformers at Predicting Human Language Comprehension Metrics
J. Michaelov
Catherine Arnett
Benjamin Bergen
16
3
0
30 Apr 2024
Make Your LLM Fully Utilize the Context
Shengnan An
Zexiong Ma
Zeqi Lin
Nanning Zheng
Jian-Guang Lou
SyDa
44
52
0
25 Apr 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
19
37
0
24 Apr 2024
A Survey on Visual Mamba
Hanwei Zhang
Ying Zhu
Dan Wang
Lijun Zhang
Tianxiang Chen
Zi Ye
Mamba
26
52
0
24 Apr 2024
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou
Xuefei Ning
Ke Hong
Tianyu Fu
Jiaming Xu
...
Shengen Yan
Guohao Dai
Xiao-Ping Zhang
Yuhan Dong
Yu-Xiang Wang
46
78
0
22 Apr 2024
Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification
Pierre Lepagnol
Thomas Gerald
Sahar Ghannay
Christophe Servan
Sophie Rosset
32
2
0
17 Apr 2024
State Space Model for New-Generation Network Alternative to Transformers: A Survey
Xiao Wang
Shiao Wang
Yuhe Ding
Yuehang Li
Wentao Wu
...
Bowei Jiang
Chenglong Li
Yaowei Wang
Yonghong Tian
Jin Tang
Mamba
28
48
0
15 Apr 2024
A Survey on Multimodal Wearable Sensor-based Human Action Recognition
Jianyuan Ni
Hao Tang
Syed Tousiful Haque
Yan Yan
A. Ngu
61
5
0
14 Apr 2024
TransformerFAM: Feedback attention is working memory
Dongseong Hwang
Weiran Wang
Zhuoyuan Huo
K. Sim
P. M. Mengibar
19
7
0
14 Apr 2024