RWKV: Reinventing RNNs for the Transformer Era

22 May 2023
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
Stella Biderman
Huanqi Cao
Xin Cheng
Michael Chung
Matteo Grella
G. Kranthikiran
Xuming He
Haowen Hou
Jiaju Lin
Przemyslaw Kazienko
Jan Kocoń
Jiaming Kong
Bartlomiej Koptyra
Hayden Lau
Krishna Sri Ipsit Mantri
Ferdinand Mom
Atsushi Saito
Guangyu Song
Xiangru Tang
Bolun Wang
J. S. Wind
Stanislaw Wozniak
Ruichong Zhang
Zhenyuan Zhang
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu

Papers citing "RWKV: Reinventing RNNs for the Transformer Era"

50 / 388 papers shown
Is Mamba Capable of In-Context Learning?
Riccardo Grazzi
Julien N. Siems
Simon Schrodi
Thomas Brox
Frank Hutter
14
20
0
05 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
21
26
0
05 Feb 2024
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
Xindi Wang
Mahsa Salmani
Parsa Omidi
Xiangyu Ren
Mehdi Rezagholizadeh
A. Eshaghi
LRM
23
35
0
03 Feb 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
92
77
0
01 Feb 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
32
1
0
01 Feb 2024
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces
Chloe X. Wang
Oleksii Tsepa
Jun Ma
Bo Wang
Mamba
22
85
0
01 Feb 2024
BlackMamba: Mixture of Experts for State-Space Models
Quentin G. Anthony
Yury Tokpanov
Paolo Glorioso
Beren Millidge
12
21
0
01 Feb 2024
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
N. Corrêa
Sophia Falk
Shiza Fatimah
Aniket Sen
N. D. Oliveira
12
9
0
30 Jan 2024
VIALM: A Survey and Benchmark of Visually Impaired Assistance with Large Models
Yi Zhao
Yilin Zhang
Rong Xiang
Jing Li
Hillming Li
18
16
0
29 Jan 2024
PRE: A Peer Review Based Large Language Model Evaluator
Zhumin Chu
Qingyao Ai
Yiteng Tu
Haitao Li
Yiqun Liu
LRM
ALM
15
21
0
28 Jan 2024
In-Context Language Learning: Architectures and Algorithms
Ekin Akyürek
Bailin Wang
Yoon Kim
Jacob Andreas
LRM
ReLM
22
16
0
23 Jan 2024
In-context Learning with Retrieved Demonstrations for Language Models: A Survey
Man Luo
Xin Xu
Yue Liu
Panupong Pasupat
Mehran Kazemi
RALM
18
54
0
21 Jan 2024
LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units
Zeyu Liu
Gourav Datta
Anni Li
P. Beerel
17
8
0
20 Jan 2024
RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks
Haowen Hou
F. Richard Yu
AI4TS
22
19
0
17 Jan 2024
TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models
Yihong Liu
Chunlan Ma
Haotian Ye
Hinrich Schütze
15
1
0
12 Jan 2024
Transformers are Multi-State RNNs
Matanel Oren
Michael Hassid
Nir Yarden
Yossi Adi
Roy Schwartz
OffRL
19
34
0
11 Jan 2024
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Zhen Qin
Weigao Sun
Dong Li
Xuyang Shen
Weixuan Sun
Yiran Zhong
60
21
0
09 Jan 2024
Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu
Haoyang He
Tianle Han
Xu-Yao Zhang
Mengyuan Liu
...
Xintao Hu
Tuo Zhang
Ning Qiang
Tianming Liu
Bao Ge
SyDa
6
64
0
04 Jan 2024
Benchmarking Large Language Models on Controllable Generation under Diversified Instructions
Yihan Chen
Benfeng Xu
Quan Wang
Yi Liu
Zhendong Mao
ALM
ELM
11
26
0
01 Jan 2024
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
Yunhe Wang
Hanting Chen
Yehui Tang
Tianyu Guo
Kai Han
...
Qinghua Xu
Qun Liu
Jun Yao
Chao Xu
Dacheng Tao
53
15
0
27 Dec 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
37
75
0
23 Dec 2023
Advancing VAD Systems Based on Multi-Task Learning with Improved Model Structures
Lingyun Zuo
Keyu An
Shiliang Zhang
Zhijie Yan
13
1
0
19 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
E. Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM
LRM
AI4CE
16
74
0
17 Dec 2023
Language Modeling on a SpiNNaker 2 Neuromorphic Chip
Khaleelulla Khan Nazeer
Mark Schöne
Rishav Mukherji
Bernhard Vogginger
Christian Mayr
David Kappel
Anand Subramoney
17
5
0
14 Dec 2023
Learning Long Sequences in Spiking Neural Networks
Matei Ioan Stan
Oliver Rhodes
22
10
0
14 Dec 2023
Gated Linear Attention Transformers with Hardware-Efficient Training
Songlin Yang
Bailin Wang
Yikang Shen
Rameswar Panda
Yoon Kim
34
138
0
11 Dec 2023
User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan
Meng-Long Jiang
11
8
0
11 Dec 2023
Recurrent Distance Filtering for Graph Representation Learning
Yuhui Ding
Antonio Orvieto
Bobby He
Thomas Hofmann
GNN
16
6
0
03 Dec 2023
Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural networks: from Algorithms to Technology
Souvik Kundu
Rui-jie Zhu
Akhilesh R. Jaiswal
P. Beerel
22
4
0
02 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
25
21
0
01 Dec 2023
HOT: Higher-Order Dynamic Graph Representation Learning with Efficient Transformers
Maciej Besta
Afonso Claudino Catarino
Lukas Gianinazzi
Nils Blach
Piotr Nyczyk
H. Niewiadomski
Torsten Hoefler
22
6
0
30 Nov 2023
Diffusion Models Without Attention
Jing Nathan Yan
Jiatao Gu
Alexander M. Rush
11
60
0
30 Nov 2023
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
Shida Wang
Qianxiao Li
6
12
0
24 Nov 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
14
4
0
21 Nov 2023
Never Lost in the Middle: Improving Large Language Models via Attention Strengthening Question Answering
Junqing He
Kunhao Pan
Xiaoqun Dong
Zhuoyang Song
LiuYiBo LiuYiBo
...
Hao Wang
Qianguosun Qianguosun
Enming Zhang
Zejian Xie
Jiaxing Zhang
KELM
RALM
10
16
0
15 Nov 2023
Autoregressive Language Models For Estimating the Entropy of Epic EHR Audit Logs
Benjamin C. Warner
Thomas Kannampallil
Seunghwan Kim
21
0
0
10 Nov 2023
Hierarchically Gated Recurrent Neural Network for Sequence Modeling
Zhen Qin
Songlin Yang
Yiran Zhong
27
72
0
08 Nov 2023
ProSG: Using Prompt Synthetic Gradients to Alleviate Prompt Forgetting of RNN-like Language Models
Haotian Luo
Kunming Wu
Cheng Dai
Sixian Ding
Xinhao Chen
16
1
0
03 Nov 2023
ViR: Towards Efficient Vision Retention Backbones
Ali Hatamizadeh
Michael Ranzinger
Shiyi Lan
Jose M. Alvarez
Sanja Fidler
Jan Kautz
GNN
14
0
0
30 Oct 2023
Learning Successor Features with Distributed Hebbian Temporal Memory
E. Dzhivelikian
Petr Kuderov
Aleksandr I. Panov
12
0
0
20 Oct 2023
Generative Calibration for In-context Learning
Zhongtao Jiang
Yuanzhe Zhang
Cao Liu
Jun Zhao
Kang Liu
149
8
0
16 Oct 2023
USTEP: Spatio-Temporal Predictive Learning under A Unified View
Cheng Tan
Jue Wang
Zhangyang Gao
Siyuan Li
Stan Z. Li
31
1
0
09 Oct 2023
SCALE: Synergized Collaboration of Asymmetric Language Translation Engines
Xin Cheng
Xun Wang
Tao Ge
Si-Qing Chen
Y. Li
Dongyan Zhao
Rui Yan
32
2
0
29 Sep 2023
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
13
15
0
28 Sep 2023
Exploring RWKV for Memory Efficient and Low Latency Streaming ASR
Keyu An
Shiliang Zhang
6
4
0
26 Sep 2023
Multi-Dimensional Hyena for Spatial Inductive Bias
Itamar Zimerman
Lior Wolf
ViT
15
4
0
24 Sep 2023
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models
Zican Dong
Tianyi Tang
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
RALM
ALM
10
34
0
23 Sep 2023
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
Aleksandar Stanić
Dylan R. Ashley
Oleg Serikov
Louis Kirsch
Francesco Faccio
Jürgen Schmidhuber
Thomas Hofmann
Imanol Schlag
MoE
30
9
0
20 Sep 2023
RMT: Retentive Networks Meet Vision Transformers
Qihang Fan
Huaibo Huang
Mingrui Chen
Hongmin Liu
Ran He
ViT
24
65
0
20 Sep 2023
Advancing Regular Language Reasoning in Linear Recurrent Neural Networks
Ting-Han Fan
Ta-Chung Chi
Alexander I. Rudnicky
LRM
14
5
0
14 Sep 2023