RWKV: Reinventing RNNs for the Transformer Era

22 May 2023
Bo Peng
Eric Alcaide
Quentin G. Anthony
Alon Albalak
Samuel Arcadinho
Stella Biderman
Huanqi Cao
Xin Cheng
Michael Chung
Matteo Grella
G. Kranthikiran
Xuming He
Haowen Hou
Jiaju Lin
Przemyslaw Kazienko
Jan Kocoń
Jiaming Kong
Bartlomiej Koptyra
Hayden Lau
Krishna Sri Ipsit Mantri
Ferdinand Mom
Atsushi Saito
Guangyu Song
Xiangru Tang
Bolun Wang
J. S. Wind
Stanislaw Wozniak
Ruichong Zhang
Zhenyuan Zhang
Qihang Zhao
P. Zhou
Qinghua Zhou
Jian Zhu
Rui-Jie Zhu
arXiv: 2305.13048

Papers citing "RWKV: Reinventing RNNs for the Transformer Era"

50 / 388 papers shown
Baichuan-M1: Pushing the Medical Capability of Large Language Models
B. Wang
Haizhou Zhao
Huozhi Zhou
Liang Song
Mingyu Xu
...
Yan Zhang
Yifei Duan
Yuyan Zhou
Zhi-Ming Ma
Z. Wu
LM&MA
ELM
AI4MH
37
3
0
18 Feb 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
67
0
0
18 Feb 2025
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
Cheng Luo
Zefan Cai
Hanshi Sun
Jinqi Xiao
Bo Yuan
Wen Xiao
Junjie Hu
Jiawei Zhao
Beidi Chen
Anima Anandkumar
56
1
0
18 Feb 2025
Associative Recurrent Memory Transformer
Ivan Rodkin
Yuri Kuratov
Aydar Bulatov
Mikhail Burtsev
60
2
0
17 Feb 2025
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities
Xiangyu Lu
Wang Xu
Haoyu Wang
Hongyun Zhou
Haiyan Zhao
Conghui Zhu
T. Zhao
M. Yang
Mamba
AuLLM
54
0
0
16 Feb 2025
Leveraging Constrained Monte Carlo Tree Search to Generate Reliable Long Chain-of-Thought for Mathematical Reasoning
Qingwen Lin
Boyan Xu
Zijian Li
Z. Hao
Keli Zhang
Ruichu Cai
LRM
38
2
0
16 Feb 2025
Surprisal Takes It All: Eye Tracking Based Cognitive Evaluation of Text Readability Measures
Keren Gruteke Klein
Shachar Frenkel
Omer Shubi
Yevgeni Berzak
31
0
0
16 Feb 2025
KernelBench: Can LLMs Write Efficient GPU Kernels?
Anne Ouyang
Simon Guo
Simran Arora
Alex L. Zhang
William Hu
Christopher Ré
Azalia Mirhoseini
ALM
38
1
0
14 Feb 2025
LOB-Bench: Benchmarking Generative AI for Finance - an Application to Limit Order Book Data
Peer Nagy
Sascha Frey
Kang Li
Bidipta Sarkar
Svitlana Vyetrenko
Stefan Zohren
Ani Calinescu
Jakob Foerster
73
1
0
13 Feb 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
Feng Wang
Yaodong Yu
Guoyizhe Wei
Wei Shao
Yuyin Zhou
Alan Yuille
Cihang Xie
ViT
82
4
0
06 Feb 2025
Exploring Linear Attention Alternative for Single Image Super-Resolution
Rongchang Lu
Changyu Li
Donghang Li
Guojing Zhang
Jianqiang Huang
X. Li
36
0
0
01 Feb 2025
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi
Yueqi Xie
Bin Zhu
Emre Kiciman
Guangzhong Sun
Xing Xie
Fangzhao Wu
AAML
47
61
0
28 Jan 2025
State-space models are accurate and efficient neural operators for dynamical systems
Zheyuan Hu
Nazanin Ahmadi Daryakenari
Qianli Shen
Kenji Kawaguchi
George Karniadakis
Mamba
AI4CE
46
10
0
28 Jan 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling
Kang Liu
Kai Yan
Y. Yang
Weijian Lin
Ting-Han Fan
Lingfeng Shen
Zhengyin Du
Jiecao Chen
ReLM
ELM
LRM
40
2
0
25 Jan 2025
Rate-Aware Learned Speech Compression
Jun Xu
Zhengxue Cheng
Guangchuan Chi
Yuhan Liu
Yuelin Hu
Li-Na Song
30
0
0
21 Jan 2025
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models
Thibaut Thonet
Jos Rozen
Laurent Besacier
RALM
126
2
0
20 Jan 2025
Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning
Lang Xu
Quentin G. Anthony
Jacob Hatef
A. Shafi
Hari Subramoni
Dhabaleswar K. Panda
32
0
0
08 Jan 2025
Detection, Retrieval, and Explanation Unified: A Violence Detection System Based on Knowledge Graphs and GAT
Wen-Dong Jiang
Chih-Yung Chang
Diptendu Sinha Roy
36
0
0
07 Jan 2025
KM-UNet KAN Mamba UNet for medical image segmentation
Yibo Zhang
Mamba
29
0
0
05 Jan 2025
Efficient Relational Context Perception for Knowledge Graph Completion
Wenkai Tu
Guojia Wan
Zhengchun Shang
Bo Du
32
0
0
03 Jan 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
106
592
0
31 Dec 2024
A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine
Hanguang Xiao
Feizhong Zhou
X. Liu
Tianqi Liu
Zhipeng Li
Xin Liu
Xiaoxuan Huang
AILaw
LM&MA
LRM
59
17
0
31 Dec 2024
Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems
Wen-Dong Jiang
Chih-Yung Chang
Hsiang-Chuan Chang
Ji-Yuan Chen
Diptendu Sinha Roy
21
0
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
31
2
0
29 Dec 2024
A novel framework for MCDM based on Z numbers and soft likelihood function
Yuanpeng He
31
0
0
26 Dec 2024
L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression
J. Zhang
Zhengxue Cheng
Yan Zhao
Shihao Wang
Dajiang Zhou
Guo Lu
Li-Na Song
71
1
0
21 Dec 2024
V"Mean"ba: Visual State Space Models only need 1 hidden dimension
Tien-Yu Chi
Hung-Yueh Chiang
Chi-Chih Chang
N. Huang
Kai-Chiang Wu
83
0
0
21 Dec 2024
Formal Mathematical Reasoning: A New Frontier in AI
Kaiyu Yang
Gabriel Poesia
Jingxuan He
Wenda Li
Kristin Lauter
Swarat Chaudhuri
Dawn Song
LRM
AI4CE
82
20
0
20 Dec 2024
BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models
Patrick Haller
Jonas Golde
A. Akbik
72
0
0
20 Dec 2024
Efficient Self-Supervised Video Hashing with Selective State Spaces
Jinpeng Wang
Niu Lian
Jun Li
Yuting Wang
Yan Feng
Bin Chen
Yongbing Zhang
Shu-Tao Xia
69
2
0
19 Dec 2024
RWKV-Lite: Deeply Compressed RWKV for Resource-Constrained Devices
Wonkyo Choe
Yangfeng Ji
F. Lin
54
1
0
14 Dec 2024
Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Michael Y. Hu
Aaron Mueller
Candace Ross
Adina Williams
Tal Linzen
Chengxu Zhuang
Ryan Cotterell
Leshem Choshen
Alex Warstadt
Ethan Gotlieb Wilcox
91
7
0
06 Dec 2024
Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration
Yuzhen Du
Teng Hu
J. Zhang
Ran Yi
Chengming Xu
Xiaobin Hu
Kai Wu
Donghao Luo
Y. Wang
Lizhuang Ma
73
1
0
05 Dec 2024
Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan
Zhuang Wang
Zhen Jia
Can Karakus
Luca Zancato
Tri Dao
Ravi Netravali
Yida Wang
84
4
0
28 Nov 2024
Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
Ali Behrouz
Ali Parviz
Mahdi Karami
Clayton Sanford
Bryan Perozzi
Vahab Mirrokni
76
2
0
23 Nov 2024
Financial Risk Assessment via Long-term Payment Behavior Sequence Folding
Yiran Qiao
Yateng Tang
Xiang Ao
Qi Yuan
Ziming Liu
Chen Shen
Xuehao Zheng
60
0
0
22 Nov 2024
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong
Y. Fu
Shizhe Diao
Wonmin Byeon
Zijia Chen
...
Min-Hung Chen
Yoshi Suhara
Y. Lin
Jan Kautz
Pavlo Molchanov
Mamba
88
13
0
20 Nov 2024
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi
Julien N. Siems
Jörg K.H. Franke
Arber Zela
Frank Hutter
Massimiliano Pontil
84
10
0
19 Nov 2024
Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang
Liqun Ma
H. Li
Mingjie Sun
Zhiqiang Shen
Mamba
60
3
0
18 Nov 2024
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
Yuhong Chou
Man Yao
Kexin Wang
Yuqi Pan
Ruijie Zhu
Yiran Zhong
Yu Qiao
J. Wu
Bo Xu
Guoqi Li
33
4
0
16 Nov 2024
LongSafety: Enhance Safety for Long-Context LLMs
Mianqiu Huang
Xiaoran Liu
Shaojun Zhou
Mozhi Zhang
Chenkun Tan
...
Zhikai Lei
Linlin Li
Q. Liu
Yaqian Zhou
Xipeng Qiu
ELM
ALM
30
0
0
11 Nov 2024
Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry
Oliver Knitter
Dan Zhao
J. Stokes
M. Ganahl
Stefan Leichenauer
S. Veerapaneni
27
1
0
06 Nov 2024
MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba
Masakazu Yoshimura
Teruaki Hayashi
Yota Maeda
Mamba
58
2
0
06 Nov 2024
The Evolution of RWKV: Advancements in Efficient Language Modeling
Akul Datta
VLM
28
1
0
05 Nov 2024
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution
Yang Yue
Yulin Wang
Bingyi Kang
Yizeng Han
Shenzhi Wang
Shiji Song
Jiashi Feng
Gao Huang
VLM
35
16
0
04 Nov 2024
Data movement limits to frontier model training
Ege Erdil
David Schneider-Joseph
23
0
0
02 Nov 2024
When can classical neural networks represent quantum states?
Tai-Hsuan Yang
Mehdi Soleimanifar
Thiago Bergamaschi
J. Preskill
27
2
0
30 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
33
2
0
30 Oct 2024
From Explicit Rules to Implicit Reasoning in an Interpretable Violence Monitoring System
Wen-Dong Jiang
Chih-Yung Chang
Ssu-Chi Kuai
Diptendu Sinha Roy
21
0
0
29 Oct 2024
Counting Ability of Large Language Models and Impact of Tokenization
Xiang Zhang
Juntai Cao
Chenyu You
LRM
25
0
0
25 Oct 2024