Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
H. H. Mao · 9 October 2022 · arXiv:2210.04243
Papers citing "Fine-Tuning Pre-trained Transformers into Decaying Fast Weights" (5 of 5 shown):
Conformal Transformations for Symmetric Power Transformers
Saurabh Kumar, Jacob Buckman, Carles Gelada, Sean Zhang
05 Mar 2025
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He, Philip N. Garner
09 Oct 2024
Linear Attention Sequence Parallelism
Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong
03 Apr 2024
ABC: Attention with Bounded-memory Control
Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith
06 Oct 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
31 Dec 2020