2304.02721
Cited By
To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency
5 April 2023
Daniel Fernando Campos
Chengxiang Zhai
Papers citing "To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency"
4 / 4 papers shown
Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding
Seongjun Yang
Gibbeum Lee
Jaewoong Cho
Dimitris Papailiopoulos
Kangwook Lee
21
32
0
12 Jul 2023
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na
Sanket Vaibhav Mehta
Emma Strubell
62
19
0
25 May 2022
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
185
110
0
22 Sep 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020