Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks (arXiv:2403.13112)
19 March 2024
Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf
Papers citing "Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks" (5 of 5 papers shown)
Hydragen: High-Throughput LLM Inference with Shared Prefixes
Jordan Juravsky, Bradley Brown, Ryan Ehrlich, Daniel Y. Fu, Christopher Ré, Azalia Mirhoseini
07 Feb 2024
A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models
Takuma Udagawa, Aashka Trivedi, Michele Merler, Bishwaranjan Bhattacharjee
13 Oct 2023
Description-Driven Task-Oriented Dialog Modeling
Jeffrey Zhao, Raghav Gupta, Yuan Cao, Dian Yu, Mingqiu Wang, Harrison Lee, Abhinav Rastogi, Izhak Shafran, Yonghui Wu
21 Jan 2022
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta, Yanping Huang, Ankur Bapna, M. Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat
24 Sep 2021
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
12 Mar 2020