arXiv: 2305.10427
Accelerating Transformer Inference for Translation via Parallel Decoding


17 May 2023
Andrea Santilli
Silvio Severino
Emilian Postolache
Valentino Maiorca
Michele Mancusi
R. Marin
Emanuele Rodolà

Papers citing "Accelerating Transformer Inference for Translation via Parallel Decoding"

23 / 73 papers shown
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
Marco Gaido
Sara Papi
Matteo Negri
L. Bentivogli
41
12
0
19 Feb 2024
PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition
Jinghui Lu
Ziwei Yang
Yanjie Wang
Xuejing Liu
Brian Mac Namee
Can Huang
MoE
45
4
0
07 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
123
139
0
03 Feb 2024
Decoding Speculative Decoding
Minghao Yan
Saurabh Agarwal
Shivaram Venkataraman
LRM
25
5
0
02 Feb 2024
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
31
121
0
26 Jan 2024
BiTA: Bi-Directional Tuning for Lossless Acceleration in Large Language Models
Feng-Huei Lin
Hanling Yi
Hongbin Li
Yifan Yang
Xiaotian Yu
Guangming Lu
Rong Xiao
34
3
0
23 Jan 2024
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
Heming Xia
Zhe Yang
Qingxiu Dong
Peiyi Wang
Yongqi Li
Tao Ge
Tianyu Liu
Wenjie Li
Zhifang Sui
LRM
22
97
0
15 Jan 2024
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
61
76
0
23 Dec 2023
Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy
Yao-Min Zhao
Zhitian Xie
Chen Liang
Chenyi Zhuang
Jinjie Gu
50
11
0
20 Dec 2023
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Yanxi Chen
Xuchen Pan
Yaliang Li
Bolin Ding
Jingren Zhou
LRM
33
31
0
08 Dec 2023
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster
Hongxuan Zhang
Zhining Liu
Yao Zhao
Jiaqi Zheng
Chenyi Zhuang
Jinjie Gu
Guihai Chen
LRM
MLLM
15
1
0
14 Nov 2023
Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding
Jiali Zeng
Fandong Meng
Yongjing Yin
Jie Zhou
21
10
0
06 Nov 2023
Enhancing Abstractiveness of Summarization Models through Calibrated Distillation
Hwanjun Song
Igor Shalyminov
Hang Su
Siffi Singh
Kaisheng Yao
Saab Mansour
14
6
0
20 Oct 2023
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding
Sangmin Bae
Jongwoo Ko
Hwanjun Song
SeYoung Yun
22
53
0
09 Oct 2023
Camoscio: an Italian Instruction-tuned LLaMA
Andrea Santilli
Emanuele Rodolà
11
26
0
31 Jul 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
144
366
0
13 Mar 2023
Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation
Lukas Edman
Gabriele Sarti
Antonio Toral
Gertjan van Noord
Arianna Bisazza
16
11
0
28 Feb 2023
Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
Giorgio Mariani
Irene Tallini
Emilian Postolache
Michele Mancusi
Luca Cosmo
Emanuele Rodolà
DiffM
22
36
0
04 Feb 2023
Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation
Heming Xia
Tao Ge
Peiyi Wang
Si-Qing Chen
Furu Wei
Zhifang Sui
19
69
0
30 Mar 2022
Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision
Chenyang Huang
Hao Zhou
Osmar R. Zaïane
Lili Mou
Lei Li
92
59
0
14 Oct 2021
AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
Jongyoon Song
Sungwon Kim
Sungroh Yoon
66
37
0
14 Sep 2021
Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation
Yongchang Hao
Shilin He
Wenxiang Jiao
Zhaopeng Tu
Michael Lyu
Xing Wang
95
28
0
24 Oct 2020
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,740
0
26 Sep 2016