Papers
2203.11239
Cited By
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
Annual Meeting of the Association for Computational Linguistics (ACL), 2022 · 21 March 2022
Zheng Li, Zijian Wang, Ming Tan, Ramesh Nallapati, Parminder Bhatia, Andrew O. Arnold, Bing Xiang, Dan Roth [MQ]

Papers citing "DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization" (25 of 25 papers shown)

QUADS: QUAntized Distillation Framework for Efficient Speech Language Understanding
Subrata Biswas, Mohammad Nur Hossain Khan, Bashima Islam · 19 May 2025

ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration
Knowledge Discovery and Data Mining (KDD), 2025
Mengting Ai, Tianxin Wei, Yifan Chen, Zhichen Zeng, Ritchie Zhao, G. Varatkar, B. Rouhani, Xianfeng Tang, Hanghang Tong, Jingrui He [MoE] · 10 Mar 2025

SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms
Xingrun Xing, Zheng Zhang, Ziyi Ni, Shitao Xiao, Yiming Ju, Siqi Fan, Yequan Wang, Jiajun Zhang, Guoqi Li · 05 Jun 2024

Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang, Korawat Tanwisuth, Chengyue Gong, Pengcheng He, Mi Zhou [BDL] · 07 May 2024

Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He [MQ] · 15 Feb 2024

BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xingrun Xing, Li Du, Xinyuan Wang, Xianlin Zeng, Yequan Wang, Zheng Zhang, Jiajun Zhang · 14 Dec 2023

ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, ..., Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao · 14 Dec 2023

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Sanchit Gandhi, Patrick von Platen, Alexander M. Rush [VLM] · 01 Nov 2023

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shuo Zhao, Peng Zhang, Jie Tang [VLM] · 11 Jun 2023

Modular Transformers: Compressing Transformers into Modularized Layers for Flexible Efficient Inference
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Wangchunshu Zhou, Ronan Le Bras, Yejin Choi · 04 Jun 2023

Binary and Ternary Natural Language Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zechun Liu, Barlas Oğuz, Aasish Pappu, Yangyang Shi, Raghuraman Krishnamoorthi [MQ] · 02 Jun 2023

Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yung-Sung Chuang, Wei Fang, Shang-Wen Li, Anuj Kumar, James R. Glass [LRM] · 26 May 2023

Task-agnostic Distillation of Encoder-Decoder Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Chen Zhang, Yang Yang, Jingang Wang, Dawei Song · 21 May 2023

xPQA: Cross-Lingual Product Question Answering across 12 Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xiaoyu Shen, Akari Asai, Bill Byrne, Adria de Gispert · 16 May 2023

A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor · 03 May 2023

To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency
Daniel Fernando Campos, Chengxiang Zhai · 05 Apr 2023

Greener yet Powerful: Taming Large Code Generation Models with Quantization
Xiaokai Wei, Sujan Kumar Gonugondla, W. Ahmad, Shiqi Wang, Baishakhi Ray, ..., Ben Athiwaratkun, Mingyue Shang, M. K. Ramanathan, Parminder Bhatia, Bing Xiang [MQ] · 09 Mar 2023

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, Lichao Sun · 07 Mar 2023

Transformer models: an introduction and catalog
X. Amatriain, Ananth Sankar, Jie Bing, Praveen Kumar Bodigutla, Timothy J. Hazen, Michaeel Kazi · 12 Feb 2023

Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases
International Conference on Machine Learning (ICML), 2023
Xiaoxia Wu, Cheng-rong Li, Reza Yazdani Aminabadi, Z. Yao, Yuxiong He [MQ] · 27 Jan 2023

Can Open-Domain QA Reader Utilize External Knowledge Efficiently like Humans?
Neeraj Varshney, Man Luo, Chitta Baral [RALM] · 23 Nov 2022

Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Minsoo Kim, Sihwa Lee, S. Hong, Duhyeuk Chang, Jungwook Choi [MQ] · 20 Nov 2022

Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Neeraj Varshney, Chitta Baral · 11 Oct 2022

EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Tao Ge, Si-Qing Chen, Furu Wei [MoE] · 16 Feb 2022

Improving Question Answering Performance Using Knowledge Distillation and Active Learning
Engineering Applications of Artificial Intelligence (EAAI), 2021
Yasaman Boreshban, Seyed Morteza Mirbostani, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel, Shahin Amiriparian · 26 Sep 2021