Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
arXiv:2203.12758 · 23 March 2022 · MQ
Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Andreas Moshovos

Papers citing "Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models"

12 / 12 papers shown
LightNobel: Improving Sequence Length Limitation in Protein Structure Prediction Model via Adaptive Activation Quantization
Seunghee Han, S. Choi, J. Kim · 09 May 2025 · 26 / 0 / 0
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
Minsu Kim, Seongmin Hong, RyeoWook Ko, S. Choi, Hunjong Lee, Junsoo Kim, J. Kim, Jongse Park · 24 Mar 2025 · 57 / 0 / 0
HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation
Hazem Taha, Ameer M. S. Abdelhadi · 22 Jan 2025 · 35 / 1 / 0
Ditto: Accelerating Diffusion Model via Temporal Value Similarity
Sungbin Kim, Hyunwuk Lee, Wonho Cho, Mincheol Park, Won Woo Ro · 20 Jan 2025 · 58 / 1 / 0
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
Yuzong Chen, Ahmed F. AbouElhamayed, Xilai Dai, Yang Wang, Marta Andronic, G. Constantinides, Mohamed S. Abdelfattah · MQ · 18 Nov 2024 · 103 / 1 / 0
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz · 31 Aug 2022 · 28 / 109 / 0
Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling
Kyuhong Shim, Iksoo Choi, Wonyong Sung, Jungwook Choi · 07 Oct 2021 · 21 / 15 / 0
Hierarchical Transformer-based Large-Context End-to-end ASR with Large-Context Knowledge Distillation
Ryo Masumura, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi · 16 Feb 2021 · 21 / 29 / 0
I-BERT: Integer-only BERT Quantization
Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer · MQ · 05 Jan 2021 · 93 / 341 / 0
Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed · VLM · 28 Jul 2020 · 268 / 2,013 / 0
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer · MQ · 12 Sep 2019 · 225 / 575 / 0
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman · ELM · 20 Apr 2018 · 297 / 6,956 / 0