Efficient Softmax Approximation for Deep Neural Networks with Attention Mechanism
Ihor Vasyltsov, Wooseok Chang
arXiv:2111.10770, 21 November 2021
Papers citing "Efficient Softmax Approximation for Deep Neural Networks with Attention Mechanism" (7 of 7 papers shown):

1. EXAQ: Exponent Aware Quantization For LLMs Acceleration [MQ] (04 Oct 2024)
   Moran Shkolnik, Maxim Fishman, Brian Chmiel, Hilla Ben-Yaacov, Ron Banner, Kfir Y. Levy

2. KWT-Tiny: RISC-V Accelerated, Embedded Keyword Spotting Transformer (22 Jul 2024)
   Aness Al-Qawlaq, Ajay Kumar, Deepu John

3. Integer-only Quantized Transformers for Embedded FPGA-based Time-series Forecasting in AIoT [AI4TS, MQ] (06 Jul 2024)
   Tianheng Ling, Chao Qian, Gregor Schiele

4. SimA: Simple Softmax-free Attention for Vision Transformers (17 Jun 2022)
   Soroush Abbasi Koohpayegani, Hamed Pirsiavash

5. Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT [MQ] (12 Sep 2019)
   Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Zhewei Yao, Amir Gholami, Michael W. Mahoney, Kurt Keutzer

6. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding [ELM] (20 Apr 2018)
   Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

7. OpenNMT: Open-Source Toolkit for Neural Machine Translation (10 Jan 2017)
   Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush