ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters

31 January 2024

Papers citing "ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters"

Title | Authors | Tags / counts | Date
VEXP: A Low-Cost RISC-V ISA Extension for Accelerated Softmax Computation in Transformers | Run Wang, Gamze Islamoglu, Andrea Belano, Viviane Potocnik, Francesco Conti, Angelo Garofalo, Luca Benini | 26 0 0 | 15 Apr 2025
PIM-LLM: A High-Throughput Hybrid PIM Architecture for 1-bit LLMs | Jinendra Malekar, Peyton S. Chandarana, Md Hasibul Amin, Mohammed E. Elbtity, Ramtin Zand | 26 1 0 | 31 Mar 2025
SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors | M. Rakka, J. Li, Guohao Dai, A. Eltawil, M. Fouda, Fadi J. Kurdahi | 60 1 0 | 26 Nov 2024
More Expressive Attention with Negative Weights | Ang Lv, Ruobing Xie, Shuaipeng Li, Jiayi Liao, X. Sun, Zhanhui Kang, Di Wang, Rui Yan | 30 0 0 | 11 Nov 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective | Jinhao Li, Jiaming Xu, Shan Huang, Yonghua Chen, Wen Li, ..., Jiayi Pan, Li Ding, Hao Zhou, Yu Wang, Guohao Dai | 57 15 0 | 06 Oct 2024
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation | Seongmin Hong, Seungjae Moon, Junsoo Kim, Sungjae Lee, Minsub Kim, Dongsoo Lee, Joo-Young Kim | 64 76 0 | 22 Sep 2022
I-BERT: Integer-only BERT Quantization | Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer | MQ 86 336 0 | 05 Jan 2021