ResearchTrend.AI
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
    MQ

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,255 papers shown
Title
Hybrid Offline-online Scheduling Method for Large Language Model Inference Optimization
Bowen Pang
Kai Li
Ruifeng She
Feifan Wang
OffRL
43
2
0
14 Feb 2025
Vision-Language Models for Edge Networks: A Comprehensive Survey
Ahmed Sharshar
Latif U. Khan
Waseem Ullah
Mohsen Guizani
VLM
70
3
0
11 Feb 2025
Finetuning and Quantization of EEG-Based Foundational BioSignal Models on ECG and PPG Data for Blood Pressure Estimation
Bálint Tóth
Dominik Senti
T. Ingolfsson
Jeffrey Zweidler
Alexandre Elsig
Luca Benini
Yawei Li
38
0
0
10 Feb 2025
Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study
Eric Aubinais
Philippe Formont
Pablo Piantanida
Elisabeth Gassiat
45
0
0
10 Feb 2025
Performance Analysis of Traditional VQA Models Under Limited Computational Resources
Jihao Gu
44
0
0
09 Feb 2025
Nearly Lossless Adaptive Bit Switching
Haiduo Huang
Zhenhua Liu
Tian Xia
Wenzhe zhao
Pengju Ren
MQ
58
0
0
03 Feb 2025
LLM-based Affective Text Generation Quality Based on Different Quantization Values
Yarik Menchaca Resendiz
Roman Klinger
MQ
90
0
0
31 Jan 2025
UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model
Branislava Jankovic
Sabina Jangirova
Waseem Ullah
Latif U. Khan
Mohsen Guizani
31
0
0
21 Jan 2025
Multi-stage Training of Bilingual Islamic LLM for Neural Passage Retrieval
Vera Pavlova
48
2
0
20 Jan 2025
UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles
Abhishek Balasubramaniam
Febin P. Sunny
S. Pasricha
3DPC
39
0
0
08 Jan 2025
Clinical Insights: A Comprehensive Review of Language Models in Medicine
Nikita Neveditsin
Pawan Lingras
V. Mago
LM&MA
58
4
0
08 Jan 2025
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning
Zhen Li
Yupeng Su
Runming Yang
C. Xie
Z. Wang
Zhongwei Xie
Ngai Wong
Hongxia Yang
MQ
LRM
44
3
0
06 Jan 2025
Pruning-based Data Selection and Network Fusion for Efficient Deep Learning
Humaira Kousar
Hasnain Irshad Bhatti
Jaekyun Moon
32
0
0
03 Jan 2025
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
43
2
0
29 Dec 2024
CBNN: 3-Party Secure Framework for Customized Binary Neural Networks Inference
Benchang Dong
Zhili Chen
Xin Chen
Shiwen Wei
Jie Fu
Huifa Li
73
0
0
21 Dec 2024
Improving Quantization-aware Training of Low-Precision Network via Block Replacement on Full-Precision Counterpart
Chengting Yu
Shu Yang
Fengzhao Zhang
Hanzhi Ma
Aili Wang
Er-ping Li
MQ
77
2
0
20 Dec 2024
MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models
Weilun Feng
Haotong Qin
Chuanguang Yang
Zhulin An
Libo Huang
Boyu Diao
Fei Wang
Renshuai Tao
Y. Xu
Michele Magno
DiffM
MQ
80
5
0
16 Dec 2024
Efficient Quantization-Aware Training on Segment Anything Model in Medical Images and Its Deployment
Haisheng Lu
Yujie Fu
Fan Zhang
Le Zhang
MedIm
MQ
68
0
0
15 Dec 2024
Optimising TinyML with Quantization and Distillation of Transformer and Mamba Models for Indoor Localisation on Edge Devices
Thanaphon Suwannaphong
Ferdian Jovan
I. Craddock
Ryan McConville
Mamba
68
1
0
12 Dec 2024
MOFHEI: Model Optimizing Framework for Fast and Efficient Homomorphically Encrypted Neural Network Inference
Parsa Ghazvinian
Robert Podschwadt
Prajwal Panzade
Mohammad H. Rafiei
Daniel Takabi
72
0
0
10 Dec 2024
DEX: Data Channel Extension for Efficient CNN Inference on Tiny AI Accelerators
Taesik Gong
F. Kawsar
Chulhong Min
64
3
0
09 Dec 2024
SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
Runsheng Bai
Qiang Liu
B. Liu
MQ
61
1
0
05 Dec 2024
Designing DNNs for a trade-off between robustness and processing performance in embedded devices
Jon Gutiérrez-Zaballa
Koldo Basterretxea
Javier Echanobe
AAML
96
2
0
04 Dec 2024
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models
Amitash Nanda
Sree Bhargavi Balija
D. Sahoo
MQ
59
0
0
03 Dec 2024
Behavior Backdoor for Deep Learning Models
J. T. Wang
Pengfei Zhang
R. Tao
Jian Yang
Hao Liu
X. Liu
Y. X. Wei
Yao Zhao
AAML
75
0
0
02 Dec 2024
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
93
1
0
02 Dec 2024
On-chip Hyperspectral Image Segmentation with Fully Convolutional Networks for Scene Understanding in Autonomous Driving
Jon Gutiérrez-Zaballa
Koldo Basterretxea
Javier Echanobe
M. Victoria Martínez
Unai Martínez-Corral
Óscar Mata-Carballeira
Inés del Campo
100
17
0
28 Nov 2024
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
Xu Ouyang
Tao Ge
Thomas Hartvigsen
Zhisong Zhang
Haitao Mi
Dong Yu
MQ
90
3
0
26 Nov 2024
Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving
Jon Gutiérrez-Zaballa
Koldo Basterretxea
Javier Echanobe
Óscar Mata-Carballeira
M. Victoria Martínez
82
3
0
26 Nov 2024
PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution
Libo Zhu
J. Li
Haotong Qin
W. J. Li
Yulun Zhang
Yong Guo
Xiaokang Yang
DiffM
MQ
72
2
0
26 Nov 2024
Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance
Jiayi Chen
Chen Wu
S. Zhang
Nan Li
L. Zhang
Qi Zhang
69
0
0
23 Nov 2024
Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems
G. Nam
Juho Lee
69
0
0
22 Nov 2024
Exploring the Robustness and Transferability of Patch-Based Adversarial Attacks in Quantized Neural Networks
Amira Guesmi
B. Ouni
Muhammad Shafique
AAML
74
0
0
22 Nov 2024
Quantized symbolic time series approximation
Erin Carson
Xinye Chen
Cheng Kang
AI4TS
74
0
0
20 Nov 2024
Understanding Student Sentiment on Mental Health Support in Colleges Using Large Language Models
Palak Sood
Chengyang He
Divyanshu Gupta
Yue Ning
Ping Wang
AI4MH
70
0
0
18 Nov 2024
Towards Accurate and Efficient Sub-8-Bit Integer Training
Wenjin Guo
Donglai Liu
Weiying Xie
Yunsong Li
Xuefei Ning
Zihan Meng
Shulin Zeng
Jie Lei
Zhenman Fang
Yu Wang
MQ
34
1
0
17 Nov 2024
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
Shitong Shao
Zikai Zhou
Tian Ye
Lichen Bai
Zhiqiang Xu
Zeke Xie
DiffM
49
0
0
16 Nov 2024
CULL-MT: Compression Using Language and Layer pruning for Machine Translation
Pedram Rostami
M. Dousti
32
0
0
10 Nov 2024
Building an Efficient Multilingual Non-Profit IR System for the Islamic Domain Leveraging Multiprocessing Design in Rust
Vera Pavlova
Mohammed Makhlouf
29
1
0
09 Nov 2024
Optimizing Large Language Models through Quantization: A Comparative Analysis of PTQ and QAT Techniques
Jahid Hasan
MQ
25
1
0
09 Nov 2024
Towards Multi-Modal Mastery: A 4.5B Parameter Truly Multi-Modal Small Language Model
Ben Koska
Mojmír Horváth
MoE
37
1
0
08 Nov 2024
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos
Tomás Almeida
Daniel Vareta
Filipe Azevedo
Sweta Agrawal
Patrick Fernandes
André F. T. Martins
31
1
0
08 Nov 2024
Poor Man's Training on MCUs: A Memory-Efficient Quantized Back-Propagation-Free Approach
Yequan Zhao
Hai Li
Ian Young
Zheng-Wei Zhang
MQ
37
2
0
07 Nov 2024
Scaling Laws for Precision
Tanishq Kumar
Zachary Ankner
Benjamin Spector
Blake Bordelon
Niklas Muennighoff
Mansheej Paul
C. Pehlevan
Christopher Ré
Aditi Raghunathan
AIFin
MoMe
46
13
0
07 Nov 2024
Decoupling Dark Knowledge via Block-wise Logit Distillation for Feature-level Alignment
Chengting Yu
Fengzhao Zhang
Ruizhe Chen
Zuozhu Liu
Shurun Tan
Er-ping Li
Aili Wang
36
2
0
03 Nov 2024
BF-IMNA: A Bit Fluid In-Memory Neural Architecture for Neural Network Acceleration
M. Rakka
Rachid Karami
A. Eltawil
M. Fouda
Fadi J. Kurdahi
MQ
39
1
0
03 Nov 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
43
0
0
01 Nov 2024
Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model
Wenjia Xie
Hao Wang
L. Zhang
Rui Zhou
Defu Lian
Enhong Chen
DiffM
41
3
0
31 Oct 2024
Neural Model Checking
Mirco Giacobbe
Daniel Kroening
Abhinandan Pal
Michael Tautschnig
NAI
24
1
0
31 Oct 2024
IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models
Hang Guo
Yawei Li
Tao Dai
Shu-Tao Xia
Luca Benini
MQ
29
1
0
29 Oct 2024