ResearchTrend.AI
Streamlined Deployment for Quantized Neural Networks
arXiv:1709.04060 (v2, latest) · 12 September 2017
Yaman Umuroglu, Magnus Jahre
ArXiv (abs) · PDF · HTML

Papers citing "Streamlined Deployment for Quantized Neural Networks" (12 papers)
Real-Time Multi-Object Tracking using YOLOv8 and SORT on a SoC FPGA
International Workshop on Applied Reconfigurable Computing (ARC), 2025
Michal Danilowicz, T. Kryjak
17 Mar 2025
Fast, Scalable, Energy-Efficient Non-element-wise Matrix Multiplication on FPGA
Xuqi Zhu, Huaizhi Zhang, JunKyu Lee, Jiacheng Zhu, Chandrajit Pal, S. Saha, Klaus D. McDonald-Maier, X. Zhai
02 Jul 2024
A2Q+: Improving Accumulator-Aware Weight Quantization
Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig, Yaman Umuroglu
19 Jan 2024
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance
IEEE International Conference on Computer Vision (ICCV), 2023
Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig
25 Aug 2023
Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark
H. Borras, G. D. Guglielmo, Javier Mauricio Duarte, Nicolò Ghielmetti, B. Hawks, ..., Nhan Tran, Yaman Umuroglu, Olivia Weng, Aidan Yokuda, Michaela Blott
23 Jun 2022
Applications and Techniques for Fast Machine Learning in Science
Frontiers in Big Data (Front. Big Data), 2021
A. Deiana, Nhan Tran, Joshua C. Agar, Michaela Blott, G. D. Guglielmo, ..., Ashish Sharma, S. Summers, Pietro Vischia, J. Vlimant, Olivia Weng
25 Oct 2021
Benchmarking Quantized Neural Networks on FPGAs with FINN
Quentin Ducasse, Pascal Cotret, Loïc Lagadec, Rob Stewart
02 Feb 2021
Diagnostic data integration using deep neural networks for real-time plasma analysis
IEEE Transactions on Nuclear Science (TNS), 2020
A. R. Garola, R. Cavazzana, M. Gobbin, R. Delogu, G. Manduchi, C. Taliercio, A. Luchetta
28 Oct 2020
Resource-Efficient Speech Mask Estimation for Multi-Channel Speech Enhancement
Lukas Pfeifenberger, Matthias Zöhrer, Günther Schindler, Wolfgang Roth, Holger Fröning, Franz Pernkopf
22 Jul 2020
Quantized Neural Network Inference with Precision Batching
Maximilian Lam, Zachary Yedidia, Colby R. Banbury, Vijay Janapa Reddi
26 Feb 2020
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers
Conference on Machine Learning and Systems (MLSys), 2019
Manuele Rusci, Alessandro Capotondi, Luca Benini
30 May 2019
High performance ultra-low-precision convolutions on mobile devices
Andrew Tulloch, Yangqing Jia
06 Dec 2017