Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference

ACM Transactions on Embedded Computing Systems (ACM TECS), 2019

21 July 2019

Papers citing "Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference"

24 / 24 papers shown

Embedded Distributed Inference of Deep Neural Networks: A Systematic Review

Federico Nicolás Peccia

Oliver Bringmann

286

06 May 2024

SGPRS: Seamless GPU Partitioning Real-Time Scheduler for Periodic Deep Learning Workloads

Amir Fakhim Babaei

Thidapat Chantem

13 Apr 2024

TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs

277

16 Nov 2023

Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?

Seyed Morteza Nabavinejad

M. Ebrahimi

Sherief Reda

295

26 Aug 2023

MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Design Automation Conference (DAC), 2023

Jieru Zhao

Wenchao Ding

141

23 Jul 2023

On-Device Unsupervised Image Segmentation

Design Automation Conference (DAC), 2023

277

24 Feb 2023

Auditing Membership Leakages of Multi-Exit Networks

Conference on Computer and Communications Security (CCS), 2022

Zheng Li

Michael Backes

222

23 Aug 2022

H2H: Heterogeneous Model to Heterogeneous System Mapping with Computation and Communication Awareness

Design Automation Conference (DAC), 2022

162

29 Apr 2022

A Semi-Decoupled Approach to Fast and Optimal Hardware-Software Co-Design of Neural Accelerators

260

25 Mar 2022

The Larger The Fairer? Small Neural Networks Can Achieve Fairness for Edge Devices

Design Automation Conference (DAC), 2022

240

23 Feb 2022

EF-Train: Enable Efficient On-device CNN Training on FPGA Through Data Reshaping for Online Adaptation or Personalization

183

18 Feb 2022

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization

Hongwu Peng

246

19 Oct 2021

Exploration of Quantum Neural Architecture by Mixing Quantum Neuron Designs

Zhiding Liang

Caiwen Ding

311

08 Sep 2021

Can Noise on Qubits Be Learned in Quantum Neural Network? A Case Study on QuantumFlow

Zhiding Liang

Jinjun Xiong

308

08 Sep 2021

Enabling OpenMP Task Parallelism on Multi-FPGAs

IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 2021

123

19 Mar 2021

Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

Design Automation Conference (DAC), 2021

Caiwen Ding

166

12 Feb 2021

When Machine Learning Meets Quantum Computers: A Case Study

Asia and South Pacific Design Automation Conference (ASP-DAC), 2020

Weiwen Jiang

Jinjun Xiong

Yiyu Shi

231

18 Dec 2020

A Panda? No, It's a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference

International Conference on Learning Representations (ICLR), 2020

Sanghyun Hong

284

06 Oct 2020

ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning

IEEE/ACM International Symposium on Microarchitecture (MICRO), 2020

Sheng-Chun Kao

Geonhwa Jeong

T. Krishna

350

114

04 Sep 2020

Standing on the Shoulders of Giants: Hardware and Neural Architecture Co-Search with Hot Start

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020

266

17 Jul 2020

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights

335

100

02 Jul 2020

Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks

Design Automation Conference (DAC), 2020

293

123

10 Feb 2020

Device-Circuit-Architecture Co-Exploration for Computing-in-Memory Neural Accelerators

IEEE Transactions on Computers (IEEE Trans. Comput.), 2019

694

31 Oct 2019

When Single Event Upset Meets Deep Neural Networks: Observations, Explorations, and Remedies

Asia and South Pacific Design Automation Conference (ASP-DAC), 2019

240

10 Sep 2019