cuDNN: Efficient Primitives for Deep Learning
arXiv:1410.0759 · 3 October 2014
Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan M. Cohen, J. Tran, Bryan Catanzaro, Evan Shelhamer

Papers citing "cuDNN: Efficient Primitives for Deep Learning"
50 / 236 papers shown

A compact butterfly-style silicon photonic-electronic neural chip for hardware-efficient deep learning
Chenghao Feng, Jiaqi Gu, Hanqing Zhu, Zhoufeng Ying, Zheng Zhao, David Z. Pan, Ray T. Chen · 11 Nov 2021

Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance
Jiarong Xing, Leyuan Wang, Shang Zhang, Jack H Chen, Ang Chen, Yibo Zhu · 25 Oct 2021

Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators
Yangjie Zhou, Mengtian Yang, Cong Guo, Jingwen Leng, Yun Liang, Quan Chen, M. Guo, Yuhao Zhu · 08 Oct 2021

AdjointBackMapV2: Precise Reconstruction of Arbitrary CNN Unit's Activation via Adjoint Operators
Qing Wan, Siu Wun Cheung, Yoonsuck Choe · 04 Oct 2021

RED++ : Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging
Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kévin Bailly · 30 Sep 2021

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein · 29 Sep 2021

Real-Time Glaucoma Detection from Digital Fundus Images using Self-ONNs
Ozer Can Devecioglu, Junaid Malik, T. Ince, S. Kiranyaz, E. Atalay, Moncef Gabbouj · 28 Sep 2021

TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin, Chuang Gan, Kuan-Chieh Jackson Wang, Song Han · 27 Sep 2021

Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators
Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, A. Parashar, Po-An Tsai, S. Rajamanickam, R. Gioiosa, T. Krishna · 15 Sep 2021

Impact of GPU uncertainty on the training of predictive deep neural networks
Maciej Pietrowski, A. Gajda, Takuto Yamamoto, Taisuke Kobayashi, Lana Sinapayen, Eiji Watanabe · 03 Sep 2021 · BDL

spectrai: A deep learning framework for spectral data
C. Horgan, Mads S. Bergholt · 17 Aug 2021 · VLM

perf4sight: A toolflow to model CNN training performance on Edge GPUs
A. Rajagopal, C. Bouganis · 12 Aug 2021

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction
Assaf Hallak, Gal Dalal, Steven Dalton, I. Frosio, Shie Mannor, Gal Chechik · 04 Jul 2021 · OffRL, OnRL

Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang, Xingyao Zhang, Shuaiwen Leon Song, Sara Hooker · 22 Jun 2021

HELP: Hardware-Adaptive Efficient Latency Prediction for NAS via Meta-Learning
Hayeon Lee, Sewoong Lee, Song Chong, Sung Ju Hwang · 16 Jun 2021

Automated Parking Space Detection Using Convolutional Neural Networks
Julien Nyambal, Richard Klein · 14 Jun 2021

Post-Training Sparsity-Aware Quantization
Gil Shomron, F. Gabbay, Samer Kurzum, U. Weiser · 23 May 2021 · MQ

Content-adaptive Representation Learning for Fast Image Super-resolution
Yukai Shi, Jinghui Qin · 20 May 2021 · SupR

Dual-side Sparse Tensor Core
Yang-Feng Wang, Chen Zhang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng · 20 May 2021

Efficient and Generic 1D Dilated Convolution Layer for Deep Learning
Narendra Chaudhary, Sanchit Misra, Dhiraj D. Kalamkar, A. Heinecke, E. Georganas, Barukh Ziv, Menachem Adelman, Bharat Kaul · 16 Apr 2021

Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning & HPC Workloads
E. Georganas, Dhiraj D. Kalamkar, Sasikanth Avancha, Menachem Adelman, Deepti Aggarwal, ..., Ramanarayan Mohanty, Hans Pabst, Brian Retford, Barukh Ziv, A. Heinecke · 12 Apr 2021

LiBRe: A Practical Bayesian Approach to Adversarial Detection
Zhijie Deng, Xiao Yang, Shizhen Xu, Hang Su, Jun Zhu · 27 Mar 2021 · BDL, AAML

Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design
Cong Hao, Jordan Dotzel, Jinjun Xiong, Luca Benini, Zhiru Zhang, Deming Chen · 25 Mar 2021

Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani, Prajit Ramachandran, A. Srinivas, Niki Parmar, Blake A. Hechtman, Jonathon Shlens · 23 Mar 2021

Proof-of-Learning: Definitions and Practice
Hengrui Jia, Mohammad Yaghini, Christopher A. Choquette-Choo, Natalie Dullerud, Anvith Thudi, Varun Chandrasekaran, Nicolas Papernot · 09 Mar 2021 · AAML

Dynamic Precision Analog Computing for Neural Networks
Sahaj Garg, Joe Lou, Anirudh Jain, Mitchell Nahmias · 12 Feb 2021

A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
Geoffrey X. Yu, Yubo Gao, P. Golikov, Gennady Pekhimenko · 31 Jan 2021 · 3DH

Clairvoyant Prefetching for Distributed Machine Learning I/O
Nikoli Dryden, Roman Böhringer, Tal Ben-Nun, Torsten Hoefler · 21 Jan 2021

RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding, Xinming Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun · 11 Jan 2021

Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Muhammad Shafique · 21 Dec 2020 · BDL

Quantum Optical Convolutional Neural Network: A Novel Image Recognition Framework for Quantum Computing
Rishab Parthasarathy, Rohan T. Bhowmik · 19 Dec 2020

Results of the 2020 fastMRI Challenge for Machine Learning MR Image Reconstruction
Matthew Muckley, Bruno Riemenschneider, A. Radmanesh, Sunwoo Kim, Geunu Jeong, ..., Anuroop Sriram, Zhengnan Huang, N. Yakubova, Yvonne W. Lui, Florian Knoll · 09 Dec 2020 · OOD

Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning
Woosuk Kwon, Gyeong-In Yu, Eunji Jeong, Byung-Gon Chun · 04 Dec 2020

GPUTreeShap: Massively Parallel Exact Calculation of SHAP Scores for Tree Ensembles
Rory Mitchell, E. Frank, G. Holmes · 27 Oct 2020

Pruning Convolutional Filters using Batch Bridgeout
Najeeb Khan, Ian Stavness · 23 Sep 2020

Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede · 17 Sep 2020

Identifying Flux Rope Signatures Using a Deep Neural Network
L. F. G. dos Santos, A. Narock, T. Nieves-chinchilla, Marlon Núñez, M. Kirk · 30 Aug 2020

Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity
Cong Guo, B. Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, M. Guo, Yuhao Zhu · 29 Aug 2020

Self-Organized Operational Neural Networks for Severe Image Restoration Problems
Junaid Malik, S. Kiranyaz, Moncef Gabbouj · 29 Aug 2020

SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference
Ziheng Wang · 26 Aug 2020

Hardware Accelerator for Adversarial Attacks on Deep Learning Neural Networks
Haoqiang Guo, Lu Peng, Jian Zhang, Fang Qi, Lide Duan · 03 Aug 2020 · AAML

The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama, N. Maruyama, Nikoli Dryden, Erin McCarthy, P. Harrington, J. Balewski, Satoshi Matsuoka, Peter Nugent, B. Van Essen · 25 Jul 2020 · 3DV, AI4CE

ICA-UNet: ICA Inspired Statistical UNet for Real-time 3D Cardiac Cine MRI Segmentation
Tianchen Wang, Xiaowei Xu, Jinjun Xiong, Qianjun Jia, Haiyun Yuan, Meiping Huang, Jian Zhuang, Yiyu Shi · 18 Jul 2020

Layer-Parallel Training with GPU Concurrency of Deep Residual Neural Networks via Nonlinear Multigrid
Andrew Kirby, S. Samsi, Michael Jones, Albert Reuther, J. Kepner, V. Gadepally · 14 Jul 2020

Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler · 30 Jun 2020

Sparse GPU Kernels for Deep Learning
Trevor Gale, Matei A. Zaharia, C. Young, Erich Elsen · 18 Jun 2020

BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon, Baeseong Park, S. Kwon, Byeongwook Kim, Jeongin Yun, Dongsoo Lee · 20 May 2020 · MQ

Energy-Aware DNN Graph Optimization
Yu Wang, Rong Ge, Shuang Qiu · 12 May 2020 · GNN

FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural Networks
Kai Zhao, Sheng Di, Sihuan Li, Xin Liang, Yujia Zhai, Jieyang Chen, Kaiming Ouyang, Franck Cappello, Zizhong Chen · 27 Mar 2020

Pipelined Backpropagation at Scale: Training Large Models without Batches
Atli Kosson, Vitaliy Chiley, Abhinav Venigalla, Joel Hestness, Urs Koster · 25 Mar 2020