cuDNN: Efficient Primitives for Deep Learning

3 October 2014
Sharan Chetlur
Cliff Woolley
Philippe Vandermersch
Jonathan M. Cohen
J. Tran
Bryan Catanzaro
Evan Shelhamer

Papers citing "cuDNN: Efficient Primitives for Deep Learning"

50 / 236 papers shown
Title
Machine Learning enabled Spectrum Sharing in Dense LTE-U/Wi-Fi Coexistence Scenarios
Adam Dziedzic
V. Sathya
M. I. Rochman
M. Ghosh
S. Krishnan
24
19
0
18 Mar 2020
Exploiting Verified Neural Networks via Floating Point Numerical Error
Kai Jia
Martin Rinard
AAML
37
34
0
06 Mar 2020
Advances in Deep Space Exploration via Simulators & Deep Learning
James Bird
Linda R. Petzold
P. Lubin
Dulia Deacon
8
15
0
10 Feb 2020
Deep Learning on Image Denoising: An overview
Chunwei Tian
Lunke Fei
Wenxian Zheng
Yong-mei Xu
W. Zuo
Chia-Wen Lin
35
815
0
31 Dec 2019
Pipelined Training with Stale Weights of Deep Convolutional Neural Networks
Lifu Zhang
T. Abdelrahman
21
0
0
29 Dec 2019
Array Languages Make Neural Networks Fast
Artjoms Šinkarovs
Hans-Nikolai Vießmann
S. Scholz
20
5
0
11 Dec 2019
A Multigrid Method for Efficiently Training Video Models
Chaoxia Wu
Ross B. Girshick
Kaiming He
Christoph Feichtenhofer
Philipp Krahenbuhl
21
94
0
02 Dec 2019
Enabling Highly Efficient Capsule Networks Processing Through A PIM-Based Architecture Design
Xingyao Zhang
Shuaiwen Leon Song
Chenhao Xie
Jing Wang
Wei-gong Zhang
Xin Fu
25
20
0
07 Nov 2019
MLPerf Inference Benchmark
Vijayarāghava Reḍḍī
C. Cheng
David Kanter
Pete H Mattson
Guenther Schmuelling
...
Bing Yu
George Y. Yuan
Aaron Zhong
P. Zhang
Yuchen Zhou
31
487
0
06 Nov 2019
Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks
Yihui He
Jianing Qian
Jianren Wang
Cindy X. Le
Congrui Hetang
Qi Lyu
Wenping Wang
Tianwei Yue
48
11
0
21 Oct 2019
AI Benchmark: All About Deep Learning on Smartphones in 2019
Andrey D. Ignatov
Radu Timofte
Andrei Kulik
Seungsoo Yang
Ke Wang
Felix Baum
Max Wu
Lirong Xu
Luc Van Gool
ELM
23
218
0
15 Oct 2019
MLPerf Training Benchmark
Arya D. McCarthy
Christine Cheng
Cody Coleman
Greg Diamos
Paulius Micikevicius
...
Carole-Jean Wu
Lingjie Xu
Masafumi Yamazaki
C. Young
Matei A. Zaharia
38
305
0
02 Oct 2019
MIOpen: An Open Source Library For Deep Learning Primitives
Jehandad Khan
Paul Fultz
Artem Tamazov
Daniel Lowell
Chao-Jung Liu
...
Vasilii Filippov
Jing Zhang
Jing Zhou
Bragadeesh Natarajan
Mayank Daga
VLM
MoE
20
38
0
30 Sep 2019
Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator
Tian Zhao
Yaqi Zhang
K. Olukotun
27
16
0
26 Sep 2019
Exascale Deep Learning for Scientific Inverse Problems
N. Laanait
Josh Romero
Junqi Yin
M. T. Young
Sean Treichler
V. Starchenko
A. Borisevich
Alexander Sergeev
Michael A. Matheson
FedML
BDL
35
29
0
24 Sep 2019
$360^\circ$ Surface Regression with a Hyper-Sphere Loss
Antonis Karakottas
N. Zioulis
Stamatis Samaras
Dimitrios Ataloglou
V. Gkitsas
D. Zarpalas
P. Daras
3DH
22
8
0
16 Sep 2019
rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch
Adam Stooke
Pieter Abbeel
OffRL
24
96
0
03 Sep 2019
Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
Saptadeep Pal
Eiman Ebrahimi
A. Zulfiqar
Yaosheng Fu
Victor Zhang
Szymon Migacz
D. Nellans
Puneet Gupta
34
55
0
30 Jul 2019
DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks
Simon Wiedemann
H. Kirchhoffer
Stefan Matlage
Paul Haase
Arturo Marbán
...
Ahmed Osman
D. Marpe
H. Schwarz
Thomas Wiegand
Wojciech Samek
49
93
0
27 Jul 2019
AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation
Hyeongmin Lee
Taeoh Kim
Tae-Young Chung
Daehyun Pak
Yuseok Ban
Sangyoun Lee
30
235
0
24 Jul 2019
Cross-Domain Car Detection Using Unsupervised Image-to-Image Translation: From Day to Night
Vinicius F. Arruda
T. M. Paixão
Rodrigo Berriel
Alberto F. de Souza
C. Badue
N. Sebe
Thiago Oliveira-Santos
ViT
15
103
0
19 Jul 2019
Profiling based Out-of-core Hybrid Method for Large Neural Networks
Yuki Ito
Haruki Imai
Tung D. Le
Yasushi Negishi
K. Kawachiya
R. Matsumiya
Toshio Endo
24
9
0
11 Jul 2019
A Unified Optimization Approach for CNN Model Inference on Integrated GPUs
Leyuan Wang
Zhi Chen
Yizhi Liu
Yao Wang
Lianmin Zheng
Mu Li
Yida Wang
31
30
0
03 Jul 2019
A Winograd-based Integrated Photonics Accelerator for Convolutional Neural Networks
A. Mehrabian
M. Miscuglio
Yousra Alkabani
V. Sorger
T. El-Ghazawi
11
46
0
25 Jun 2019
Parameterized Structured Pruning for Deep Neural Networks
Günther Schindler
Wolfgang Roth
Franz Pernkopf
Holger Froening
24
6
0
12 Jun 2019
DeepCABAC: Context-adaptive binary arithmetic coding for deep neural network compression
Simon Wiedemann
H. Kirchhoffer
Stefan Matlage
Paul Haase
Arturo Marbán
...
Ahmed Osman
D. Marpe
H. Schwarz
Thomas Wiegand
Wojciech Samek
MQ
19
21
0
15 May 2019
MobiVSR: A Visual Speech Recognition Solution for Mobile Devices
Nilay Shrivastava
Astitwa Saxena
Yaman Kumar Singla
Preeti Kaur
Debanjan Mahata
R. Shah
27
3
0
10 May 2019
Cross-Platform Performance Portability Using Highly Parametrized SYCL Kernels
John Lawson
M. Goli
Duncan McBain
Daniel Soutar
Louis Sugy
16
7
0
10 Apr 2019
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight
Jiarui Fang
Liandeng Li
Haohuan Fu
Jinlei Jiang
Wenlai Zhao
Conghui He
Xin You
Guangwen Yang
21
30
0
16 Mar 2019
Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism
Nikoli Dryden
N. Maruyama
Tom Benson
Tim Moon
M. Snir
B. Van Essen
26
49
0
15 Mar 2019
Accelerating Training of Deep Neural Networks with a Standardization Loss
Jasmine Collins
Johannes Ballé
Jonathon Shlens
21
3
0
03 Mar 2019
Real-world Mapping of Gaze Fixations Using Instance Segmentation for Road Construction Safety Applications
Idris Jeelani
Khashayar Asadi
Hariharan Ramshankar
Kevin K. Han
A. Albert
11
5
0
30 Jan 2019
AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks
Jinrong Guo
Wantao Liu
Wang Wang
Q. Lu
Songlin Hu
Jizhong Han
Ruixuan Li
16
9
0
21 Jan 2019
FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review
Ahmad Shawahna
S. M. Sait
A. El-Maleh
28
372
0
01 Jan 2019
An Optical Frontend for a Convolutional Neural Network
S. Colburn
Yiren Chu
Eli Shlizerman
A. Majumdar
27
89
0
23 Dec 2018
wav2letter++: The Fastest Open-source Speech Recognition System
Vineel Pratap
Awni Y. Hannun
Qiantong Xu
Jeff Cai
Jacob Kahn
Gabriel Synnaeve
Vitaliy Liptchinsky
R. Collobert
VLM
18
156
0
18 Dec 2018
SIMD-X: Programming and Processing of Graph Algorithms on GPUs
Hang Liu
Howie Huang
GNN
11
53
0
10 Dec 2018
Analyzing Machine Learning Workloads Using a Detailed GPU Simulator
Jonathan Lew
Deval Shah
Suchita Pati
Shaylin Cattell
Mengchi Zhang
...
Christopher Ng
Negar Goli
Matthew D. Sinclair
Timothy G. Rogers
Tor M. Aamodt
29
65
0
18 Nov 2018
Incremental Deep Learning for Robust Object Detection in Unknown Cluttered Environments
Dongwha Shin
M. Ahmed
P. Rhee
ObjD
26
20
0
13 Oct 2018
LIRS: Enabling efficient machine learning on NVM-based storage via a lightweight implementation of random shuffling
Zhi-Lin Ke
Hsiang-Yun Cheng
Chia-Lin Yang
17
9
0
10 Oct 2018
Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer
Ashnil Kumar
M. Fulham
D. Feng
Jinman Kim
MedIm
27
165
0
05 Oct 2018
AI Benchmark: Running Deep Neural Networks on Android Smartphones
Andrey D. Ignatov
Radu Timofte
William Chou
Ke Wang
Max Wu
Tim Hartley
Luc Van Gool
ELM
21
321
0
02 Oct 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Sangkug Lym
Armand Behroozi
W. Wen
Ge Li
Yongkee Kwon
M. Erez
12
25
0
30 Sep 2018
CBinfer: Exploiting Frame-to-Frame Locality for Faster Convolutional Network Inference on Video Streams
Lukas Cavigelli
Luca Benini
27
26
0
15 Aug 2018
A Domain Guided CNN Architecture for Predicting Age from Structural Brain Images
Pascal Sturmfels
S. Rutherford
Mike Angstadt
Mark Peterson
Chandra S. Sripada
Jenna Wiens
MedIm
27
23
0
11 Aug 2018
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Ningning Ma
Xiangyu Zhang
Haitao Zheng
Jian Sun
51
4,929
0
30 Jul 2018
Recent Advances in Deep Learning: An Overview
Matiur Rahman Minar
Jibon Naher
VLM
24
116
0
21 Jul 2018
Scheduling Computation Graphs of Deep Learning Models on Manycore CPUs
Linpeng Tang
Yida Wang
Theodore L. Willke
Kai Li
GNN
21
22
0
16 Jul 2018
Beyond Data and Model Parallelism for Deep Neural Networks
Zhihao Jia
Matei A. Zaharia
A. Aiken
GNN
AI4CE
38
497
0
14 Jul 2018
A Large-Scale Study on Regularization and Normalization in GANs
Karol Kurach
Mario Lucic
Xiaohua Zhai
Marcin Michalski
Sylvain Gelly
AI4CE
33
155
0
12 Jul 2018