FireCaffe: near-linear acceleration of deep neural network training on compute clusters

31 October 2015 · arXiv:1511.00175
F. Iandola, Khalid Ashraf, Matthew W. Moskewicz, Kurt Keutzer

Papers citing "FireCaffe: near-linear acceleration of deep neural network training on compute clusters"

28 papers shown

Analysing the Influence of Attack Configurations on the Reconstruction of Medical Images in Federated Learning
M. Dahlgaard, Morten Wehlast Jorgensen, N. Fuglsang, Hiba Nassar · FedML, AAML · 25 Apr 2022

FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction
Liang Gao, H. Fu, Li Li, Yingwen Chen, Minghua Xu, Chengzhong Xu · FedML · 22 Mar 2022

FA-GAN: Fused Attentive Generative Adversarial Networks for MRI Image Super-Resolution
M. Jiang, Min Zhi, Liying Wei, Xiaocheng Yang, Jucheng Zhang, Yongming Li, Pin Wang, Jiahao Huang, Guang Yang · MedIm · 09 Aug 2021

Concurrent Adversarial Learning for Large-Batch Training
Yong Liu, Xiangning Chen, Minhao Cheng, Cho-Jui Hsieh, Yang You · ODL · 01 Jun 2021

Partitioning sparse deep neural networks for scalable training and inference
G. Demirci, Hakan Ferhatosmanoglu · 23 Apr 2021

See through Gradients: Image Batch Recovery via GradInversion
Hongxu Yin, Arun Mallya, Arash Vahdat, J. Álvarez, Jan Kautz, Pavlo Molchanov · FedML · 15 Apr 2021

Towards a Scalable and Distributed Infrastructure for Deep Learning Applications
Bita Hasheminezhad, S. Shirzad, Nanmiao Wu, Patrick Diehl, Hannes Schulz, Hartmut Kaiser · GNN, AI4CE · 06 Oct 2020

PSO-PS: Parameter Synchronization with Particle Swarm Optimization for Distributed Training of Deep Neural Networks
Qing Ye, Y. Han, Yanan Sun, Jiancheng Lv · 06 Sep 2020

CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity Edge Devices
Parth Mannan, A. Samajdar, T. Krishna · 27 Aug 2020

ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network
David Gschwend · 14 May 2020

Characterizing and Modeling Distributed Training with Transient Cloud GPU Servers
Shijian Li, R. Walls, Tian Guo · 07 Apr 2020

Parallelizing Training of Deep Generative Models on Massive Scientific Datasets
S. A. Jacobs, B. Van Essen, D. Hysom, Jae-Seung Yeom, Tim Moon, ..., J. Gaffney, Tom Benson, Peter B. Robinson, L. Peterson, B. Spears · BDL, AI4CE · 05 Oct 2019

PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization
Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi · 31 May 2019

MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling
Jianyu Wang, Anit Kumar Sahu, Zhouyi Yang, Gauri Joshi, S. Kar · 23 May 2019

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh · ODL · 01 Apr 2019

swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight
Jiarui Fang, Liandeng Li, H. Fu, Jinlei Jiang, Wenlai Zhao, Conghui He, Xin You, Guangwen Yang · 16 Mar 2019

Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training
Youjie Li, Hang Qiu, Songze Li, A. Avestimehr, N. Kim, A. Schwing · FedML · 08 Nov 2018

Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks
Kang Liu, Brendan Dolan-Gavitt, S. Garg · AAML · 30 May 2018

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun, Torsten Hoefler · GNN · 26 Feb 2018

Efficient Training of Convolutional Neural Nets on Large Distributed Systems
Sameer Kumar, D. Sreedhar, Vaibhav Saxena, Yogish Sabharwal, Ashish Verma · 02 Nov 2017

Distributed Training Large-Scale Deep Architectures
Shang-Xuan Zou, Chun-Yen Chen, Jui-Lin Wu, Chun-Nan Chou, Chia-Chin Tsao, Kuan-Chieh Tung, Ting-Wei Lin, Cheng-Lung Sung, Edward Y. Chang · 10 Aug 2017

Scaling Deep Learning on GPU and Knights Landing clusters
Yang You, A. Buluç, J. Demmel · GNN · 09 Aug 2017

Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?
A. A. Awan, Ching-Hsiang Chu, Hari Subramoni, D. Panda · GNN · 28 Jul 2017

How to scale distributed deep learning?
Peter H. Jin, Qiaochu Yuan, F. Iandola, Kurt Keutzer · 3DH · 14 Nov 2016

Distributed Training of Deep Neural Networks: Theoretical and Practical Limits of Parallel Scalability
J. Keuper, Franz-Josef Pfreundt · GNN · 22 Sep 2016

A Convolutional Autoencoder for Multi-Subject fMRI Data Aggregation
Po-Hsuan Chen, Xia Zhu, Hejia Zhang, Javier S. Turek, Janice Chen, Theodore L. Willke, Uri Hasson, Peter J. Ramadge · 17 Aug 2016

Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs
Stefan Hadjis, Ce Zhang, Ioannis Mitliagkas, Dan Iter, Christopher Ré · 14 Jun 2016

The Effects of Hyperparameters on SGD Training of Neural Networks
Thomas Breuel · 12 Aug 2015