Deep Learning and Machine Learning with GPGPU and CUDA: Unlocking the Power of Parallel Computing

Ming Li
Ziqian Bi
Tianyang Wang
Yizhu Wen
Qian Niu
Junyu Liu
Benji Peng
Sen Zhang
Jiawei Xu
Jinlang Wang
Keyu Chen
Caitlyn Heqi Yin
Pohsun Feng
Ming Liu
Main: 104 pages
3 figures
Bibliography: 1 page
Appendix: 1 page
Abstract

This book presents a comprehensive exploration of general-purpose computing on graphics processing units (GPGPU) and its applications in deep learning and machine learning. It focuses on how parallel computing, particularly through CUDA (Compute Unified Device Architecture), can supply the computational power that complex tasks demand. The book discusses in detail CPU and GPU architectures, data flow in deep learning, and advanced GPU features such as streams, concurrency, and dynamic parallelism. It also covers practical applications of GPGPU in domains such as scientific computing, machine learning acceleration, real-time rendering, and cryptocurrency mining. The authors emphasize the importance of selecting the right parallel architecture (e.g., GPU, FPGA, TPU, ASIC) for a given task and offer insights into optimizing algorithms for these platforms. Practical examples with popular machine learning frameworks such as PyTorch, TensorFlow, and XGBoost demonstrate how to use GPU resources efficiently in both training and inference. This resource is valuable for both beginners and advanced readers who want to deepen their understanding of GPU-based parallel computing and its role in modern machine learning and AI applications.
