Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers. International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024 |
FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training. IEEE Transactions on Network and Service Management (IEEE TNSM), 2022 |
PERKS: a Locality-Optimized Execution Model for Iterative Memory-bound GPU Applications. International Conference on Supercomputing (ICS), 2022 |
An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks. IEEE International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2020 |