The Ultimate DataFlow for Ultimate SuperComputers-on-a-Chip, for
Scientific Computing, Geo Physics, Complex Mathematics, and Information
Processing
Mediterranean Conference on Embedded Computing (MECO), 2020
Abstract
This article starts from the assumption that near-future 100BTransistor SuperComputers-on-a-Chip will include N big multi-core processors; 1000N small many-core processors; a TPU-like fixed-structure systolic array accelerator for the most frequently used Machine Learning algorithms, needed in bandwidth-bound applications; and a flexible-structure reprogrammable accelerator for less frequently used Machine Learning algorithms, needed in latency-critical applications.
