6
6

The Ultimate DataFlow for Ultimate SuperComputers-on-a-Chip, for Scientific Computing, Geo Physics, Complex Mathematics, and Information Processing

Abstract

This article starts from the assumption that near future 100BTransistor SuperComputers-on-a-Chip will include N big multi-core processors, 1000N small many-core processors, a TPU-like fixed-structure systolic array accelerator for the most frequently used Machine Learning algorithms needed in bandwidth-bound applications and a flexible-structure reprogrammable accelerator for less frequently used Machine Learning algorithms needed in latency-critical applications.

View on arXiv
Comments on this paper