The Ultimate DataFlow for Ultimate SuperComputers-on-a-Chip, for Scientific Computing, Geo Physics, Complex Mathematics, and Information Processing
V. Milutinovic
Erfan Sadeqi Azer
K. Yoshimoto
Gerhard Klimeck
Miljana Djordjević
Miloš Kotlar
M. Bojovic
Bozidar Miladinovic
Nenad Korolija
S. Stankovic
Nenad Filipović
Z. Babović
Miroslav Kosanic
Akira Tsuda
M. Valero
M. D. Santo
E. Neuhold
Jelena Skoruvcak
L. Dipietro
Ivan Ratković

Abstract
This article starts from the assumption that near-future 100BTransistor SuperComputers-on-a-Chip will include N big multi-core processors, 1000N small many-core processors, a TPU-like fixed-structure systolic array accelerator for the most frequently used Machine Learning algorithms needed in bandwidth-bound applications, and a flexible-structure reprogrammable accelerator for less frequently used Machine Learning algorithms needed in latency-critical applications.