
Which scaling rule applies to Artificial Neural Networks

Abstract

Although an ANN is a biology-mimicking system, it is built from components designed and fabricated for conventional computing, and it is created by experts trained in conventional computing; all of them work within the classic computing paradigm. As von Neumann warned in his classic "First Draft", because the model he used neglects data transfer time, using a "too fast processor" vitiates the procedure; furthermore, using his paradigm to imitate neuronal operations is unsound. It is therefore at least doubly unsound to apply his paradigm when describing the scaling of ANNs. Common experience shows that building actively cooperating and communicating computing systems from segregated single processors runs into severe performance limitations, a fact that cannot be explained within the classic paradigm. The achievable payload computing performance of such systems depends sensitively on their workload type, and this effect is only poorly understood. The workload type that AI-based systems generate leads to exceptionally low payload computational performance. Unfortunately, the initial successes of demo systems that comprise only a few "neurons" and solve simple tasks are misleading: the scaling of processor-based ANN systems is strongly nonlinear. The paper discusses some major limiting factors that affect their performance. It points out that building biology-mimicking large systems inevitably requires drastic changes to the present computing paradigm: instead of neglecting the transfer time, a proper method of accounting for it must be developed. Considering temporal behavior enables us to comprehend the technical implementation of computing components and architectures.
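The "strongly nonlinear" scaling claimed above can be illustrated with a toy Amdahl-style model (this is an illustrative sketch, not the paper's own formulation; the function name and parameter values are assumptions): if each additional processor adds a fixed amount of non-payload communication time, the effective payload speedup first grows, then saturates, and finally declines as the processor count increases.

```python
# Toy model (illustrative only, not the paper's formulation):
# Amdahl-style speedup with a per-processor communication overhead.
# As the core count n grows, the non-payload (communication) term
# comm_per_core * n grows linearly, so payload speedup is strongly
# nonlinear: it peaks and then collapses.

def payload_speedup(n, serial_frac=0.01, comm_per_core=1e-4):
    """Effective speedup of n cores on a task with a fixed serial
    fraction and communication cost growing linearly with n.
    (serial_frac and comm_per_core are assumed, illustrative values.)"""
    parallel_frac = 1.0 - serial_frac
    # Normalized execution time: serial part + parallelized part
    # + total communication overhead.
    total_time = serial_frac + parallel_frac / n + comm_per_core * n
    return 1.0 / total_time

if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 10000, 100000):
        print(f"N = {n:>6}: payload speedup = {payload_speedup(n):8.2f}")
```

With these assumed parameters the speedup peaks near a hundred cores and drops below unity at very large core counts, which is the qualitative behavior the abstract attributes to processor-based ANN systems.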
