Looped ReLU MLPs May Be All You Need as Practical Programmable Computers

21 February 2025
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao Song
Yufa Zhou
Abstract

Previous work has demonstrated that attention mechanisms are Turing complete. More recently, it has been shown that a looped 9-layer Transformer can function as a universal programmable computer. In contrast, the multi-layer perceptron with ReLU activation (ReLU-MLP), one of the most fundamental components of neural networks, is known to be expressive; specifically, a two-layer neural network is a universal approximator given an exponentially large number of hidden neurons. However, it remains unclear whether a ReLU-MLP can be made into a universal programmable computer using a practical number of weights. In this work, we provide an affirmative answer: a looped 23-layer ReLU-MLP can perform all the basic necessary operations, functioning as a programmable computer more efficiently and effectively than a looped Transformer. This indicates that simple modules have stronger expressive power than previously expected and have not been fully explored. Our work provides insights into the mechanisms of neural networks and demonstrates that complex tasks, such as functioning as a programmable computer, do not necessarily require advanced architectures like Transformers.
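
To make the "looped" idea in the abstract concrete, below is a minimal PyTorch sketch, not the paper's actual construction: a single fixed ReLU-MLP block is applied repeatedly to a state vector, so each loop iteration plays the role of one instruction cycle of the simulated machine. The class name LoopedReLUMLP, the depth of 23 layers, and the chosen widths are illustrative assumptions only.

import torch
import torch.nn as nn

class LoopedReLUMLP(nn.Module):
    """Illustrative sketch (not the paper's exact construction):
    one fixed ReLU-MLP block, applied repeatedly to a state vector."""
    def __init__(self, state_dim: int, hidden_dim: int, num_layers: int = 23):
        super().__init__()
        # Assumed layout: state_dim -> hidden_dim x (num_layers - 1) -> state_dim.
        dims = [state_dim] + [hidden_dim] * (num_layers - 1) + [state_dim]
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers.append(nn.Linear(d_in, d_out))
            layers.append(nn.ReLU())
        layers.pop()  # no ReLU after the final layer
        self.block = nn.Sequential(*layers)

    def forward(self, state: torch.Tensor, num_loops: int) -> torch.Tensor:
        # Each pass through the same weights reads and updates the state
        # vector, which would encode registers/memory in the paper's setting.
        for _ in range(num_loops):
            state = self.block(state)
        return state

# Usage: run a random state vector through ten loop iterations.
mlp = LoopedReLUMLP(state_dim=64, hidden_dim=256)
out = mlp(torch.randn(1, 64), num_loops=10)
print(out.shape)  # torch.Size([1, 64])

The key design point the sketch tries to convey is weight reuse: depth in "computation time" comes from looping the same 23-layer block, not from stacking new parameters.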

View on arXiv
@article{liang2025_2410.09375,
  title={Looped ReLU MLPs May Be All You Need as Practical Programmable Computers},
  author={Yingyu Liang and Zhizhou Sha and Zhenmei Shi and Zhao Song and Yufa Zhou},
  journal={arXiv preprint arXiv:2410.09375},
  year={2025}
}