Gated recurrent neural networks discover attention
arXiv:2309.01775 · 4 September 2023
Nicolas Zucchet, Seijin Kobayashi, Yassir Akram, J. Oswald, Maxime Larcher, Angelika Steger, João Sacramento
Papers citing "Gated recurrent neural networks discover attention" (8 of 8 papers shown):
Learning Randomized Algorithms with Transformers
J. Oswald, Seijin Kobayashi, Yassir Akram, Angelika Steger · 20 Aug 2024

Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
Yingcong Li, A. S. Rawat, Samet Oymak · 13 Jul 2024

Fine-tuned network relies on generic representation to solve unseen cognitive task
Dongyan Lin · 27 Jun 2024

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber, Carmen Amo Alonso, A. Didier, M. Zeilinger, Antonio Orvieto · 24 May 2024

State Space Model for New-Generation Network Alternative to Transformers: A Survey
Xiao Wang, Shiao Wang, Yuhe Ding, Yuehang Li, Wentao Wu, ..., Bowei Jiang, Chenglong Li, Yaowei Wang, Yonghong Tian, Jin Tang · 15 Apr 2024

Meta predictive learning model of languages in neural circuits
Chan Li, Junbin Qiu, Haiping Huang · 08 Sep 2023

Resurrecting Recurrent Neural Networks for Long Sequences
Antonio Orvieto, Samuel L. Smith, Albert Gu, Anushan Fernando, Çağlar Gülçehre, Razvan Pascanu, Soham De · 11 Mar 2023

In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah · 24 Sep 2022