RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

11 April 2024
Aleksandar Botev
Soham De
Samuel L. Smith
Anushan Fernando
George-Christian Muraru
Ruba Haroun
Leonard Berrada
Razvan Pascanu
Pier Giuseppe Sessa
Robert Dadashi
Léonard Hussenot
Johan Ferret
Sertan Girgin
Olivier Bachem
Alek Andreev
Kathleen Kenealy
Thomas Mesnard
Cassidy Hardin
Surya Bhupatiraju
Shreya Pathak
Laurent Sifre
Morgane Riviere
Mihir Kale
J Christopher Love
P. Tafti
Armand Joulin
Noah Fiedel
Evan Senter
Yutian Chen
S. Srinivasan
Guillaume Desjardins
David Budden
Arnaud Doucet
Sharad Vikram
Adam Paszke
Trevor Gale
Sebastian Borgeaud
Charlie Chen
Andy Brock
Antonia Paterson
Jenny Brennan
Meg Risdal
Raj Gundluru
Nesh Devanathan
Paul Mooney
Nilay Chauhan
Phil Culliton
Luiz Gustavo Martins
Elisa Bandy
David W. Huntsperger
Glenn Cameron
Arthur Zucker
T. Warkentin
Ludovic Peran
Minh Giang
Zoubin Ghahramani
Clement Farabet
Koray Kavukcuoglu
Demis Hassabis
R. Hadsell
Yee Whye Teh
Nando de Freitas
Communities: VLM, RALM
Abstract

We introduce RecurrentGemma, an open language model which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned variant. Both models achieve comparable performance to Gemma-2B despite being trained on fewer tokens.
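The two ingredients the abstract names, a linear recurrence with a fixed-size state and local attention, can be illustrated with a minimal JAX sketch. This is not the official RecurrentGemma or Griffin implementation: the hidden width D, window W, the per-channel decay a, and the omission of gating and projections are simplifying assumptions, included only to show why the recurrent state stays constant-size regardless of sequence length.

```python
# Minimal sketch (assumptions, not the Griffin/RG-LRU parameterization):
# a diagonal linear recurrence carried as a fixed-size state, plus causal
# attention restricted to a sliding window of recent positions.
import jax
import jax.numpy as jnp

D, W = 8, 4  # hypothetical hidden width and local-attention window


def linear_recurrence(x, a):
    """h_t = a * h_{t-1} + (1 - a) * x_t, with a single D-sized carry."""
    def step(h, x_t):
        h = a * h + (1.0 - a) * x_t
        return h, h  # carry (fixed-size state) and per-step output

    h0 = jnp.zeros(x.shape[-1])
    _, ys = jax.lax.scan(step, h0, x)
    return ys  # (T, D): state memory does not grow with sequence length


def local_attention(q, k, v, window=W):
    """Causal attention that only looks back `window` positions."""
    T = q.shape[0]
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    idx = jnp.arange(T)
    mask = (idx[:, None] >= idx[None, :]) & (idx[:, None] - idx[None, :] < window)
    scores = jnp.where(mask, scores, -jnp.inf)
    return jax.nn.softmax(scores, axis=-1) @ v


# Toy usage: a length-16 sequence of D-dimensional inputs.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (16, D))
a = jnp.full((D,), 0.9)          # per-channel decay (learned in practice)
h = linear_recurrence(x, a)      # recurrent branch, O(1) state per channel
y = local_attention(h, h, h)     # local-attention branch over window W
print(y.shape)                   # (16, 8)
```

Because the recurrence carries only a D-sized state and the attention window is bounded by W, per-token inference cost is independent of the sequence length, which is the efficiency argument the abstract makes against a full attention cache.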

View on arXiv: https://arxiv.org/abs/2404.07839