
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models (arXiv:2404.07839)

11 April 2024
Aleksandar Botev, Soham De, Samuel L. Smith, Anushan Fernando, George-Christian Muraru, Ruba Haroun, Leonard Berrada, Razvan Pascanu, Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Sertan Girgin, Olivier Bachem, Alek Andreev, Kathleen Kenealy, Thomas Mesnard, Cassidy Hardin, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Riviere, Mihir Kale, J Christopher Love, P. Tafti, Armand Joulin, Noah Fiedel, Evan Senter, Yutian Chen, S. Srinivasan, Guillaume Desjardins, David Budden, Arnaud Doucet, Sharad Vikram, Adam Paszke, Trevor Gale, Sebastian Borgeaud, Charlie Chen, Andy Brock, Antonia Paterson, Jenny Brennan, Meg Risdal, Raj Gundluru, Nesh Devanathan, Paul Mooney, Nilay Chauhan, Phil Culliton, Luiz Gustavo Martins, Elisa Bandy, David W. Huntsperger, Glenn Cameron, Arthur Zucker, T. Warkentin, Ludovic Peran, Minh Giang, Zoubin Ghahramani, Clement Farabet, Koray Kavukcuoglu, Demis Hassabis, R. Hadsell, Yee Whye Teh, Nando de Freitas
VLM, RALM

Papers citing "RecurrentGemma: Moving Past Transformers for Efficient Open Language Models"

25 / 25 papers shown
Overflow Prevention Enhances Long-Context Recurrent LLMs
Assaf Ben-Kish, Itamar Zimerman, M. Jehanzeb Mirza, James R. Glass, Leonid Karlinsky, Raja Giryes
LRM · 12 May 2025
Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation
Biao Zhang, Fedor Moiseev, Joshua Ainslie, Paul Suganthan, Min Ma, Surya Bhupatiraju, Fede Lebron, Orhan Firat, Armand Joulin, Zhe Dong
AI4CE · 08 Apr 2025
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference
M. Beck, Korbinian Poppel, Phillip Lippe, Richard Kurle, P. Blies, Günter Klambauer, Sebastian Böck, Sepp Hochreiter
LRM · 17 Mar 2025
Can Small Language Models Reliably Resist Jailbreak Attacks? A Comprehensive Evaluation
Wenhui Zhang, Huiyu Xu, Zhibo Wang, Zeqing He, Ziqi Zhu, Kui Ren
AAML, PILM · 09 Mar 2025
Optimizing Large Language Models for ESG Activity Detection in Financial Texts
Mattia Birti, Francesco Osborne, Andrea Maurino
28 Feb 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling, Kang Liu, Kai Yan, Yue Yang, Weijian Lin, Ting-Han Fan, Lingfeng Shen, Zhengyin Du, Jiecao Chen
ReLM, ELM, LRM · 25 Jan 2025
Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan, Zhuang Wang, Zhen Jia, Can Karakus, Luca Zancato, Tri Dao, Ravi Netravali, Yida Wang
28 Nov 2024
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar
22 Oct 2024
MatMamba: A Matryoshka State Space Model
Abhinav Shukla, Sai H. Vemprala, Aditya Kusupati, Ashish Kapoor
Mamba · 09 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He, Hang Yu, Zi Gong, Shizhan Liu, Jason Li, Weiyao Lin
VLM · 09 Oct 2024
Falcon Mamba: The First Competitive Attention-free 7B Language Model
Jingwei Zuo, Maksim Velikanov, Dhia Eddine Rhaiem, Ilyas Chahed, Younes Belkada, Guillaume Kunsch, Hakim Hacid
ALM · 07 Oct 2024
The Factuality of Large Language Models in the Legal Domain
Rajaa El Hamdani, Thomas Bonald, Fragkiskos D. Malliaros, Nils Holzenberger, Fabian M. Suchanek
AILaw, HILM · 18 Sep 2024
Mamba-PTQ: Outlier Channels in Recurrent Large Language Models
Alessandro Pierro, Steven Abreu
MQ, Mamba · 17 Jul 2024
How Well Can a Long Sequence Model Model Long Sequences? Comparing Architectural Inductive Biases on Long-Context Abilities
Jerry Huang
11 Jul 2024
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
Jiayi Yuan, Hongyi Liu, Shaochen Zhong, Yu-Neng Chuang, ..., Hongye Jin, V. Chaudhary, Zhaozhuo Xu, Zirui Liu, Xia Hu
01 Jul 2024
ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models
Ahmed Heakl, Youssef Mohamed, Noran Mohamed, Aly Elsharkawy, A. Zaky
26 Jun 2024
PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos
Steven Abreu, Tiffany D. Do, Ruofei Du, Eric J. Gonzalez, Lee Payne, Daniel J. McDuff, Mar Gonzalez-Franco
14 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen
Mamba · 11 Jun 2024
SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM
Quandong Wang, Yuxuan Yuan, Xiaoyu Yang, Ruike Zhang, Kang Zhao, Wei Liu, Jian Luan, Daniel Povey, Bin Wang
03 Jun 2024
Universal In-Context Approximation By Prompting Fully Recurrent Models
Aleksandar Petrov, Tom A. Lamb, Alasdair Paren, Philip Torr, Adel Bibi
LRM · 03 Jun 2024
Pretrained Hybrids with MAD Skills
Nicholas Roberts, Samuel Guo, Zhiqi Gao, Satya Sai Srinath Namburi, Sonia Cromp, Chengjun Wu, Chengyu Duan, Frederic Sala
Mamba · 02 Jun 2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George-Christian Muraru, ..., David Budden, Yee Whye Teh, Razvan Pascanu, Nando de Freitas, Çağlar Gülçehre
Mamba · 29 Feb 2024
Transformers and Cortical Waves: Encoders for Pulling In Context Across Time
L. Muller, P. Churchland, T. Sejnowski
25 Jan 2024
Unlocking the Potential of ChatGPT: A Comprehensive Exploration of its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing
Walid Hariri
AI4MH, LM&MA · 27 Mar 2023
Resurrecting Recurrent Neural Networks for Long Sequences
Antonio Orvieto, Samuel L. Smith, Albert Gu, Anushan Fernando, Çağlar Gülçehre, Razvan Pascanu, Soham De
11 Mar 2023