arXiv: 2408.00118
Gemma 2: Improving Open Language Models at a Practical Size

31 July 2024
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
Surya Bhupatiraju
Léonard Hussenot
Thomas Mesnard
Bobak Shahriari
Alexandre Ramé
Johan Ferret
Peter J. Liu
P. Tafti
Abe Friesen
Michelle Casbon
Sabela Ramos
Ravin Kumar
Charline Le Lan
Sammy Jerome
Anton Tsitsulin
Nino Vieillard
Piotr Stańczyk
Sertan Girgin
Nikola Momchev
Matt Hoffman
S. Thakoor
Jean-Bastien Grill
Behnam Neyshabur
Olivier Bachem
Alanna Walton
Aliaksei Severyn
Alicia Parrish
Aliya Ahmad
Allen Hutchison
Alvin Abdagic
Amanda Carl
Amy Shen
Andy Brock
Andy Coenen
Anthony Laforge
Antonia Paterson
Ben Bastian
Bilal Piot
Boxi Wu
Brandon Royal
Charlie Chen
Chintu Kumar
Chris Perry
Christoper A. Welty
Christopher A. Choquette-Choo
Danila Sinopalnikov
David Weinberger
Dimple Vijaykumar
Dominika Rogozińska
D. Herbison
Elisa Bandy
Emma Wang
Eric Noland
Erica Moreira
Evan Senter
Evgenii Eltyshev
Francesco Visin
Gabriel Rasskin
Gary Wei
Glenn Cameron
Gus Martins
Hadi Hashemi
Hanna Klimczak-Plucińska
Harleen Batra
H. Dhand
Ivan Nardini
Jacinda Mein
Jack Zhou
James Svensson
Jeff Stanway
Jetha Chan
Jin Zhou
Joana Carrasqueira
Joana Iljazi
Jocelyn Becker
Joe Fernandez
Joost R. van Amersfoort
Josh Gordon
Josh Lipschultz
Joshua Newlan
Junsong Ji
Kareem Mohamed
Kartikeya Badola
Kat Black
Katie Millican
Keelin McDonell
Kelvin Nguyen
Kiranbir Sodhia
Kish Greene
Lars Lowe Sjoesund
Lauren Usui
Laurent Sifre
L. Heuermann
Leticia Lago
Lilly McNealus
Livio Baldini Soares
Logan Kilpatrick
Lucas Dixon
Luciano Martins
Machel Reid
Manvinder Singh
Mark Iverson
Martin Gorner
Mat Velloso
Mateo Wirth
Matt Davidow
Matt Miller
Matthew Rahtz
Matthew Watson
Meg Risdal
Mehran Kazemi
Michael Moynihan
Ming Zhang
Minsuk Kahng
Minwoo Park
Mofi Rahman
Mohit Khatwani
Natalie Dao
Nenshad Bardoliwalla
Nesh Devanathan
Neta Dumai
Nilay Chauhan
O. Wahltinez
Pankil Botarda
Parker Barnes
P. Barham
Paul Michel
Pengchong Jin
Petko Georgiev
Phil Culliton
Pradeep Kuppala
Ramona Comanescu
Ramona Merhej
Reena Jana
R. Rokni
Rishabh Agarwal
Ryan Mullins
Samaneh Saadat
Sara Mc Carthy
Sarah Perrin
Sébastien Arnold
Sebastian Krause
Shengyang Dai
S. Garg
Shruti Sheth
S. Ronstrom
Susan Chan
Timothy Jordan
Ting-To Yu
Tom Eccles
Tom Hennigan
Tomáš Kočiský
Tulsee Doshi
Vihan Jain
Vikas Yadav
Vilobh Meshram
Vishal Dharmadhikari
Warren Barkley
Wei Wei
Wenming Ye
Woohyun Han
Woosuk Kwon
Xiang Xu
Zhe Shen
Zhitao Gong
Zichuan Wei
Victor Cotruta
Phoebe Kirk
Anand Rao
Minh Giang
Ludovic Peran
T. Warkentin
Eli Collins
Joelle Barral
Zoubin Ghahramani
R. Hadsell
D. Sculley
Jeanine Banks
Anca Dragan
Slav Petrov
Oriol Vinyals
Jeffrey Dean
Demis Hassabis
Koray Kavukcuoglu
Clement Farabet
Elena Buchatskaya
Sebastian Borgeaud
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM, MoE, OSLM

Papers citing "Gemma 2: Improving Open Language Models at a Practical Size"

50 / 657 papers shown
Title
Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
Qixiu Li
Yu Deng
Yaobo Liang
L. Luo
Lei Zhou
...
Hao Chen
Lily Sun
Dong Chen
J. Yang
B. Guo
101
3
0
24 Oct 2025
NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning
Wonje Choi
Jooyoung Kim
Honguk Woo
LRM
88
0
0
22 Oct 2025
Fast Inference via Hierarchical Speculative Decoding
Clara Mohri
Haim Kaplan
Tal Schuster
Yishay Mansour
Amir Globerson
80
0
0
22 Oct 2025
Data-Centric Lessons To Improve Speech-Language Pretraining
Vishaal Udandarao
Zhiyun Lu
Xuankai Chang
Yongqiang Wang
Violet Z. Yao
Albin Madapally Jose
Fartash Faghri
Josh Gardner
Chung-Cheng Chiu
108
0
0
22 Oct 2025
Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
S. Bian
Tao Yu
Shivaram Venkataraman
Youngsuk Park
66
0
0
21 Oct 2025
EduAdapt: A Question Answer Benchmark Dataset for Evaluating Grade-Level Adaptability in LLMs
Numaan Naeem
Abdellah El Mekki
Muhammad Abdul-Mageed
AI4Ed, ELM
154
0
0
20 Oct 2025
DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning
Yongxin He
Shan Zhang
Yixuan Cao
Lei Ma
Ping Luo
DeLMO
156
0
0
20 Oct 2025
ChiKhaPo: A Large-Scale Multilingual Benchmark for Evaluating Lexical Comprehension and Generation in Large Language Models
Emily Chang
Niyati Bafna
ELM
63
0
0
19 Oct 2025
In Generative AI We (Dis)Trust? Computational Analysis of Trust and Distrust in Reddit Discussions
Aria Pessianzadeh
Naima Sultana
Hildegarde Van den Bulck
David Gefen
Shahin Jabari
Rezvaneh Rezapour
48
0
0
17 Oct 2025
Towards Low-Resource Alignment to Diverse Perspectives with Sparse Feedback
Chu Fei Luo
Samuel Dahan
Xiaodan Zhu
60
0
0
17 Oct 2025
CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs
Gucongcong Fan
Chaoyue Niu
Chengfei Lyu
Fan Wu
Guihai Chen
80
1
0
17 Oct 2025
Intent Clustering with Shared Pseudo-Labels
I-Fan Lin
Faegheh Hasibi
Suzan Verberne
VLM
114
0
0
16 Oct 2025
To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models
Anna Hedström
Salim I. Amoukou
Tom Bewley
Saumitra Mishra
Manuela Veloso
LLMSV
136
2
0
15 Oct 2025
VaultGemma: A Differentially Private Gemma Model
Amer Sinha
Thomas Mesnard
Ryan McKenna
Daogao Liu
Christopher A. Choquette-Choo
...
Borja De Balle Pigem
Prem Eruvbetine
T. Warkentin
Armand Joulin
Ravi Kumar
FedML, MoE, VLM, MDE
238
2
0
15 Oct 2025
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search
Kartik Narayan
Yang Xu
Tian Cao
Kavya Nerella
Vishal M. Patel
Navid Shiee
Peter Grasch
Chao Jia
Yinfei Yang
Zhe Gan
ObjD, KELM, VLM
212
3
0
14 Oct 2025
Keep Calm and Avoid Harmful Content: Concept Alignment and Latent Manipulation Towards Safer Answers
Ruben Belo
Cláudia Soares
Marta Guimarães
KELM
79
0
0
14 Oct 2025
Analysing Moral Bias in Finetuned LLMs through Mechanistic Interpretability
Bianca Raimondi
Daniela Dalbagno
Maurizio Gabbrielli
AI4CE
25
0
0
14 Oct 2025
Topological Alignment of Shared Vision-Language Embedding Space
Junwon You
Dasol Kang
Jae-Hun Jung
VLM
68
0
0
13 Oct 2025
Investigating Large Language Models' Linguistic Abilities for Text Preprocessing
Marco Braga
Gian Carlo Milanese
G. Pasi
64
0
0
13 Oct 2025
Don't Walk the Line: Boundary Guidance for Filtered Generation
Sarah Ball
Andreas Haupt
76
1
0
13 Oct 2025
The Curious Case of Factual (Mis)Alignment between LLMs' Short- and Long-Form Answers
Saad Obaid ul Islam
Anne Lauscher
Goran Glavaš
HILM
138
0
0
13 Oct 2025
FactAppeal: Identifying Epistemic Factual Appeals in News Media
Guy Mor-Lan
Tamir Sheafer
Shaul R. Shenhav
HILM
96
0
0
12 Oct 2025
Augmenting Dialog with Think-Aloud Utterances for Modeling Individual Personality Traits by LLM
Seiya Ishikura
Hiroaki Yamada
Tatsuya Hiraoka
Hiroaki Yamada
T. Tokunaga
36
0
0
10 Oct 2025
CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs
Nafiseh Nikeghbal
Amir Hossein Kargaran
Jana Diesner
88
0
0
10 Oct 2025
RetouchLLM: Training-free Code-based Image Retouching with Vision Language Models
Moon Ye-Bin
Roy Miles
Tae-Hyun Oh
Ismail Elezi
Jiankang Deng
OffRL, VLM
103
0
0
09 Oct 2025
Kelp: A Streaming Safeguard for Large Models via Latent Dynamics-Guided Risk Detection
Xiaodan Li
Mengjie Wu
Yao Zhu
Yunna Lv
YueFeng Chen
Cen Chen
Jianmei Guo
H. Xue
KELM
131
0
0
09 Oct 2025
EDUMATH: Generating Standards-aligned Educational Math Word Problems
Bryan R Christ
Penelope Molitz
Jonathan Kropko
Thomas Hartvigsen
66
0
0
08 Oct 2025
Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)
Junki Mori
Kazuya Kakizaki
Taiki Miyagawa
Jun Sakuma
SILM, SyDa
156
0
0
08 Oct 2025
Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation
Arjun Krishnakumar
R. Sukthanker
Hannan Javed Mahadik
Gabriela Kadlecová
Vladyslav Moroshan
Timur Carstensen
Frank Hutter
Aaron Klein
65
0
0
08 Oct 2025
Latent Representation Learning in Heavy-Ion Collisions with MaskPoint Transformer
Jing-Zong Zhang
Shuang Guo
Li-Lin Zhu
Lingxiao Wang
Guo-Liang Ma
96
10
0
08 Oct 2025
TWIST: Training-free and Label-free Short Text Clustering through Iterative Vector Updating with LLMs
I-Fan Lin
Faegheh Hasibi
Suzan Verberne
52
0
0
08 Oct 2025
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA
Elisei Rykov
Kseniia Petrushina
Maksim Savkin
Valerii Olisov
Artem Vazhentsev
Kseniia Titova
Ilseyar Alimova
Vasily Konovalov
Julia Belikova
HILM
129
2
0
06 Oct 2025
Staircase Streaming for Low-Latency Multi-Agent Inference
Junlin Wang
Jue Wang
Zhen
Ben Athiwaratkun
Bhuwan Dhingra
Ce Zhang
James Y. Zou
114
0
0
06 Oct 2025
Activation Steering with a Feedback Controller
Dung V. Nguyen
Hieu M. Vu
Nhi Y. Pham
Lei Zhang
T. Nguyen
LLMSV
155
0
0
05 Oct 2025
Read the Scene, Not the Script: Outcome-Aware Safety for LLMs
Rui Wu
Yihao Quan
Zeru Shi
Zhenting Wang
Yanshu Li
Ruixiang Tang
88
0
0
05 Oct 2025
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks
Ruohao Guo
Afshin Oroojlooy
Roshan Sridhar
Miguel Ballesteros
Alan Ritter
Dan Roth
AAML
94
0
0
02 Oct 2025
Syntactic Blind Spots: How Misalignment Leads to LLMs Mathematical Errors
Dane Williamson
Yangfeng Ji
Matthew B. Dwyer
LRM
55
1
0
02 Oct 2025
Limited Preference Data? Learning Better Reward Model with Latent Space Synthesis
Leitian Tao
Xuefeng Du
Shouqing Yang
SyDa
160
0
0
30 Sep 2025
Reliability Crisis of Reference-free Metrics for Grammatical Error Correction
Takumi Goto
Yusuke Sakai
Taro Watanabe
40
0
0
30 Sep 2025
MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes
Changsheng Zhao
E. Chang
Zechun Liu
Chia-Jung Chang
Wei Wen
...
Rick Cao
Yuandong Tian
Raghuraman Krishnamoorthi
Yangyang Shi
Vikas Chandra
ReLM, LRM
145
2
0
29 Sep 2025
Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models
Jitai Hao
Hao Liu
Xinyan Xiao
Qiang Huang
Jun Yu
116
0
0
29 Sep 2025
AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees
Hongyi Zhou
Jin Zhu
Pingfan Su
Kai Ye
Ying Yang
Shakeel A O B Gavioli-Akilagun
Chengchun Shi
DeLMO
353
1
0
29 Sep 2025
Training Agents Inside of Scalable World Models
Danijar Hafner
Wilson Yan
Timothy Lillicrap
VGen
111
13
0
29 Sep 2025
Scaling with Collapse: Efficient and Predictable Training of LLM Families
Shane Bergsma
Bin Claire Zhang
Nolan Dey
Shaheer Muhammad
Gurpreet Gosal
Joel Hestness
108
2
0
29 Sep 2025
Evaluating Program Semantics Reasoning with Type Inference in System F
Yifeng He
Luning Yang
Christopher Castro Gaw Gonzalo
Hao Chen
ReLM, LRM
391
1
0
28 Sep 2025
LLM Interpretability with Identifiable Temporal-Instantaneous Representation
Xiangchen Song
Jiaqi Sun
Zijian Li
Yujia Zheng
Kun Zhang
80
0
0
27 Sep 2025
Multiplayer Nash Preference Optimization
Fang Wu
X. Y. Huang
Weihao Xuan
Zhiwei Zhang
Yijia Xiao
...
Xiaomin Li
Bing Hu
Peng Xia
Jure Leskovec
Yejin Choi
88
1
0
27 Sep 2025
OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features
Anton Korznikov
Andrey V. Galichin
Alexey Dontsov
Oleg Y. Rogov
Elena Tutubalina
Ivan Oseledets
104
0
0
26 Sep 2025
Blockwise Hadamard high-Rank Adaptation for Parameter-Efficient LLM Fine-Tuning
Feng Yu
Jia Hu
Geyong Min
140
0
0
25 Sep 2025
TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them
Yidong Wang
Yunze Song
Tingyuan Zhu
X. Zhang
Zhuohao Yu
...
Zhen Wu
Xinyu Dai
Yue Zhang
Wei Ye
Shikun Zhang
ALM
170
0
0
25 Sep 2025