ResearchTrend.AI
Gemma 2: Improving Open Language Models at a Practical Size
v1, v2 (latest)

31 July 2024
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
Surya Bhupatiraju
Léonard Hussenot
Thomas Mesnard
Bobak Shahriari
Alexandre Ramé
Johan Ferret
Peter J. Liu
P. Tafti
Abe Friesen
Michelle Casbon
Sabela Ramos
Ravin Kumar
Charline Le Lan
Sammy Jerome
Anton Tsitsulin
Nino Vieillard
Piotr Stańczyk
Sertan Girgin
Nikola Momchev
Matt Hoffman
S. Thakoor
Jean-Bastien Grill
Behnam Neyshabur
Olivier Bachem
Alanna Walton
Aliaksei Severyn
Alicia Parrish
Aliya Ahmad
Allen Hutchison
Alvin Abdagic
Amanda Carl
Amy Shen
Andy Brock
Andy Coenen
Anthony Laforge
Antonia Paterson
Ben Bastian
Bilal Piot
Boxi Wu
Brandon Royal
Charlie Chen
Chintu Kumar
Chris Perry
Christopher A. Welty
Christopher A. Choquette-Choo
Danila Sinopalnikov
David Weinberger
Dimple Vijaykumar
Dominika Rogozińska
D. Herbison
Elisa Bandy
Emma Wang
Eric Noland
Erica Moreira
Evan Senter
Evgenii Eltyshev
Francesco Visin
Gabriel Rasskin
Gary Wei
Glenn Cameron
Gus Martins
Hadi Hashemi
Hanna Klimczak-Plucińska
Harleen Batra
H. Dhand
Ivan Nardini
Jacinda Mein
Jack Zhou
James Svensson
Jeff Stanway
Jetha Chan
Jin Zhou
Joana Carrasqueira
Joana Iljazi
Jocelyn Becker
Joe Fernandez
Joost R. van Amersfoort
Josh Gordon
Josh Lipschultz
Joshua Newlan
Junsong Ji
Kareem Mohamed
Kartikeya Badola
Kat Black
Katie Millican
Keelin McDonell
Kelvin Nguyen
Kiranbir Sodhia
Kish Greene
Lars Lowe Sjoesund
Lauren Usui
Laurent Sifre
L. Heuermann
Leticia Lago
Lilly McNealus
Livio Baldini Soares
Logan Kilpatrick
Lucas Dixon
Luciano Martins
Machel Reid
Manvinder Singh
Mark Iverson
Martin Gorner
Mat Velloso
Mateo Wirth
Matt Davidow
Matt Miller
Matthew Rahtz
Matthew Watson
Meg Risdal
Mehran Kazemi
Michael Moynihan
Ming Zhang
Minsuk Kahng
Minwoo Park
Mofi Rahman
Mohit Khatwani
Natalie Dao
Nenshad Bardoliwalla
Nesh Devanathan
Neta Dumai
Nilay Chauhan
O. Wahltinez
Pankil Botarda
Parker Barnes
P. Barham
Paul Michel
Pengchong Jin
Petko Georgiev
Phil Culliton
Pradeep Kuppala
Ramona Comanescu
Ramona Merhej
Reena Jana
R. Rokni
Rishabh Agarwal
Ryan Mullins
Samaneh Saadat
Sara Mc Carthy
Sarah Perrin
Sébastien Arnold
Sebastian Krause
Shengyang Dai
S. Garg
Shruti Sheth
S. Ronstrom
Susan Chan
Timothy Jordan
Ting-To Yu
Tom Eccles
Tom Hennigan
Tomáš Kočiský
Tulsee Doshi
Vihan Jain
Vikas Yadav
Vilobh Meshram
Vishal Dharmadhikari
Warren Barkley
Wei Wei
Wenming Ye
Woohyun Han
Woosuk Kwon
Xiang Xu
Zhe Shen
Zhitao Gong
Zichuan Wei
Victor Cotruta
Phoebe Kirk
Anand Rao
Minh Giang
Ludovic Peran
T. Warkentin
Eli Collins
Joelle Barral
Zoubin Ghahramani
R. Hadsell
D. Sculley
Jeanine Banks
Anca Dragan
Slav Petrov
Oriol Vinyals
Jeffrey Dean
Demis Hassabis
Koray Kavukcuoglu
Clement Farabet
Elena Buchatskaya
Sebastian Borgeaud
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM, MoE, OSLM

Papers citing "Gemma 2: Improving Open Language Models at a Practical Size"

50 / 657 papers shown
Title
Can LLMs extract human-like fine-grained evidence for evidence-based fact-checking?
Antonín Jarolím
Martin Fajčík
Lucia Makaiová
68
0
0
26 Nov 2025
Chatty-KG: A Multi-Agent AI System for On-Demand Conversational Question Answering over Knowledge Graphs
Reham Omar
Abdelghny Orogat
Ibrahim Abdelaziz
Omij Mangukiya
Panos Kalnis
Essam Mansour
118
0
0
26 Nov 2025
SingingSDS: A Singing-Capable Spoken Dialogue System for Conversational Roleplay Applications
Jionghao Han
Jiatong Shi
Masao Someki
Yuxun Tang
Lan Liu
Yiwen Zhao
Wenhao Feng
Shinji Watanabe
VLM
96
0
0
26 Nov 2025
ParaBlock: Communication-Computation Parallel Block Coordinate Federated Learning for Large Language Models
Yujia Wang
Yuanpu Cao
Jinghui Chen
FedML
198
0
0
25 Nov 2025
On Evaluating LLM Alignment by Evaluating LLMs as Judges
Yixin Liu
Pengfei Liu
Arman Cohan
ELM
141
0
0
25 Nov 2025
STORE: Semantic Tokenization, Orthogonal Rotation and Efficient Attention for Scaling Up Ranking Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Y. Xu
Chaofan Fan
J. Hu
Yu Zhang
Zeng Xiaoyi
J. Zhang
120
1
0
24 Nov 2025
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
Y. Fu
Xin Dong
Shizhe Diao
Matthijs Van Keirsbilck
Hanrong Ye
...
Maksim Khadkevich
A. Keller
Jan Kautz
Y. Lin
Pavlo Molchanov
90
0
0
24 Nov 2025
Findings of the BlackboxNLP 2025 Shared Task: Localizing Circuits and Causal Variables in Language Models
Dana Arad
Yonatan Belinkov
Hanjie Chen
Najoung Kim
Hosein Mohebbi
Aaron Mueller
Gabriele Sarti
Martin Tutek
24
0
0
23 Nov 2025
Building Resilient Information Ecosystems: Large LLM-Generated Dataset of Persuasion Attacks
Hsien-Te Kao
Aleksey Panasyuk
Peter Bautista
William Dupree
Gabriel Ganberg
Jeffrey M. Beaubien
Laura Cassani
Svitlana Volkova
AAML
75
0
0
23 Nov 2025
Estonian WinoGrande Dataset: Comparative Analysis of LLM Performance on Human and Machine Translation
Marii Ojastu
Hele-Andra Kuulmets
Aleksei Dorkin
Marika Borovikova
Dage Särg
Kairit Sirts
110
0
0
21 Nov 2025
Enhancing Breast Cancer Prediction with LLM-Inferred Confounders
Debmita Roy
AI4CE
132
0
0
20 Nov 2025
Anatomy of an Idiom: Tracing Non-Compositionality in Language Models
Andrew Gomes
112
0
0
20 Nov 2025
Global Resolution: Optimal Multi-Draft Speculative Sampling via Convex Minimization
Rahul Thomas
Arka Pal
44
0
0
19 Nov 2025
Fifty Shades of Greenwashing: The Political Economy of Climate Change Advertising on Social Media
Robert Kubinec
Aseem Mahajan
40
0
0
18 Nov 2025
Hierarchical Token Prepending: Enhancing Information Flow in Decoder-based LLM Embeddings
Xueying Ding
Xingyue Huang
Mingxuan Ju
Liam Collins
Yozen Liu
Leman Akoglu
Neil Shah
Tong Zhao
67
0
0
18 Nov 2025
Dropouts in Confidence: Moral Uncertainty in Human-LLM Alignment
Jea Kwon
L. Vecchietti
Sungwon Park
Meeyoung Cha
44
0
0
17 Nov 2025
Preference Learning from Physics-Based Feedback: Tuning Language Models to Design BCC/B2 Superalloys
Satanu Ghosh
Collin Holgate
Neal R. Brodnik
Doug Downey
Samantha Daly
Tresa M. Pollock
Samuel Carton
28
0
0
15 Nov 2025
Mixture of States: Routing Token-Level Dynamics for Multimodal Generation
Haozhe Liu
Ding Liu
Mingchen Zhuge
Zijian Zhou
Tian Xie
...
Juan-Manuel Perez-Rua
Tao Xiang
Wei Liu
Shikun Liu
Jürgen Schmidhuber
68
0
0
15 Nov 2025
KVSwap: Disk-aware KV Cache Offloading for Long-Context On-device Inference
H. Zhang
Chunwei Xia
Zheng Wang
SyDa
188
0
0
14 Nov 2025
Defending Unauthorized Model Merging via Dual-Stage Weight Protection
Wei-Jia Chen
Min-Yen Tsai
Cheng-Yi Lee
Chia-Mu Yu
MoMe, AAML
305
0
0
14 Nov 2025
Bench360: Benchmarking Local LLM Inference from 360°
Linus Stuhlmann
Mauricio Fadel Argerich
Jonathan Fürst
ELM
69
0
0
12 Nov 2025
ParliaBench: An Evaluation and Benchmarking Framework for LLM-Generated Parliamentary Speech
Marios Koniaris
Argyro Tsipi
Panayiotis Tsanakas
AILaw, ELM
256
0
0
11 Nov 2025
Surgical Agent Orchestration Platform for Voice-directed Patient Data Interaction
Hyeryun Park
Byung Mo Gu
Jun Hee Lee
Byeong Hyeon Choi
Sekeun Kim
Hyun Koo Kim
Kyungsang Kim
209
0
0
10 Nov 2025
More Agents Helps but Adversarial Robustness Gap Persists
Khashayar Alavi
Zhastay Yeltay
Lucie Flek
Akbar Karimi
AAML
80
0
0
10 Nov 2025
FinRpt: Dataset, Evaluation System and LLM-based Multi-agent Framework for Equity Research Report Generation
Song Jin
Shuqi Li
Shukun Zhang
Rui Yan
AIFin
409
0
0
10 Nov 2025
Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs
Yingfeng Luo
Ziqiang Xu
Yuxuan Ouyang
Murun Yang
Dingyang Lin
...
Bei Li
Peinan Feng
Quan Du
Tong Xiao
Jingbo Zhu
LRM
186
0
0
10 Nov 2025
Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs
Zhongyang Li
Ziyue Li
Tianyi Zhou
MoE, MoMe
491
0
0
10 Nov 2025
Visual Exploration of Feature Relationships in Sparse Autoencoders with Curated Concepts
Xinyuan Yan
Shusen Liu
Kowshik Thopalli
Bei Wang
104
0
0
08 Nov 2025
Retrieval-Augmented Generation in Medicine: A Scoping Review of Technical Implementations, Clinical Applications, and Ethical Considerations
Rui Yang
Matthew Yu Heng Wong
Huitao Li
Xin Li
Wentao Zhu
...
J. Ong
Douglas Teodoro
Chuan Hong
Daniel Ting
Nan Liu
3DV
233
0
0
08 Nov 2025
Are We Aligned? A Preliminary Investigation of the Alignment of Responsible AI Values between LLMs and Human Judgment
Asma Z. Yamani
Malak Baslyman
Moataz Ahmed
147
0
0
06 Nov 2025
GEMMA-SQL: A Novel Text-to-SQL Model Based on Large Language Models
Hari Mohan Pandey
Anshul Gupta
Subham Sarkar
Minakshi Tomer
Schneider Johannes
Yan Gong
VLM
60
0
0
05 Nov 2025
Epidemiology of Large Language Models: A Benchmark for Observational Distribution Knowledge
Drago Plečko
Patrik Okanovic
Torsten Hoefler
Elias Bareinboim
112
0
0
04 Nov 2025
In Good GRACEs: Principled Teacher Selection for Knowledge Distillation
IEEE Computer Architecture Letters (CAL), 2025
A. Panigrahi
Bingbin Liu
Sadhika Malladi
Sham Kakade
Surbhi Goel
120
0
0
04 Nov 2025
Dynamic Reflections: Probing Video Representations with Text Alignment
Tyler Zhu
Tengda Han
Leonidas Guibas
Viorica Patraucean
M. Ovsjanikov
VGen
209
0
0
04 Nov 2025
Improving Romanian LLM Pretraining Data using Diversity and Quality Filtering
Vlad Negoita
Mihai Masala
Traian Rebedea
66
0
0
02 Nov 2025
OpenSIR: Open-Ended Self-Improving Reasoner
Wai-Chung Kwan
Joshua Ong Jun Leang
Pavlos Vougiouklis
Jeff Z. Pan
Marco Valentino
Pasquale Minervini
ReLM, LRM
196
0
0
01 Nov 2025
Angular Steering: Behavior Control via Rotation in Activation Space
Hieu M. Vu
T. Nguyen
LLMSV
252
3
0
30 Oct 2025
Retrieval and Argumentation Enhanced Multi-Agent LLMs for Judgmental Forecasting
Deniz Gorur
Antoni Rago
Francesca Toni
123
0
0
28 Oct 2025
Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures
T. Chang
Catherine Arnett
Abdelrahman Eldesokey
Abdelrahman Sadallah
Abeer Kashar
...
Francesco Orabona
Francesco Periti
Gbenga Kayode Solomon
Gia Nghia Ngo
Gloria Udhehdhe-oze
LRM, ELM
116
1
0
28 Oct 2025
HACK: Hallucinations Along Certainty and Knowledge Axes
Adi Simhi
Jonathan Herzig
Itay Itzhak
Dana Arad
Zorik Gekhman
Roi Reichart
Fazl Barez
Gabriel Stanovsky
Idan Szpektor
Yonatan Belinkov
84
0
0
28 Oct 2025
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
Siheng Xiong
Joe Zou
Faramarz Fekri
Yae Jee Cho
36
0
0
28 Oct 2025
Breaking the Benchmark: Revealing LLM Bias via Minimal Contextual Augmentation
Kaveh Eskandari Miandoab
M. Kamruzzaman
Arshia Gharooni
Gene Louis Kim
Vasanth Sarathy
Ninareh Mehrabi
68
0
0
27 Oct 2025
Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
Ran Xu
Jingjing Chen
Jiayu Ye
Yu Wu
Jun Yan
Carl Yang
Hongkun Yu
ELM, LRM
170
2
0
27 Oct 2025
FARMER: Flow AutoRegressive Transformer over Pixels
Guangting Zheng
Qinyu Zhao
Tao Yang
Fei Xiao
Zhijie Lin
Jie Wu
Jiajun Deng
Y. Zhang
Rui Zhu
VGen
178
4
0
27 Oct 2025
Beyond Understanding: Evaluating the Pragmatic Gap in LLMs' Cultural Processing of Figurative Language
Mena Attia
Aashiq Muhamed
Mai AlKhamissi
Thamar Solorio
Mona Diab
53
0
0
27 Oct 2025
A Comprehensive Dataset for Human vs. AI Generated Text Detection
Rajarshi Roy
Nasrin Imanpour
Ashhar Aziz
Shashwat Bajpai
Gurpreet Singh
...
Vasu Sharma
Aishwarya N. Reganti
Vinija Jain
Aman Chadha
Amitava Das
DeLMO
332
0
0
26 Oct 2025
Evaluating LLMs' Reasoning Over Ordered Procedural Steps
Adrita Anika
Md Messal Monem Miah
LRM
57
0
0
25 Oct 2025
Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
Qixiu Li
Yu Deng
Yaobo Liang
L. Luo
Lei Zhou
...
Hao Chen
Lily Sun
Dong Chen
J. Yang
B. Guo
101
3
0
24 Oct 2025
Flight Delay Prediction via Cross-Modality Adaptation of Large Language Models and Aircraft Trajectory Representation
Thaweerath Phisannupawong
J. J. Damanik
Han-Lim Choi
111
0
0
24 Oct 2025
Large Language Models as Model Organisms for Human Associative Learning
Camila Kolling
Vy A. Vo
Mariya Toneva
KELM
148
0
0
24 Oct 2025