Gemma 2: Improving Open Language Models at a Practical Size

31 July 2024
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
Surya Bhupatiraju
Léonard Hussenot
Thomas Mesnard
Bobak Shahriari
Alexandre Ramé
Johan Ferret
Peter J. Liu
P. Tafti
Abe Friesen
Michelle Casbon
Sabela Ramos
Ravin Kumar
Charline Le Lan
Sammy Jerome
Anton Tsitsulin
Nino Vieillard
Piotr Stańczyk
Sertan Girgin
Nikola Momchev
Matt Hoffman
S. Thakoor
Jean-Bastien Grill
Behnam Neyshabur
Olivier Bachem
Alanna Walton
Aliaksei Severyn
Alicia Parrish
Aliya Ahmad
Allen Hutchison
Alvin Abdagic
Amanda Carl
Amy Shen
Andy Brock
Andy Coenen
Anthony Laforge
Antonia Paterson
Ben Bastian
Bilal Piot
Boxi Wu
Brandon Royal
Charlie Chen
Chintu Kumar
Chris Perry
Christopher A. Welty
Christopher A. Choquette-Choo
Danila Sinopalnikov
David Weinberger
Dimple Vijaykumar
Dominika Rogozińska
D. Herbison
Elisa Bandy
Emma Wang
Eric Noland
Erica Moreira
Evan Senter
Evgenii Eltyshev
Francesco Visin
Gabriel Rasskin
Gary Wei
Glenn Cameron
Gus Martins
Hadi Hashemi
Hanna Klimczak-Plucińska
Harleen Batra
H. Dhand
Ivan Nardini
Jacinda Mein
Jack Zhou
James Svensson
Jeff Stanway
Jetha Chan
Jin Zhou
Joana Carrasqueira
Joana Iljazi
Jocelyn Becker
Joe Fernandez
Joost R. van Amersfoort
Josh Gordon
Josh Lipschultz
Joshua Newlan
Junsong Ji
Kareem Mohamed
Kartikeya Badola
Kat Black
Katie Millican
Keelin McDonell
Kelvin Nguyen
Kiranbir Sodhia
Kish Greene
Lars Lowe Sjoesund
Lauren Usui
Laurent Sifre
L. Heuermann
Leticia Lago
Lilly McNealus
Livio Baldini Soares
Logan Kilpatrick
Lucas Dixon
Luciano Martins
Machel Reid
Manvinder Singh
Mark Iverson
Martin Gorner
Mat Velloso
Mateo Wirth
Matt Davidow
Matt Miller
Matthew Rahtz
Matthew Watson
Meg Risdal
Mehran Kazemi
Michael Moynihan
Ming Zhang
Minsuk Kahng
Minwoo Park
Mofi Rahman
Mohit Khatwani
Natalie Dao
Nenshad Bardoliwalla
Nesh Devanathan
Neta Dumai
Nilay Chauhan
O. Wahltinez
Pankil Botarda
Parker Barnes
P. Barham
Paul Michel
Pengchong Jin
Petko Georgiev
Phil Culliton
Pradeep Kuppala
Ramona Comanescu
Ramona Merhej
Reena Jana
R. Rokni
Rishabh Agarwal
Ryan Mullins
Samaneh Saadat
Sara Mc Carthy
Sarah Perrin
Sébastien Arnold
Sebastian Krause
Shengyang Dai
S. Garg
Shruti Sheth
S. Ronstrom
Susan Chan
Timothy Jordan
Ting-To Yu
Tom Eccles
Tom Hennigan
Tomáš Kočiský
Tulsee Doshi
Vihan Jain
Vikas Yadav
Vilobh Meshram
Vishal Dharmadhikari
Warren Barkley
Wei Wei
Wenming Ye
Woohyun Han
Woosuk Kwon
Xiang Xu
Zhe Shen
Zhitao Gong
Zichuan Wei
Victor Cotruta
Phoebe Kirk
Anand Rao
Minh Giang
Ludovic Peran
T. Warkentin
Eli Collins
Joelle Barral
Zoubin Ghahramani
R. Hadsell
D. Sculley
Jeanine Banks
Anca Dragan
Slav Petrov
Oriol Vinyals
Jeffrey Dean
Demis Hassabis
Koray Kavukcuoglu
Clement Farabet
Elena Buchatskaya
Sebastian Borgeaud
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM, MoE, OSLM

Papers citing "Gemma 2: Improving Open Language Models at a Practical Size"

50 / 657 papers shown
Title
Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yuchen Fu
Zifeng Cheng
Zhiwei Jiang
Zhonghui Wang
Yafeng Yin
Zhengliang Li
Qing Gu
LLMAG
208
6
0
16 Dec 2024
RoLargeSum: A Large Dialect-Aware Romanian News Dataset for Summary, Headline, and Keyword Generation
International Conference on Computational Linguistics (COLING), 2024
Andrei-Marius Avram
Mircea Timpuriu
Andreea Iuga
Vlad-Cristian Matei
Iulian-Marius Taiatu
Tudor Găină
Dumitru-Clementin Cercel
Florin-Catalin Pop
Mihaela-Claudia Cercel
308
2
0
15 Dec 2024
Neptune: The Long Orbit to Benchmarking Long Video Understanding
Arsha Nagrani
Ruotong Wang
Ramin Mehran
Rachel Hornung
N. B. Gundavarapu
...
Boqing Gong
Cordelia Schmid
Mikhail Sirotenko
Yukun Zhu
Tobias Weyand
377
14
0
12 Dec 2024
Leveraging Audio and Text Modalities in Mental Health: A Study of LLMs Performance
Abdelrahman A. Ali
Aya E. Fouda
Radwa J. Hanafy
Mohammed E. Fouda
AI4MH
293
7
0
09 Dec 2024
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang
Shengyu Zhang
Jing Zhang
Runyi Hu
Xiaoya Li
Minlie Huang
Jiwei Li
Leilei Gan
G. Wang
Eduard H. Hovy
OffRL
586
46
0
05 Dec 2024
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang
Hui Chen
Jianchao Tan
Xunliang Cai
Zijia Lin
Jiawei Han
Jungong Han
Guiguang Ding
VLM
420
4
0
04 Dec 2024
Medchain: Bridging the Gap Between LLM Agents and Clinical Practice with Interactive Sequence
Jie Liu
Wenxuan Wang
Zizhan Ma
Guolin Huang
Yihang Su
Kao-Jung Chang
Wenting Chen
Haoliang Li
Linlin Shen
Michael R. Lyu
287
12
0
02 Dec 2024
SelfPrompt: Autonomously Evaluating LLM Robustness via Domain-Constrained Knowledge Guidelines and Refined Adversarial Prompts
International Conference on Computational Linguistics (COLING), 2024
Aihua Pei
Zehua Yang
Shunan Zhu
Ruoxi Cheng
Ju Jia
AAML
317
5
0
01 Dec 2024
Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
Kaustubh Ponkshe
Raghav Singhal
Eduard A. Gorbunov
Alexey Tumanov
Samuel Horváth
Praneeth Vepakomma
639
11
0
29 Nov 2024
Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings
Carolin M. Schuster
Maria-Alexandra Dinisor
Shashwat Ghatiwala
Georg Groh
307
3
0
25 Nov 2024
Context Awareness Gate For Retrieval Augmented Generation
Conference on Information and Knowledge Technology (IKT), 2024
Mohammad Hassan Heydari
Arshia Hemmat
Erfan Naman
Afsaneh Fatemi
RALM
300
2
0
25 Nov 2024
Visual-Word Tokenizer: Beyond Fixed Sets of Tokens in Vision Transformers
Leonidas Gee
Wing Yan Li
V. Sharmanska
Novi Quadrianto
ViT
581
0
0
23 Nov 2024
Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge
Ruiyang Qin
Dancheng Liu
Gelei Xu
Zheyu Yan
Chenhui Xu
Yuting Hu
Xiaolin Hu
Jinjun Xiong
Yiyu Shi
Y. Shi
AuLLM
439
2
0
21 Nov 2024
Are Large Language Models Memorizing Bug Benchmarks?
Daniel Ramos
Claudia Mamede
Kush Jain
Paulo Canelas
Catarina Gamboa
Claire Le Goues
PILM, ELM
441
14
0
20 Nov 2024
Reward Modeling with Ordinal Feedback: Wisdom of the Crowd
Shang Liu
Yu Pan
Guanting Chen
Xiaocheng Li
286
3
0
19 Nov 2024
Legal Evalutions and Challenges of Large Language Models
Yuan Liu
Huan Zhao
Zhiyong Yang
Peng Shu
Jianfei Chen
...
Tianli Ding
Yu Bao
Tianming Liu
Xi Jiang
Shanghang Zhang
AILaw, ELM
133
3
0
15 Nov 2024
Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle
Hui Dai
Ryan Teehan
Mengye Ren
KELM, AIFin, ELM
220
7
0
13 Nov 2024
Towards Low-bit Communication for Tensor Parallel LLM Inference
Harry Dong
Tyler Johnson
Minsik Cho
Emad Soroush
MQ
68
3
0
12 Nov 2024
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Bill Yuchen Lin
Radha Poovendran
ALM
342
13
0
11 Nov 2024
Towards Unifying Interpretability and Control: Evaluation via Intervention
Usha Bhalla
Suraj Srinivas
Asma Ghandeharioun
Himabindu Lakkaraju
311
17
0
07 Nov 2024
CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR
Kadir Burak Buldu
Suleyman Ozdel
Ka Hei Carrie Lau
Mengdi Wang
Daniel Saad
Sofie Schönborn
Auxane Boch
Enkelejda Kasneci
Efe Bozkir
281
8
0
07 Nov 2024
Attention Tracker: Detecting Prompt Injection Attacks in LLMs
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Kuo-Han Hung
Ching-Yun Ko
Ambrish Rawat
I-Hsin Chung
Winston H. Hsu
Pin-Yu Chen
343
43
0
01 Nov 2024
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks
Youngjoon Lee
J. Gong
Joonhyuk Kang
249
0
0
31 Oct 2024
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Yunjia Qi
Hao Peng
Xinyu Wang
Bin Xu
Lei Hou
Juanzi Li
341
5
0
31 Oct 2024
Likelihood approximations via Gaussian approximate inference
Thang D. Bui
126
1
0
28 Oct 2024
Group-SAE: Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups
Davide Ghilardi
Federico Belotti
Marco Molinari
Tao Ma
Matteo Palmonari
185
9
0
28 Oct 2024
End-to-end Training for Recommendation with Language-based User Profiles
Zhaolin Gao
Joyce Zhou
Yijia Dai
Thorsten Joachims
AI4Ed
337
11
0
24 Oct 2024
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
International Conference on Learning Representations (ICLR), 2024
Zhangheng Li
Keen You
Hao Zhang
Di Feng
Harsh Agrawal
Xiujun Li
Mohana Prasad Sathya Moorthy
Jeff Nichols
Yue Yang
Zhe Gan
MLLM
400
41
0
24 Oct 2024
ZIP-FIT: Embedding-Free Data Selection via Compression-Based Alignment
Elyas Obbad
Iddah Mlauzi
Alycia Lee
Rylan Schaeffer
Kamal Obbad
Suhana Bedi
Sanmi Koyejo
CVBM
277
0
0
23 Oct 2024
AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Naba Rizvi
Harper Strickland
Daniel Gitelman
Tristan Cooper
Alexis Morales-Flores
...
Haaset Owens
Saleha Ahmedi
Isha Khirwadkar
Imani Munyaka
Nedjma Ousidhoum
343
3
0
21 Oct 2024
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Yu Zhao
Alessio Devoto
Giwon Hong
Xiaotang Du
Aryo Pradipta Gema
Hongru Wang
Xuanli He
Kam-Fai Wong
Pasquale Minervini
KELM, LLMSV
266
44
0
21 Oct 2024
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
ACM Conference on Health, Inference, and Learning (CHIL), 2024
Jialu Tang
Tong Xia
Yuan Lu
Cecilia Mascolo
Aaqib Saeed
AI4MH
301
4
0
18 Oct 2024
Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus
Raviraj Joshi
Kanishk Singla
Anusha Kamath
Raunak Kalani
Rakesh Paul
Utkarsh Vaidya
Sanjay Singh Chauhan
Niranjan Wartikar
Eileen Long
SyDa, CLL
320
19
0
18 Oct 2024
Decomposing The Dark Matter of Sparse Autoencoders
Joshua Engels
Logan Riggs
Max Tegmark
LLMSV
274
29
0
18 Oct 2024
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Xianyang Zhan
Agam Goyal
Yilun Chen
Eshwar Chandrasekharan
Koustuv Saha
AI4MH
857
16
0
17 Oct 2024
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
S. Gorti
Ilan Gofman
Zhaoyan Liu
Jiapeng Wu
Noël Vouitsis
Guangwei Yu
Jesse C. Cresswell
Rasa Hosseinzadeh
SyDa
341
19
0
16 Oct 2024
Conformity in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Xiaochen Zhu
Caiqi Zhang
Tom Stafford
Nigel Collier
Andreas Vlachos
449
6
0
16 Oct 2024
Interpreting token compositionality in LLMs: A robustness analysis
Nura Aljaafari
Danilo S. Carvalho
André Freitas
397
3
0
16 Oct 2024
Survey and Evaluation of Converging Architecture in LLMs based on Footsteps of Operations
IEEE Open Journal of the Computer Society (JOCS), 2024
Seongho Kim
Jihyun Moon
Juntaek Oh
Insu Choi
Joon-Sung Yang
72
0
0
15 Oct 2024
Bias Similarity Measurement: A Black-Box Audit of Fairness Across LLMs
Hyejun Jeong
Shiqing Ma
Amir Houmansadr
374
0
0
15 Oct 2024
In-context KV-Cache Eviction for LLMs via Attention-Gate
Zihao Zeng
Bokai Lin
Tianqi Hou
Hao Zhang
Zhijie Deng
241
7
0
15 Oct 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
International Conference on Learning Representations (ICLR), 2024
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
256
9
0
14 Oct 2024
Locality Alignment Improves Vision-Language Models
International Conference on Learning Representations (ICLR), 2024
Ian Covert
Tony Sun
James Zou
Tatsunori Hashimoto
VLM
497
11
0
14 Oct 2024
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
International Conference on Learning Representations (ICLR), 2024
Yein Park
Chanwoong Yoon
Jungwoo Park
Donghyeon Lee
Minbyul Jeong
Jaewoo Kang
KELM
365
3
0
13 Oct 2024
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback
Yongbin Li
Miao Zheng
Fan Yang
Bin Cui
Tengjiao Wang
Xin Wu
Guosheng Dong
Wentao Zhang
ALM
265
10
0
12 Oct 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Yuxiao Li
Eric J. Michaud
David D. Baek
Joshua Engels
Xiaoqing Sun
Max Tegmark
244
30
0
10 Oct 2024
Extracting and Combining Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
Binghai Wang
Weipeng Chen
Ji-Rong Wen
294
0
0
10 Oct 2024
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
Shenao Zhang
Zhihan Liu
Boyi Liu
Yanzhe Zhang
Yingxiang Yang
Yunxing Liu
Liyu Chen
Tao Sun
Ziyi Wang
467
5
0
10 Oct 2024
Efficient Dictionary Learning with Switch Sparse Autoencoders
International Conference on Learning Representations (ICLR), 2024
Anish Mudide
Joshua Engels
Eric J. Michaud
Max Tegmark
Christian Schroeder de Witt
187
28
0
10 Oct 2024
Context-Augmented Code Generation Using Programming Knowledge Graphs
Iman Saberi
Fatemeh H. Fard
198
4
0
09 Oct 2024