Fast Vocabulary Transfer for Language Model Compression

15 February 2024

Papers citing "Fast Vocabulary Transfer for Language Model Compression"

Title | Authors | Tags and metrics | Date
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation | Luca Moroni, Giovanni Puccetti, Pere-Lluís Huguet Cabot, Andrei Stefan Bejgu, Edoardo Barba, Alessio Miaschi, F. Dell’Orletta, Andrea Esuli, Roberto Navigli | 30 0 0 | 23 Apr 2025
Overcoming Vocabulary Constraints with Pixel-level Fallback | Jonas F. Lotz, Hendra Setiawan, Stephan Peitz, Yova Kementchedjhieva | 38 0 0 | 02 Apr 2025
Cross-Tokenizer Distillation via Approximate Likelihood Matching | Benjamin Minixhofer, Ivan Vulić, E. Ponti | 99 0 0 | 25 Mar 2025
Florenz: Scaling Laws for Systematic Generalization in Vision-Language Models | Julian Spravil, Sebastian Houben, Sven Behnke | VLM 68 0 0 | 12 Mar 2025
Prune or Retrain: Optimizing the Vocabulary of Multilingual Models for Estonian | Aleksei Dorkin, Taido Purason, Kairit Sirts | 28 0 0 | 05 Jan 2025
Efficient Continual Pre-training of LLMs for Low-resource Languages | Arijit Nag, Soumen Chakrabarti, Animesh Mukherjee, Niloy Ganguly | 77 0 0 | 13 Dec 2024
Efficient Online Inference of Vision Transformers by Training-Free Tokenization | Leonidas Gee, Wing Yan Li, V. Sharmanska, Novi Quadrianto | ViT 88 0 0 | 23 Nov 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most? | HyoJung Han, Akiko Eriguchi, Haoran Xu, Hieu T. Hoang, Marine Carpuat, Huda Khayrallah | VLM 32 2 0 | 12 Oct 2024
Generation with Dynamic Vocabulary | Yanting Liu, Tao Ji, Changzhi Sun, Yuanbin Wu, Xiaoling Wang | 40 0 0 | 11 Oct 2024
From Tokens to Words: On the Inner Lexicon of LLMs | Guy Kaplan, Matanel Oren, Yuval Reif, Roy Schwartz | 41 12 0 | 08 Oct 2024
Gradient Routing: Masking Gradients to Localize Computation in Neural Networks | Alex Cloud, Jacob Goldman-Wetzler, Evžen Wybitul, Joseph Miller, Alexander Matt Turner | 21 2 0 | 06 Oct 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging | Anton Alexandrov, Veselin Raychev, Mark Niklas Muller, Ce Zhang, Martin Vechev, Kristina Toutanova | MoMe CLL KELM 38 13 0 | 11 Jul 2024
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models | Nandini Mundra, Aditya Nanda Kishore, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh Khapra | 30 3 0 | 08 Jul 2024
Exploring Design Choices for Building Language-Specific LLMs | Atula Tejaswi, Nilesh Gupta, Eunsol Choi | 27 10 0 | 20 Jun 2024
Zero-Shot Tokenizer Transfer | Benjamin Minixhofer, E. Ponti, Ivan Vulić | VLM 44 9 0 | 13 May 2024
Are Compressed Language Models Less Subgroup Robust? | Leonidas Gee, Andrea Zugarini, Novi Quadrianto | 28 1 0 | 26 Mar 2024
Multi-word Tokenization for Sequence Compression | Leonidas Gee, Leonardo Rigutini, Marco Ernandes, Andrea Zugarini | 18 8 0 | 15 Feb 2024
Getting the most out of your tokenizer for pre-training and domain adaptation | Gautier Dagan, Gabriele Synnaeve, Baptiste Rozière | 32 20 0 | 01 Feb 2024
An energy-based comparative analysis of common approaches to text classification in the Legal domain | S. Gultekin, Achille Globo, Andrea Zugarini, Marco Ernandes, Leonardo Rigutini | 18 1 0 | 02 Nov 2023
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English | Ilias Chalkidis, Abhik Jana, D. Hartung, M. Bommarito, Ion Androutsopoulos, Daniel Martin Katz, Nikolaos Aletras | AILaw ELM 123 247 0 | 03 Oct 2021
Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation | Mitchell A. Gordon, Kevin Duh | CLL VLM 24 13 0 | 05 Mar 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT | Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer | MQ 225 574 0 | 12 Sep 2019