ResearchTrend.AI
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens
Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi
30 January 2024

Papers citing "Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens"

42 / 42 papers shown

Safety Pretraining: Toward the Next Generation of Safe AI
Pratyush Maini, Sachin Goyal, Dylan Sam, Alex Robey, Yash Savani, Yiding Jiang, Andy Zou, Zachary C. Lipton, J. Zico Kolter
23 Apr 2025

Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
Tyler A. Chang, Benjamin Bergen
21 Apr 2025

ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
Tong Chen, Faeze Brahman, Jiacheng Liu, Niloofar Mireshghallah, Weijia Shi, Pang Wei Koh, Luke Zettlemoyer, Hannaneh Hajishirzi
20 Apr 2025

Beyond Memorization: Mapping the Originality-Quality Frontier of Language Models
Vishakh Padmakumar, Chen Yueh-Han, Jane Pan, Valerie Chen, He He
13 Apr 2025

On Language Models' Sensitivity to Suspicious Coincidences
Sriram Padmanabhan, Kanishka Misra, Kyle Mahowald, Eunsol Choi
13 Apr 2025 (tags: ReLM, LRM)

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
Jiacheng Liu, Taylor Blanton, Yanai Elazar, Sewon Min, YenSung Chen, ..., Sophie Lebrecht, Yejin Choi, Hannaneh Hajishirzi, Ali Farhadi, Jesse Dodge
09 Apr 2025

Not All Data Are Unlearned Equally
Aravind Krishnan, Siva Reddy, Marius Mosbach
07 Apr 2025 (tags: MU)

SuperBPE: Space Travel for Language Models
Alisa Liu, J. Hayase, Valentin Hofmann, Sewoong Oh, Noah A. Smith, Yejin Choi
17 Mar 2025

Synthesizing Privacy-Preserving Text Data via Finetuning without Finetuning Billion-Scale LLMs
Bowen Tan, Zheng Xu, Eric P. Xing, Zhiting Hu, Shanshan Wu
16 Mar 2025 (tags: SyDa)

Data Caricatures: On the Representation of African American Language in Pretraining Corpora
Nicholas Deas, Blake Vente, Amith Ananthram, Jessica A. Grieser, D. Patton, Shana Kleiner, James Shepard, Kathleen McKeown
13 Mar 2025

Transferring Extreme Subword Style Using Ngram Model-Based Logit Scaling
Craig Messner, Tom Lippincott
11 Mar 2025

Understanding the Limits of Lifelong Knowledge Editing in LLMs
Lukas Thede, Karsten Roth, Matthias Bethge, Zeynep Akata, Tom Hartvigsen
07 Mar 2025 (tags: KELM, CLL)

Shades of Zero: Distinguishing Impossibility from Inconceivability
Jennifer Hu, Felix Sosa, T. Ullman
27 Feb 2025

Theoretical Benefit and Limitation of Diffusion Language Model
Guhao Feng, Yihan Geng, Jian-Yu Guan, Wei Yu Wu, Liwei Wang, Di He
13 Feb 2025 (tags: DiffM)

A General Framework for Inference-time Scaling and Steering of Diffusion Models
R. Singhal, Zachary Horvitz, Ryan Teehan, Mengye Ren, Zhou Yu, Kathleen McKeown, Rajesh Ranganath
17 Jan 2025 (tags: DiffM)

A Statistical and Multi-Perspective Revisiting of the Membership Inference Attack in Large Language Models
Bowen Chen, Namgi Han, Yusuke Miyao
18 Dec 2024

QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs
Mohammad Aflah Khan, Neemesh Yadav, Sarah Masud, Md. Shad Akhtar
16 Dec 2024

N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs
Ilya Zisman, Alexander Nikulin, Andrei Polubarov, Nikita Lyubaykin, Vladislav Kurenkov, Igor Kiselev
04 Nov 2024 (tags: OffRL)

Interpretable Language Modeling via Induction-head Ngram Models
Eunji Kim, Sriya Mantena, Weiwei Yang, Chandan Singh, Sungroh Yoon, Jianfeng Gao
31 Oct 2024

Jet Expansions of Residual Computation
Yihong Chen, Xiangxiang Xu, Yao Lu, Pontus Stenetorp, Luca Franceschi
08 Oct 2024

Probing Language Models on Their Knowledge Source
Zineddine Tighidet, Andrea Mogini, Jiali Mei, Benjamin Piwowarski, Patrick Gallinari
08 Oct 2024 (tags: KELM)

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, ..., Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li
07 Oct 2024

AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
Ximing Lu, Melanie Sclar, Skyler Hallinan, Niloofar Mireshghallah, Jiacheng Liu, ..., Allyson Ettinger, Liwei Jiang, Khyathi Raghavi Chandu, Nouha Dziri, Yejin Choi
05 Oct 2024 (tags: DeLMO)

Can Transformers Learn n-gram Language Models?
Anej Svete, Nadav Borenstein, M. Zhou, Isabelle Augenstein, Ryan Cotterell
03 Oct 2024

Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation
Shan Chen, Mingye Gao, Kuleen Sasse, Thomas Hartvigsen, Brian Anthony, Lizhou Fan, Hugo J. W. L. Aerts, Jack Gallifant, Danielle S. Bitterman
30 Sep 2024 (tags: LM&MA)

Gender, Race, and Intersectional Bias in Resume Screening via Language Model Retrieval
Kyra Wilson, Aylin Caliskan
29 Jul 2024

Demystifying Verbatim Memorization in Large Language Models
Jing Huang, Diyi Yang, Christopher Potts
25 Jul 2024 (tags: ELM, PILM, MU)

SPIN: Hierarchical Segmentation with Subpart Granularity in Natural Images
Josh Myers-Dean, Jarek Reynolds, Brian Price, Yifei Fan, Danna Gurari
12 Jul 2024

From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty
Maor Ivgi, Ori Yoran, Jonathan Berant, Mor Geva
08 Jul 2024 (tags: HILM)

Understanding Transformers via N-gram Statistics
Timothy Nguyen
30 Jun 2024

Can LLM Graph Reasoning Generalize beyond Pattern Memorization?
Yizhuo Zhang, Heng Wang, Shangbin Feng, Zhaoxuan Tan, Xiaochuang Han, Tianxing He, Yulia Tsvetkov
23 Jun 2024 (tags: LRM)

Evaluating n-Gram Novelty of Language Models Using Rusty-DAWG
William Merrill, Noah A. Smith, Yanai Elazar
18 Jun 2024 (tags: ELM, TDI)

Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks
Jack Gallifant, Shan Chen, Pedro Moreira, Nikolaj Munch, Mingye Gao, Jackson Pond, Leo Anthony Celi, Hugo J. W. L. Aerts, Thomas Hartvigsen, Danielle S. Bitterman
17 Jun 2024

Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
Ethan Shen, Alan Fan, Sarah M Pratt, Jae Sung Park, Matthew Wallingford, Sham Kakade, Ari Holtzman, Ranjay Krishna, Ali Farhadi, Aditya Kusupati
28 May 2024

Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias
Shan Chen, Jack Gallifant, Mingye Gao, Pedro Moreira, Nikolaj Munch, ..., Hugo J. W. L. Aerts, Brian Anthony, Leo Anthony Celi, William G. La Cava, Danielle S. Bitterman
09 May 2024

TAXI: Evaluating Categorical Knowledge Editing for Language Models
Derek Powell, Walter Gerych, Thomas Hartvigsen
23 Apr 2024 (tags: KELM)

The Role of n-gram Smoothing in the Age of Neural Networks
Luca Malagutti, Andrius Buinovskij, Anej Svete, Clara Meister, Afra Amini, Ryan Cotterell
25 Mar 2024

OLMo: Accelerating the Science of Language Models
Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Michael Kinney, ..., Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hanna Hajishirzi
01 Feb 2024 (tags: OSLM)

Data Portraits: Recording Foundation Model Training Data
Marc Marone, Benjamin Van Durme
06 Mar 2023

Training Language Models with Memory Augmentation
Zexuan Zhong, Tao Lei, Danqi Chen
25 May 2022 (tags: RALM)

Deduplicating Training Data Makes Language Models Better
Katherine Lee, Daphne Ippolito, A. Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini
14 Jul 2021 (tags: SyDa)

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
31 Dec 2020 (tags: AIMat)