Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2306.00946
Cited By
Exposing Attention Glitches with Flip-Flop Language Modeling
1 June 2023
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Exposing Attention Glitches with Flip-Flop Language Modeling"
18 / 18 papers shown
Title
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper
Roland Fernandez
P. Smolensky
Jianfeng Gao
39
0
0
29 Mar 2025
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
34
1
0
01 Feb 2024
Hallucination is Inevitable: An Innate Limitation of Large Language Models
Ziwei Xu
Sanjay Jain
Mohan S. Kankanhalli
HILM
LRM
60
192
0
22 Jan 2024
How Language Model Hallucinations Can Snowball
Muru Zhang
Ofir Press
William Merrill
Alisa Liu
Noah A. Smith
HILM
LRM
78
246
0
22 May 2023
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Amanda Bertsch
Uri Alon
Graham Neubig
Matthew R. Gormley
RALM
94
122
0
02 May 2023
Learning to Reason and Memorize with Self-Notes
Jack Lanchantin
Shubham Toshniwal
Jason Weston
Arthur Szlam
Sainbayar Sukhbaatar
ReLM
LRM
LLMAG
88
27
0
01 May 2023
Do Transformers Parse while Predicting the Masked Word?
Haoyu Zhao
A. Panigrahi
Rong Ge
Sanjeev Arora
74
29
0
14 Mar 2023
Resurrecting Recurrent Neural Networks for Long Sequences
Antonio Orvieto
Samuel L. Smith
Albert Gu
Anushan Fernando
Çağlar Gülçehre
Razvan Pascanu
Soham De
88
258
0
11 Mar 2023
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
Yuchen Li
Yuan-Fang Li
Andrej Risteski
101
61
0
07 Mar 2023
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Abulhair Saparov
He He
ELM
LRM
ReLM
116
270
0
03 Oct 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Entity-Based Knowledge Conflicts in Question Answering
Shayne Longpre
Kartik Perisetla
Anthony Chen
Nikhil Ramesh
Chris DuBois
Sameer Singh
HILM
235
236
0
10 Sep 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
234
690
0
27 Aug 2021
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning
Yuhuai Wu
M. Rabe
Wenda Li
Jimmy Ba
Roger C. Grosse
Christian Szegedy
AIMat
LRM
61
51
0
15 Jan 2021
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktaschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
396
2,576
0
03 Sep 2019
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,435
0
26 Sep 2016
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
214
7,687
0
17 Aug 2015
1