2010.11939
Cited By
v1
v2
v3 (latest)
Limitations of Autoregressive Models and Their Alternatives
22 October 2020
Chu-cheng Lin
Aaron Jaech
Xin Li
Matthew R. Gormley
Jason Eisner
Github (917★)
Papers citing
"Limitations of Autoregressive Models and Their Alternatives"
42 / 42 papers shown
Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data
Chu-Cheng Lin
Daiyi Peng
Yifeng Lu
Ming Zhang
Eugene Ie
231
1
0
25 Aug 2025
Meta-R1: Empowering Large Reasoning Models with Metacognition
Haonan Dong
Haoran Ye
Wenhao Zhu
Kehan Jiang
Guojie Song
ReLM
LRM
AI4CE
175
2
0
24 Aug 2025
DLM-One: Diffusion Language Models for One-Step Sequence Generation
Tianqi Chen
Shujian Zhang
Mingyuan Zhou
290
11
0
30 May 2025
Attend or Perish: Benchmarking Attention in Algorithmic Reasoning
Michal Spiegel
Michal Štefánik
Marek Kadlcík
Josef Kuchař
378
1
0
28 Feb 2025
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions
Zhong Li
Qi Huang
Lincen Yang
Jiayang Shi
Zhao Yang
Niki van Stein
Thomas Bäck
M. Leeuwen
DiffM
351
13
0
24 Feb 2025
A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1
Jun Wang
LRM
KELM
373
20
0
15 Feb 2025
SetLexSem Challenge: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models
Neural Information Processing Systems (NeurIPS), 2024
Bardiya Akhbari
Manish Gawali
Nicholas A. Dronen
AAML
350
0
0
11 Nov 2024
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
Junda Wu
Xintong Li
Ruoyu Wang
Yu Xia
Yuxin Xiong
...
Xiang Chen
Branislav Kveton
Lina Yao
Jingbo Shang
Julian McAuley
OffRL
LRM
273
7
0
31 Oct 2024
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
International Conference on Learning Representations (ICLR), 2024
Shansan Gong
Shivam Agarwal
Yizhe Zhang
Jiacheng Ye
Lin Zheng
...
Peilin Zhao
W. Bi
Jiawei Han
Yuan Yao
Dianbo Sui
AI4CE
486
189
0
23 Oct 2024
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
International Conference on Learning Representations (ICLR), 2024
Jiacheng Ye
Lei Li
Shansan Gong
Lin Zheng
Xin Jiang
Zhiyu Li
Dianbo Sui
DiffM
LRM
883
99
0
18 Oct 2024
Online Multi-modal Root Cause Identification in Microservice Systems
Lecheng Zheng
Zhengzhang Chen
Haifeng Chen
279
1
0
13 Oct 2024
Guaranteed Generation from Large Language Models
International Conference on Learning Representations (ICLR), 2024
Minbeom Kim
Thibaut Thonet
Jos Rozen
Hwaran Lee
Kyomin Jung
Marc Dymetman
455
6
0
09 Oct 2024
Detecting Machine-Generated Long-Form Content with Latent-Space Variables
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yufei Tian
Zeyu Pan
Nanyun Peng
DeLMO
367
1
0
04 Oct 2024
Local Attention Mechanism: Boosting the Transformer Architecture for Long-Sequence Time Series Forecasting
Ignacio Aguilera-Martos
Andrés Herrera-Poyatos
Julián Luengo
Francisco Herrera
AI4TS
315
6
0
04 Oct 2024
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
Yuchen Li
Alexandre Kirchmeyer
Aashay Mehta
Yilong Qin
Boris Dadachev
Kishore Papineni
Sanjiv Kumar
Andrej Risteski
426
5
0
22 Jul 2024
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization
Sebastian Sanokowski
Sepp Hochreiter
Sebastian Lehner
455
59
0
03 Jun 2024
The pitfalls of next-token prediction
International Conference on Machine Learning (ICML), 2024
Gregor Bachmann
Vaishnavh Nagarajan
625
160
0
11 Mar 2024
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
Jiacheng Ye
Shansan Gong
Liheng Chen
Lin Zheng
Jiahui Gao
...
Chuan Wu
Xin Jiang
Zhenguo Li
Wei Bi
Lingpeng Kong
DiffM
LRM
AI4CE
360
25
0
12 Feb 2024
Towards Efficient Exact Optimization of Language Model Alignment
Haozhe Ji
Cheng Lu
Yilin Niu
Pei Ke
Hongning Wang
Jun Zhu
Jie Tang
Shiyu Huang
350
32
0
01 Feb 2024
Understanding User Experience in Large Language Model Interactions
Jiayin Wang
Weizhi Ma
Peijie Sun
Min Zhang
Jian-yun Nie
223
81
0
16 Jan 2024
Principled Gradient-based Markov Chain Monte Carlo for Text Generation
Li Du
Afra Amini
Lucas Torroba Hennigen
Xinyan Velocity Yu
Jason Eisner
Holden Lee
Ryan Cotterell
BDL
277
1
0
29 Dec 2023
Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning
Filippos Christianos
Georgios Papoudakis
Matthieu Zimmer
Thomas Coste
Zhihao Wu
...
Yicheng Luo
Jianye Hao
Youssef Attia El Hili
Haitham Bou-Ammar
Jun Wang
305
32
0
22 Dec 2023
LinguaLinked: A Distributed Large Language Model Inference System for Mobile Devices
Junchen Zhao
Yurun Song
Simeng Liu
Ian G. Harris
Sangeetha Abdu Jyothi
268
10
0
01 Dec 2023
What Formal Languages Can Transformers Express? A Survey
Transactions of the Association for Computational Linguistics (TACL), 2023
Lena Strobl
William Merrill
Gail Weiss
David Chiang
Dana Angluin
AI4CE
558
113
0
01 Nov 2023
Recurrent Neural Language Models as Probabilistic Finite-state Automata
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Anej Svete
Ryan Cotterell
634
6
0
08 Oct 2023
Language Model Decoding as Direct Metrics Optimization
International Conference on Learning Representations (ICLR), 2023
Haozhe Ji
Pei Ke
Hongning Wang
Shiyu Huang
392
8
0
02 Oct 2023
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye
Zaixiang Zheng
Yu Bao
Lihua Qian
Quanquan Gu
DiffM
789
37
0
23 Aug 2023
Mini-Giants: "Small" Language Models and Open Source Win-Win
Zhengping Zhou
Lezhi Li
Xinxi Chen
Andy Li
SyDa
ALM
MoE
415
11
0
17 Jul 2023
Likelihood-Based Diffusion Language Models
Neural Information Processing Systems (NeurIPS), 2023
Ishaan Gulrajani
Tatsunori B. Hashimoto
DiffM
389
132
0
30 May 2023
Faith and Fate: Limits of Transformers on Compositionality
Neural Information Processing Systems (NeurIPS), 2023
Nouha Dziri
Ximing Lu
Melanie Sclar
Xiang Lorraine Li
Liwei Jiang
...
Sean Welleck
Xiang Ren
Allyson Ettinger
Zaïd Harchaoui
Yejin Choi
ReLM
LRM
724
573
0
29 May 2023
Autoregressive Modeling with Lookahead Attention
Li Du
Hongyuan Mei
Jason Eisner
269
7
0
20 May 2023
Stochastic Code Generation
Swapnil Sharma
Nikita Anand
V. KranthiKiranG.
SyDa
155
1
0
14 Apr 2023
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
International Conference on Machine Learning (ICML), 2023
Micah Goldblum
Marc Finzi
K. Rowan
A. Wilson
UQCV
FedML
696
73
0
11 Apr 2023
Parallel Vertex Diffusion for Unified Visual Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2023
Ze-Long Cheng
Kehan Li
Peng Jin
Xiang Ji
Li-ming Yuan
Chang-rui Liu
Jie Chen
DiffM
324
40
0
13 Mar 2023
Imitating Human Behaviour with Diffusion Models
International Conference on Learning Representations (ICLR), 2023
Tim Pearce
Tabish Rashid
Anssi Kanervisto
David Bignell
Mingfei Sun
...
Sergio Valcarcel Macua
Shan Zheng Tan
Ida Momennejad
Katja Hofmann
Sam Devlin
DiffM
497
290
0
25 Jan 2023
A Measure-Theoretic Characterization of Tight Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Li Du
Lucas Torroba Hennigen
Tiago Pimentel
Clara Meister
Jason Eisner
Ryan Cotterell
378
34
0
20 Dec 2022
Language Models as Agent Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jacob Andreas
LLMAG
322
182
0
03 Dec 2022
Self-learning locally-optimal hypertuning using maximum entropy, and comparison of machine learning approaches for estimating fatigue life in composite materials
Engineering structures (Eng. Struct.), 2022
I. Ben-Yelun
Miguel Diaz-Lago
Luis Saucedo-Mora
M. Sanz
Ricardo Callado
F. Montáns
75
16
0
19 Oct 2022
HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences
Neural Information Processing Systems (NeurIPS), 2022
Siqiao Xue
Xiaoming Shi
James Y. Zhang
Hongyuan Mei
AI4TS
240
59
0
04 Oct 2022
Language modeling via stochastic processes
International Conference on Learning Representations (ICLR), 2022
Rose E. Wang
Esin Durmus
Noah D. Goodman
Tatsunori Hashimoto
BDL
AI4TS
255
28
0
21 Mar 2022
Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs
B. Eikema
Germán Kruszewski
Hady ElSahar
Marc Dymetman
272
3
0
10 Dec 2021
Sequence-to-Sequence Learning with Latent Neural Grammars
Yoon Kim
773
43
0
02 Sep 2021