Limitations of Autoregressive Models and Their Alternatives

22 October 2020

Papers citing "Limitations of Autoregressive Models and Their Alternatives"

42 / 42 papers shown

Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data

231

25 Aug 2025

Meta-R1: Empowering Large Reasoning Models with Metacognition

175

24 Aug 2025

DLM-One: Diffusion Language Models for One-Step Sequence Generation

Tianqi Chen

Shujian Zhang

Mingyuan Zhou

290

30 May 2025

Attend or Perish: Benchmarking Attention in Algorithmic Reasoning

378

28 Feb 2025

Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions

351

24 Feb 2025

A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1

Jun Wang

LRM KELM

373

15 Feb 2025

SetLexSem Challenge: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models

Neural Information Processing Systems (NeurIPS), 2024

350

11 Nov 2024

OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models

Junda Wu

...

Xiang Chen

273

31 Oct 2024

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

International Conference on Learning Representations (ICLR), 2024

...

486

189

23 Oct 2024

Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning

International Conference on Learning Representations (ICLR), 2024

883

18 Oct 2024

Online Multi-modal Root Cause Identification in Microservice Systems

Lecheng Zheng

Zhengzhang Chen

Haifeng Chen

279

13 Oct 2024

Guaranteed Generation from Large Language Models

International Conference on Learning Representations (ICLR), 2024

455

09 Oct 2024

Detecting Machine-Generated Long-Form Content with Latent-Space Variables

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Yufei Tian

Zeyu Pan

Nanyun Peng

DeLMO

367

04 Oct 2024

Local Attention Mechanism: Boosting the Transformer Architecture for Long-Sequence Time Series Forecasting

Ignacio Aguilera-Martos

Andrés Herrera-Poyatos

Julián Luengo

Francisco Herrera

AI4TS

315

04 Oct 2024

Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines

Sanjiv Kumar

Andrej Risteski

426

22 Jul 2024

A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization

Sebastian Sanokowski

Sepp Hochreiter

Sebastian Lehner

455

03 Jun 2024

The pitfalls of next-token prediction

International Conference on Machine Learning (ICML), 2024

Gregor Bachmann

Vaishnavh Nagarajan

625

160

11 Mar 2024

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

Jiahui Gao

...

Chuan Wu

Xin Jiang

Zhenguo Li

Wei Bi

Lingpeng Kong

DiffM LRM AI4CE

360

12 Feb 2024

Towards Efficient Exact Optimization of Language Model Alignment

Jun Zhu

350

01 Feb 2024

Understanding User Experience in Large Language Model Interactions

Min Zhang

223

16 Jan 2024

Principled Gradient-based Markov Chain Monte Carlo for Text Generation

Li Du

Afra Amini

Lucas Torroba Hennigen

277

29 Dec 2023

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

...

Jianye Hao

Youssef Attia El Hili

Haitham Bou-Ammar

Jun Wang

305

22 Dec 2023

LinguaLinked: A Distributed Large Language Model Inference System for Mobile Devices

Sangeetha Abdu Jyothi

268

01 Dec 2023

What Formal Languages Can Transformers Express? A Survey

Transactions of the Association for Computational Linguistics (TACL), 2023

558

113

01 Nov 2023

Recurrent Neural Language Models as Probabilistic Finite-state Automata

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Anej Svete

Robert Bamler

634

08 Oct 2023

Language Model Decoding as Direct Metrics Optimization

International Conference on Learning Representations (ICLR), 2023

392

02 Oct 2023

Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning

Jiasheng Ye

Quanquan Gu

789

23 Aug 2023

Mini-Giants: "Small" Language Models and Open Source Win-Win

415

17 Jul 2023

Likelihood-Based Diffusion Language Models

Neural Information Processing Systems (NeurIPS), 2023

Ishaan Gulrajani

Tatsunori B. Hashimoto

DiffM

389

132

30 May 2023

Faith and Fate: Limits of Transformers on Compositionality

Neural Information Processing Systems (NeurIPS), 2023

Xiang Lorraine Li

...

Xiang Ren

Yejin Choi

724

573

29 May 2023

Autoregressive Modeling with Lookahead Attention

Li Du

Hongyuan Mei

Jason Eisner

269

20 May 2023

Stochastic Code Generation

155

14 Apr 2023

The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning

International Conference on Machine Learning (ICML), 2023

696

11 Apr 2023

Parallel Vertex Diffusion for Unified Visual Grounding

AAAI Conference on Artificial Intelligence (AAAI), 2023

324

13 Mar 2023

Imitating Human Behaviour with Diffusion Models

International Conference on Learning Representations (ICLR), 2023

...

Sergio Valcarcel Macua

497

290

25 Jan 2023

A Measure-Theoretic Characterization of Tight Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Li Du

Lucas Torroba Hennigen

378

20 Dec 2022

Language Models as Agent Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Jacob Andreas

LLMAG

322

182

03 Dec 2022

Self-learning locally-optimal hypertuning using maximum entropy, and comparison of machine learning approaches for estimating fatigue life in composite materials

Engineering Structures (Eng. Struct.), 2022

19 Oct 2022

HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences

Neural Information Processing Systems (NeurIPS), 2022

240

04 Oct 2022

Language modeling via stochastic processes

International Conference on Learning Representations (ICLR), 2022

Rose E. Wang

Esin Durmus

Noah D. Goodman

Tatsunori Hashimoto

BDL AI4TS

255

21 Mar 2022

Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs

272

10 Dec 2021

Sequence-to-Sequence Learning with Latent Neural Grammars

Yoon Kim

773

02 Sep 2021