Limitations of Autoregressive Models and Their Alternatives

22 October 2020
Chu-cheng Lin
Aaron Jaech
Xin Li
Matthew R. Gormley
Jason Eisner

Papers citing "Limitations of Autoregressive Models and Their Alternatives"

42 / 42 papers shown
Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data
Chu-Cheng Lin
Daiyi Peng
Yifeng Lu
Ming Zhang
Eugene Ie
231
1
0
25 Aug 2025
Meta-R1: Empowering Large Reasoning Models with Metacognition
Haonan Dong
Haoran Ye
Wenhao Zhu
Kehan Jiang
Guojie Song
ReLM, LRM, AI4CE
175
2
0
24 Aug 2025
DLM-One: Diffusion Language Models for One-Step Sequence Generation
Tianqi Chen
Shujian Zhang
Mingyuan Zhou
290
11
0
30 May 2025
Attend or Perish: Benchmarking Attention in Algorithmic Reasoning
Michal Spiegel
Michal Štefánik
Marek Kadlcík
Josef Kuchař
378
1
0
28 Feb 2025
Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions
Zhong Li
Qi Huang
Lincen Yang
Jiayang Shi
Zhao Yang
Niki van Stein
Thomas Bäck
M. Leeuwen
DiffM
351
13
0
24 Feb 2025
A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1
Jun Wang
LRM, KELM
373
20
0
15 Feb 2025
SetLexSem Challenge: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models
Neural Information Processing Systems (NeurIPS), 2024
Bardiya Akhbari
Manish Gawali
Nicholas A. Dronen
AAML
350
0
0
11 Nov 2024
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
Junda Wu
Xintong Li
Ruoyu Wang
Yu Xia
Yuxin Xiong
...
Xiang Chen
Branislav Kveton
Lina Yao
Jingbo Shang
Julian McAuley
OffRL, LRM
273
7
0
31 Oct 2024
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
International Conference on Learning Representations (ICLR), 2024
Shansan Gong
Shivam Agarwal
Yizhe Zhang
Jiacheng Ye
Lin Zheng
...
Peilin Zhao
W. Bi
Jiawei Han
Yuan Yao
Dianbo Sui
AI4CE
486
189
0
23 Oct 2024
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
International Conference on Learning Representations (ICLR), 2024
Jiacheng Ye
Lei Li
Shansan Gong
Lin Zheng
Xin Jiang
Zhiyu Li
Dianbo Sui
DiffM, LRM
883
99
0
18 Oct 2024
Online Multi-modal Root Cause Identification in Microservice Systems
Lecheng Zheng
Zhengzhang Chen
Haifeng Chen
279
1
0
13 Oct 2024
Guaranteed Generation from Large Language Models
International Conference on Learning Representations (ICLR), 2024
Minbeom Kim
Thibaut Thonet
Jos Rozen
Hwaran Lee
Kyomin Jung
Marc Dymetman
455
6
0
09 Oct 2024
Detecting Machine-Generated Long-Form Content with Latent-Space Variables
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yufei Tian
Zeyu Pan
Nanyun Peng
DeLMO
367
1
0
04 Oct 2024
Local Attention Mechanism: Boosting the Transformer Architecture for Long-Sequence Time Series Forecasting
Ignacio Aguilera-Martos
Andrés Herrera-Poyatos
Julián Luengo
Francisco Herrera
AI4TS
315
6
0
04 Oct 2024
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
Yuchen Li
Alexandre Kirchmeyer
Aashay Mehta
Yilong Qin
Boris Dadachev
Kishore Papineni
Sanjiv Kumar
Andrej Risteski
426
5
0
22 Jul 2024
A Diffusion Model Framework for Unsupervised Neural Combinatorial Optimization
Sebastian Sanokowski
Sepp Hochreiter
Sebastian Lehner
455
59
0
03 Jun 2024
The pitfalls of next-token prediction
International Conference on Machine Learning (ICML), 2024
Gregor Bachmann
Vaishnavh Nagarajan
625
160
0
11 Mar 2024
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
Jiacheng Ye
Shansan Gong
Liheng Chen
Lin Zheng
Jiahui Gao
...
Chuan Wu
Xin Jiang
Zhenguo Li
Wei Bi
Lingpeng Kong
DiffM, LRM, AI4CE
360
25
0
12 Feb 2024
Towards Efficient Exact Optimization of Language Model Alignment
Haozhe Ji
Cheng Lu
Yilin Niu
Pei Ke
Hongning Wang
Jun Zhu
Jie Tang
Shiyu Huang
350
32
0
01 Feb 2024
Understanding User Experience in Large Language Model Interactions
Jiayin Wang
Weizhi Ma
Peijie Sun
Min Zhang
Jian-yun Nie
223
81
0
16 Jan 2024
Principled Gradient-based Markov Chain Monte Carlo for Text Generation
Li Du
Afra Amini
Lucas Torroba Hennigen
Xinyan Velocity Yu
Jason Eisner
Holden Lee
Ryan Cotterell
BDL
277
1
0
29 Dec 2023
Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning
Filippos Christianos
Georgios Papoudakis
Matthieu Zimmer
Thomas Coste
Zhihao Wu
...
Yicheng Luo
Jianye Hao
Youssef Attia El Hili
Haitham Bou-Ammar
Jun Wang
305
32
0
22 Dec 2023
LinguaLinked: A Distributed Large Language Model Inference System for Mobile Devices
Junchen Zhao
Yurun Song
Simeng Liu
Ian G. Harris
Sangeetha Abdu Jyothi
268
10
0
01 Dec 2023
What Formal Languages Can Transformers Express? A Survey
Transactions of the Association for Computational Linguistics (TACL), 2023
Lena Strobl
William Merrill
Gail Weiss
David Chiang
Dana Angluin
AI4CE
558
113
0
01 Nov 2023
Recurrent Neural Language Models as Probabilistic Finite-state Automata
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Anej Svete
Ryan Cotterell
634
6
0
08 Oct 2023
Language Model Decoding as Direct Metrics Optimization
International Conference on Learning Representations (ICLR), 2023
Haozhe Ji
Pei Ke
Hongning Wang
Shiyu Huang
392
8
0
02 Oct 2023
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye
Zaixiang Zheng
Yu Bao
Lihua Qian
Quanquan Gu
DiffM
789
37
0
23 Aug 2023
Mini-Giants: "Small" Language Models and Open Source Win-Win
Zhengping Zhou
Lezhi Li
Xinxi Chen
Andy Li
SyDa, ALM, MoE
415
11
0
17 Jul 2023
Likelihood-Based Diffusion Language Models
Neural Information Processing Systems (NeurIPS), 2023
Ishaan Gulrajani
Tatsunori B. Hashimoto
DiffM
389
132
0
30 May 2023
Faith and Fate: Limits of Transformers on Compositionality
Neural Information Processing Systems (NeurIPS), 2023
Nouha Dziri
Ximing Lu
Melanie Sclar
Xiang Lorraine Li
Liwei Jiang
...
Sean Welleck
Xiang Ren
Allyson Ettinger
Zaïd Harchaoui
Yejin Choi
ReLM, LRM
724
573
0
29 May 2023
Autoregressive Modeling with Lookahead Attention
Li Du
Hongyuan Mei
Jason Eisner
269
7
0
20 May 2023
Stochastic Code Generation
Swapnil Sharma
Nikita Anand
V. KranthiKiranG.
SyDa
155
1
0
14 Apr 2023
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
International Conference on Machine Learning (ICML), 2023
Micah Goldblum
Marc Finzi
K. Rowan
A. Wilson
UQCV, FedML
696
73
0
11 Apr 2023
Parallel Vertex Diffusion for Unified Visual Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2023
Ze-Long Cheng
Kehan Li
Peng Jin
Xiang Ji
Li-ming Yuan
Chang-rui Liu
Jie Chen
DiffM
324
40
0
13 Mar 2023
Imitating Human Behaviour with Diffusion Models
International Conference on Learning Representations (ICLR), 2023
Tim Pearce
Tabish Rashid
Anssi Kanervisto
David Bignell
Mingfei Sun
...
Sergio Valcarcel Macua
Shan Zheng Tan
Ida Momennejad
Katja Hofmann
Sam Devlin
DiffM
497
290
0
25 Jan 2023
A Measure-Theoretic Characterization of Tight Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Li Du
Lucas Torroba Hennigen
Tiago Pimentel
Clara Meister
Jason Eisner
Ryan Cotterell
378
34
0
20 Dec 2022
Language Models as Agent Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jacob Andreas
LLMAG
322
182
0
03 Dec 2022
Self-learning locally-optimal hypertuning using maximum entropy, and comparison of machine learning approaches for estimating fatigue life in composite materials
Engineering Structures (Eng. Struct.), 2022
I. Ben-Yelun
Miguel Diaz-Lago
Luis Saucedo-Mora
M. Sanz
Ricardo Callado
F. Montáns
75
16
0
19 Oct 2022
HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences
Neural Information Processing Systems (NeurIPS), 2022
Siqiao Xue
Xiaoming Shi
James Y. Zhang
Hongyuan Mei
AI4TS
240
59
0
04 Oct 2022
Language modeling via stochastic processes
International Conference on Learning Representations (ICLR), 2022
Rose E. Wang
Esin Durmus
Noah D. Goodman
Tatsunori Hashimoto
BDL, AI4TS
255
28
0
21 Mar 2022
Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs
B. Eikema
Germán Kruszewski
Hady ElSahar
Marc Dymetman
272
3
0
10 Dec 2021
Sequence-to-Sequence Learning with Latent Neural Grammars
Yoon Kim
773
43
0
02 Sep 2021