PIXAR: Auto-Regressive Language Modeling in Pixel Space

6 January 2024
Yintao Tai
Xiyang Liao
Alessandro Suglia
Antonio Vergari
    MLLM
arXiv: 2401.03321

Papers citing "PIXAR: Auto-Regressive Language Modeling in Pixel Space"

11 / 11 papers shown
Hebrew Diacritics Restoration using Visual Representation
Yair Elboher
Yuval Pinter
VLM
300
0
0
30 Oct 2025
Enhancing Robustness of Autoregressive Language Models against Orthographic Attacks via Pixel-based Approach
Han Yang
Jian Lan
Yihong Liu
Hinrich Schütze
Thomas Seidl
AAML
88
0
0
28 Aug 2025
Understanding Subword Compositionality of Large Language Models
Qiwei Peng
Yekun Chai
Anders Søgaard
172
3
0
25 Aug 2025
Beyond Text Compression: Evaluating Tokenizers Across Scales
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jonas F. Lotz
António V. Lopes
Stephan Peitz
Hendra Setiawan
Leonardo Emili
338
3
0
03 Jun 2025
Multilingual Pretraining for Pixel Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Ilker Kesen
Jonas F. Lotz
Ingo Ziegler
Phillip Rust
Desmond Elliott
MLLM
VLM
376
4
0
27 May 2025
Overcoming Vocabulary Constraints with Pixel-level Fallback
Jonas F. Lotz
Hendra Setiawan
Stephan Peitz
Yova Kementchedjhieva
366
4
0
02 Apr 2025
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models
Alex Jinpeng Wang
Linjie Li
Zhiyong Yang
Lijuan Wang
Min Li
DiffM
320
2
0
26 Mar 2025
Vision-centric Token Compression in Large Language Model
Ling Xing
Alex Jinpeng Wang
Rui Yan
Xiangbo Shu
Jinhui Tang
VLM
773
11
0
02 Feb 2025
Everything is a Video: Unifying Modalities through Next-Frame Prediction
G. Hudson
Dean L. Slack
T. Winterbottom
Jamie Sterling
Chenghao Xiao
Junjie Shentu
Noura Al Moubayed
315
2
0
15 Nov 2024
LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models
Ruiyi Zhang
Jian Chen
Jiuxiang Gu
Changyou Chen
Tongfei Sun
VLM
191
16
0
27 Jul 2024
Improving Language Understanding from Screenshots
Tianyu Gao
Zirui Wang
Adithya Bhaskar
Danqi Chen
VLM
243
14
0
21 Feb 2024