Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

Neural Information Processing Systems (NeurIPS), 2023
12 July 2023
Mostafa Dehghani
Basil Mustafa
Josip Djolonga
Jonathan Heek
Matthias Minderer
Mathilde Caron
Andreas Steiner
J. Puigcerver
Robert Geirhos
Ibrahim Alabdulmohsin
Avital Oliver
Piotr Padlewski
A. Gritsenko
Mario Lučić
N. Houlsby
    ViT
arXiv: 2307.06304, HuggingFace (31 upvotes)

Papers citing "Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

20 / 120 papers shown
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
Le Zhuo
Ruoyi Du
Han Xiao
Yangguang Li
Dongyang Liu
...
Wanli Ouyang
Ziwei Liu
Ping Luo
Hongsheng Li
Peng Gao
302
104
0
05 Jun 2024
Patch-enhanced Mask Encoder Prompt Image Generation
Shusong Xu
Peiye Liu
DiffM
170
1
0
29 May 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
226
6
0
28 May 2024
MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution
Wenzhuo Liu
Fei Zhu
Shijie Ma
Cheng-Lin Liu
221
4
0
28 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
331
4
0
22 May 2024
What matters when building vision-language models?
Neural Information Processing Systems (NeurIPS), 2024
Hugo Laurençon
Léo Tronchon
Matthieu Cord
Victor Sanh
VLM
299
274
0
03 May 2024
PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot
Shenbagaraj Kannapiran
Sreenithy Chandran
Suren Jayasuriya
Spring Berman
176
0
0
07 Apr 2024
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection
Ali Behrouz
Michele Santacatterina
Ramin Zabih
434
45
0
29 Mar 2024
ViTAR: Vision Transformer with Any Resolution
Qihang Fan
Quanzeng You
Xiaotian Han
Yongfei Liu
Yunzhe Tao
Huaibo Huang
Ran He
Hongxia Yang
ViT
333
20
0
27 Mar 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM VGen EGVM
423
488
0
27 Feb 2024
Representing Online Handwriting for Recognition in Large Vision-Language Models
Anastasiia Fadeeva
Philippe Schlattner
Andrii Maksai
Mark Collier
Efi Kokiopoulou
Jesse Berent
C. Musat
284
7
0
23 Feb 2024
Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures
Vincent Abbott
143
6
0
08 Feb 2024
MESA: Matching Everything by Segmenting Anything
Yesheng Zhang
Xu Zhao
186
17
0
30 Jan 2024
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model
Computer Vision and Pattern Recognition (CVPR), 2024
Yiran Song
Qianyu Zhou
Hefei Ling
Deng-Ping Fan
Xuequan Lu
Lizhuang Ma
VLM
502
20
0
04 Jan 2024
Input Compression with Positional Consistency for Efficient Training and Inference of Transformer Neural Networks
Amrit Nagarajan
Anand Raghunathan
VLM ViT
63
0
0
22 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
International Conference on Machine Learning (ICML), 2023
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
336
2
0
06 Nov 2023
Win-Win: Training High-Resolution Vision Transformers from Two Windows
International Conference on Learning Representations (ICLR), 2023
Vincent Leroy
Jérôme Revaud
Thomas Lucas
Philippe Weinzaepfel
ViT
271
6
0
01 Oct 2023
Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Adam Pardyl
Grzegorz Kurzejamski
Jan Olszewski
Tomasz Trzciński
Bartosz Zieliński
168
4
0
23 Sep 2023
DropCompute: simple and more robust distributed synchronous training via compute variance reduction
Neural Information Processing Systems (NeurIPS), 2023
Niv Giladi
Shahar Gottlieb
Moran Shkolnik
A. Karnieli
Ron Banner
Elad Hoffer
Kfir Y. Levy
Daniel Soudry
346
4
0
18 Jun 2023
Generative AI for Rapid Diffusion MRI with Improved Image Quality, Reliability and Generalizability
Amir Sadikov
Xinlei Pan
Hannah L Choi
Lanya T. Cai
P. Mukherjee
DiffM MedIm
186
3
0
10 Mar 2023