TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction

27 May 2024 · arXiv:2405.16847
Yinda Chen, Haoyuan Shi, Xiaoyu Liu, Te Shi, Ruobing Zhang, Dong Liu, Zhiwei Xiong, Feng Wu

Papers citing "TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction"

9 / 9 papers shown
MSP-MVS: Multi-Granularity Segmentation Prior Guided Multi-View Stereo
Zhenlong Yuan, Cong Liu, Fei Shen, Zhaoxin Li, Tianlu Mao, Zhaoqi Wang
3DV · 65 · 1 · 0 · 27 Jul 2024

BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval
Yinda Chen, Che Liu, Xiaoyu Liu, Rossella Arcucci, Zhiwei Xiong
53 · 14 · 0 · 24 Mar 2024

EAGLE: An Edge-Aware Gradient Localization Enhanced Loss for CT Image Reconstruction
Yipeng Sun, Yixing Huang, Linda-Sophie Schneider, Mareike Thies, Mingxuan Gu, Siyuan Mei, Siming Bayer, Andreas K. Maier
32 · 1 · 0 · 15 Mar 2024

T3D: Advancing 3D Medical Vision-Language Pre-training by Learning Multi-View Visual Consistency
Che Liu, Ouyang Cheng, Yinda Chen, César Quilodrán-Casas, Lei Ma, Jie Fu, Yike Guo, Anand Shah, Wenjia Bai, Rossella Arcucci
MedIm · 54 · 8 · 0 · 03 Dec 2023

Point Cloud Self-supervised Learning via 3D to Multi-view Masked Autoencoder
Zhimin Chen, Yingwei Li, Longlong Jing, Liang Yang, Bing Li
3DPC · 29 · 9 · 0 · 17 Nov 2023

The effectiveness of MAE pre-pretraining for billion-scale pretraining
Mannat Singh, Quentin Duval, Kalyan Vasudev Alwala, Haoqi Fan, Vaibhav Aggarwal, ..., Piotr Dollár, Christoph Feichtenhofer, Ross B. Girshick, Rohit Girdhar, Ishan Misra
LRM · 105 · 63 · 0 · 23 Mar 2023

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li, Dongxu Li, Silvio Savarese, Steven C. H. Hoi
VLM · MLLM · 247 · 4,223 · 0 · 30 Jan 2023

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi
MLLM · BDL · VLM · CLIP · 388 · 4,110 · 0 · 28 Jan 2022

Masked Autoencoders Are Scalable Vision Learners
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
ViT · TPM · 258 · 7,412 · 0 · 11 Nov 2021