2308.04126
Cited By
OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation
8 August 2023
Dongyang Yu
Shihao Wang
Yuan Fang
Wangpeng An
VGen
Papers citing "OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation"
5 / 5 papers shown
Title
ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System
Junke Wang
Dongdong Chen
Chong Luo
Xiyang Dai
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
93
54
0
27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
256
4,223
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,110
0
28 Jan 2022
ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Yifu Zhang
Pei Sun
Yi-Xin Jiang
Dongdong Yu
Fucheng Weng
Zehuan Yuan
Ping Luo
Wenyu Liu
Xinggang Wang
VOT
99
1,317
0
13 Oct 2021
Simple Online and Realtime Tracking with a Deep Association Metric
N. Wojke
Alex Bewley
Dietrich Paulus
VOT
223
3,451
0
21 Mar 2017