Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.16276
Cited By
AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue
24 March 2024
Yunlong Tang
Daiki Shimada
Jing Bi
Chenliang Xu
VGen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue"
8 / 8 papers shown
Title
VTimeLLM: Empower LLM to Grasp Video Moments
Bin Huang
Xin Wang
Hong Chen
Zihan Song
Wenwu Zhu
MLLM
82
80
0
30 Nov 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li-ming Yuan
VLM
MLLM
194
576
0
16 Nov 2023
LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning
Yunlong Tang
Jinrui Zhang
Xiangchen Wang
Teng Wang
Feng Zheng
VLM
64
9
0
17 Jun 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang
Jinrui Zhang
Junjie Fei
Hao Zheng
Yunlong Tang
Zhe Li
Mingqi Gao
Shanshan Zhao
MLLM
102
81
0
04 May 2023
ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System
Junke Wang
Dongdong Chen
Chong Luo
Xiyang Dai
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
93
54
0
27 Apr 2023
Exploiting Context Information for Generic Event Boundary Captioning
Jinrui Zhang
Teng Wang
Feng Zheng
Ran Cheng
Ping Luo
62
5
0
03 Jul 2022
Procedure Planning in Instructional Videos via Contextual Modeling and Model-based Policy Learning
Jing Bi
Jiebo Luo
Chenliang Xu
61
48
0
05 Oct 2021
Generic Event Boundary Detection: A Benchmark for Event Segmentation
Mike Zheng Shou
Stan Weixian Lei
Weiyao Wang
Deepti Ghadiyaram
Matt Feiszli
VOS
80
76
0
26 Jan 2021
1