2406.09462
SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video
13 June 2024
Hector A. Valdez
Kyle Min
Subarna Tripathi
VLM
Papers citing
"SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video"
2 / 2 papers shown
Title
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction
Yuejiao Su
Yi Wang
Qiongyang Hu
Chuang Yang
Lap-Pui Chau
45
0
0
02 Apr 2025
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
L. V. D. van der Maaten
Armand Joulin
Ishan Misra
211
225
0
20 Jan 2022