2302.11217
Connecting Vision and Language with Video Localized Narratives
22 February 2023
P. Voigtlaender
Soravit Changpinyo
Jordi Pont-Tuset
Radu Soricut
V. Ferrari
VGen
Papers citing "Connecting Vision and Language with Video Localized Narratives"
7 / 7 papers shown
Title
Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos
M. S. Seyfioglu
Wisdom O. Ikezogwo
Fatemeh Ghezloo
Ranjay Krishna
Linda G. Shapiro
22
31
0
07 Dec 2023
Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom O. Ikezogwo
M. S. Seyfioglu
Fatemeh Ghezloo
Dylan Stefan Chan Geva
Fatwir Sheikh Mohammed
Pavan Kumar Anand
Ranjay Krishna
Linda G. Shapiro
CLIP
VLM
125
101
0
20 Jun 2023
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
Ashish V. Thapliyal
Jordi Pont-Tuset
Xi Chen
Radu Soricut
VGen
67
71
0
25 May 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
224
1,017
0
13 Oct 2021
Panoptic Narrative Grounding
Cristina González
Nicolás Ayobi
Isabela Hernández
José Hernández
Jordi Pont-Tuset
Pablo Arbeláez
74
22
0
10 Sep 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014