1811.00982
Cited By
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
2 November 2018
Alina Kuznetsova
H. Rom
N. Alldrin
J. Uijlings
Ivan Krasin
Jordi Pont-Tuset
Shahab Kamali
S. Popov
Matteo Malloci
Alexander Kolesnikov
Tom Duerig
V. Ferrari
ObjD
VLM
Papers citing
"The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale"
50 / 201 papers shown
Title
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Sanjay Subramanian
William Merrill
Trevor Darrell
Matt Gardner
Sameer Singh
Anna Rohrbach
ObjD
21
125
0
12 Apr 2022
Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction
Kalyan Vasudev Alwala
Abhinav Gupta
Shubham Tulsiani
32
30
0
07 Apr 2022
How stable are Transferability Metrics evaluations?
A. Agostinelli
Michal Pándy
J. Uijlings
Thomas Mensink
V. Ferrari
35
22
0
04 Apr 2022
Data Cards: Purposeful and Transparent Dataset Documentation for Responsible AI
Mahima Pushkarna
Andrew Zaldivar
Oddur Kjartansson
AI4TS
27
197
0
03 Apr 2022
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
...
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM
LRM
13
572
0
01 Apr 2022
Image Retrieval from Contextual Descriptions
Benno Krojer
Vaibhav Adlakha
Vibhav Vineet
Yash Goyal
E. Ponti
Siva Reddy
13
29
0
29 Mar 2022
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
Likun Cai
Zhi-Li Zhang
Yi Zhu
Li Zhang
Mu Li
Xiangyang Xue
VLM
ObjD
34
40
0
24 Mar 2022
UNIMO-2: End-to-End Unified Vision-Language Grounded Learning
Wei Li
Can Gao
Guocheng Niu
Xinyan Xiao
Hao Liu
Jiachen Liu
Hua-Hong Wu
Haifeng Wang
MLLM
11
21
0
17 Mar 2022
Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy
Yuanhan Zhang
Qi Sun
Yichun Zhou
Zexin He
Zhen-fei Yin
Kunze Wang
Lu Sheng
Yu Qiao
Jing Shao
Ziwei Liu
ObjD
VLM
21
19
0
15 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Xiao Wang
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
C. L. P. Chen
24
12
0
07 Mar 2022
Attribute Descent: Simulating Object-Centric Datasets on the Content Level and Beyond
Yue Yao
Liang Zheng
Xiaodong Yang
Milind Naphade
Tom Gedeon
26
17
0
28 Feb 2022
Optical flow-based branch segmentation for complex orchard environments
A. You
C. Grimm
J. Davidson
23
9
0
26 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
53
850
0
07 Feb 2022
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
19
7
0
02 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
13
89
0
31 Jan 2022
RelTR: Relation Transformer for Scene Graph Generation
Yuren Cong
M. Yang
Bodo Rosenhahn
ViT
94
132
0
27 Jan 2022
CrossRectify: Leveraging Disagreement for Semi-supervised Object Detection
Cheng Ma
Xingjia Pan
QiXiang Ye
Fan Tang
Weiming Dong
Changsheng Xu
45
14
0
26 Jan 2022
CLIP-Event: Connecting Text and Images with Event Structures
Manling Li
Ruochen Xu
Shuohang Wang
Luowei Zhou
Xudong Lin
Chenguang Zhu
Michael Zeng
Heng Ji
Shih-Fu Chang
VLM
CLIP
16
123
0
13 Jan 2022
Equalized Focal Loss for Dense Long-Tailed Object Detection
Bo-wen Li
Yongqiang Yao
Jingru Tan
Gang Zhang
F. Yu
Jianwei Lu
Ye Luo
36
96
0
07 Jan 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
26
100
0
23 Dec 2021
HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images
A. Athar
Jonathon Luiten
Alexander Hermans
Deva Ramanan
Bastian Leibe
VOS
24
25
0
16 Dec 2021
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT
VLM
21
86
0
09 Dec 2021
Visual Persuasion in COVID-19 Social Media Content: A Multi-Modal Characterization
Mesut Erhan Unal
Adriana Kovashka
Wen-Ting Chung
Yu-Ru Lin
13
4
0
05 Dec 2021
Optimization of phase-only holograms calculated with scaled diffraction calculation through deep neural networks
Yoshiyuki Ishii
Tomoyoshi Shimobaba
David Blinder
Tobias Birnbaum
P. Schelkens
Takashi Kakue
T. Ito
9
10
0
02 Dec 2021
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
ViT
VGen
18
292
0
24 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
24
111
0
23 Nov 2021
Class-agnostic Object Detection with Multi-modal Transformer
Muhammad Maaz
H. Rasheed
Salman Khan
F. Khan
Rao Muhammad Anwer
Ming Yang
15
91
0
22 Nov 2021
Achieving Human Parity on Visual Question Answering
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
...
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
24
12
0
17 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
71
330
0
11 Nov 2021
Resource-Efficient Federated Learning
A. Abdelmoniem
Atal Narayan Sahu
Marco Canini
Suhaib A. Fahmy
FedML
25
52
0
01 Nov 2021
Multi-label Classification with Partial Annotations using Class-aware Selective Loss
Emanuel Ben-Baruch
T. Ridnik
Itamar Friedman
Avi Ben-Cohen
Nadav Zamir
Asaf Noy
Lihi Zelnik-Manor
30
38
0
21 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
226
1,019
0
13 Oct 2021
Aura: Privacy-preserving Augmentation to Improve Test Set Diversity in Speech Enhancement
Xavier Gitiaux
Aditya Khant
Ebrahim Beyrami
Chandan K. A. Reddy
J. Gupchup
Ross Cutler
22
0
0
08 Oct 2021
Inferring Offensiveness In Images From Natural Language Supervision
P. Schramowski
Kristian Kersting
30
2
0
08 Oct 2021
Panoptic Narrative Grounding
Cristina González
Nicolás Ayobi
Isabela Hernández
José Hernández
Jordi Pont-Tuset
Pablo Arbeláez
79
22
0
10 Sep 2021
Learning to Generate Scene Graph from Natural Language Supervision
Yiwu Zhong
Jing Shi
Jianwei Yang
Chenliang Xu
Yin Li
SSL
31
77
0
06 Sep 2021
DVM-CAR: A large-scale automotive dataset for visual marketing research and applications
JingMin Huang
Bowei Chen
Lan Luo
Shigang Yue
I. Ounis
28
15
0
10 Aug 2021
Spatial-Temporal Transformer for Dynamic Scene Graph Generation
Yuren Cong
Wentong Liao
H. Ackermann
Bodo Rosenhahn
M. Yang
ViT
11
122
0
26 Jul 2021
PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior for Joint Image-Text Modeling
Xiaoxue Zang
Lijuan Liu
Maria Wang
Yang Song
Hao Zhang
Jindong Chen
VLM
27
55
0
06 Jul 2021
CBNet: A Composite Backbone Network Architecture for Object Detection
Tingting Liang
Xiao Chu
Yudong Liu
Yongtao Wang
Zhi Tang
Wei Chu
Jingdong Chen
Haibin Ling
ObjD
13
161
0
01 Jul 2021
Extreme Multi-label Learning for Semantic Matching in Product Search
Wei-Cheng Chang
Daniel Jiang
Hsiang-Fu Yu
C. Teo
Jiong Zhang
...
Qie Hu
Nikhil Shandilya
Vyacheslav Ievgrafov
Japinder Singh
Inderjit S. Dhillon
37
59
0
23 Jun 2021
Tracking Instances as Queries
Shusheng Yang
Yuxin Fang
Xinggang Wang
Yu Li
Ying Shan
Bin Feng
Wenyu Liu
24
10
0
22 Jun 2021
GAIA: A Transfer Learning System of Object Detection that Fits Your Needs
Xingyuan Bu
Junran Peng
Junjie Yan
T. Tan
Zhaoxiang Zhang
ObjD
VLM
28
53
0
21 Jun 2021
MSN: Efficient Online Mask Selection Network for Video Instance Segmentation
Vidit Goel
Jiachen Li
Shubhika Garg
Harsh Maheshwari
Humphrey Shi
19
7
0
19 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
32
209
0
17 Jun 2021
Compositional Sketch Search
Alexander Black
Tu Bui
Long Mai
Hailin Jin
John Collomosse
19
1
0
15 Jun 2021
Provably Robust Detection of Out-of-distribution Data (almost) for free
Alexander Meinke
Julian Bitterwolf
Matthias Hein
OODD
25
22
0
08 Jun 2021
Linguistic Structures as Weak Supervision for Visual Scene Graph Generation
Keren Ye
Adriana Kovashka
21
52
0
28 May 2021
The Challenge of Variable Effort Crowdsourcing and How Visible Gold Can Help
Danula Hettiachchi
M. Schaekermann
Tristan McKinney
Matthew Lease
54
19
0
20 May 2021
Waste detection in Pomerania: non-profit project for detecting waste in environment
Sylwia Majchrowska
Agnieszka Mikołajczyk
M. Ferlin
Zuzanna Klawikowska
Marta A. Plantykow
Arkadiusz Kwasigroch
K. Majek
22
125
0
12 May 2021