The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
arXiv:1811.00982

2 November 2018
Alina Kuznetsova
H. Rom
N. Alldrin
J. Uijlings
Ivan Krasin
Jordi Pont-Tuset
Shahab Kamali
S. Popov
Matteo Malloci
Alexander Kolesnikov
Tom Duerig
V. Ferrari
ObjD, VLM

Papers citing "The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale"

50 / 623 papers shown
Orientation Aware Weapons Detection In Visual Data : A Benchmark Dataset
Nazeef Ul Haq
M. Fraz
Tufail Sajjad Shah Hashmi
Muhammad Shahzad
205
10
0
04 Dec 2021
Optimization of phase-only holograms calculated with scaled diffraction calculation through deep neural networks
Yoshiyuki Ishii
Tomoyoshi Shimobaba
David Blinder
Tobias Birnbaum
P. Schelkens
Takashi Kakue
T. Ito
58
13
0
02 Dec 2021
Object-Aware Cropping for Self-Supervised Learning
Shlok Kumar Mishra
Anshul B. Shah
Ankan Bansal
Abhyuday N. Jagannatha
Janit Anjaria
Abhishek Sharma
David Jacobs
Dilip Krishnan
SSL
411
27
0
01 Dec 2021
Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets
Marcella Cornia
Lorenzo Baraldi
G. Fiameni
Rita Cucchiara
320
14
0
24 Nov 2021
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Chenfei Wu
Jian Liang
Lei Ji
Fan Yang
Yuejian Fang
Daxin Jiang
Nan Duan
ViT, VGen
298
343
0
24 Nov 2021
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Faisal Ahmed
Zicheng Liu
Yumao Lu
Lijuan Wang
348
134
0
23 Nov 2021
Class-agnostic Object Detection with Multi-modal Transformer
European Conference on Computer Vision (ECCV), 2021
Muhammad Maaz
H. Rasheed
Salman Khan
Fahad Shahbaz Khan
Rao Muhammad Anwer
Ming-Hsuan Yang
613
116
0
22 Nov 2021
L-Verse: Bidirectional Generation Between Image and Text
Computer Vision and Pattern Recognition (CVPR), 2021
Taehoon Kim
Gwangmo Song
Sihaeng Lee
Sangyun Kim
Yewon Seo
Soonyoung Lee
S. Kim
Honglak Lee
Kyunghoon Bae
1.0K
28
0
22 Nov 2021
Rethinking Drone-Based Search and Rescue with Aerial Person Detection
Pasi Pyrrö
H. Naseri
Alexander Jung
94
6
0
17 Nov 2021
Achieving Human Parity on Visual Question Answering
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
...
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
147
19
0
17 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
237
39
0
16 Nov 2021
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts
International Conference on Machine Learning (ICML), 2021
Yan Zeng
Xinsong Zhang
Hang Li
VLM, CLIP
335
352
0
16 Nov 2021
A Survey of Visual Transformers
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Peng Wang
Jianping Fan
Zhiqiang He
3DGS, ViT
470
477
0
11 Nov 2021
TAGLETS: A System for Automatic Semi-Supervised Learning with Auxiliary Data
Wasu Top Piriyakulkij
Cristina Menghini
Ross Briden
Nihal V. Nayak
Jeffrey Zhu
Elaheh Raisi
Stephen H. Bach
VLM
368
1
0
08 Nov 2021
Resource-Efficient Federated Learning
European Conference on Computer Systems (EuroSys), 2021
A. Abdelmoniem
Atal Narayan Sahu
Marco Canini
Suhaib A. Fahmy
FedML
253
69
0
01 Nov 2021
Multi-label Classification with Partial Annotations using Class-aware Selective Loss
Computer Vision and Pattern Recognition (CVPR), 2021
Emanuel Ben-Baruch
T. Ridnik
Itamar Friedman
Avi Ben-Cohen
Nadav Zamir
Asaf Noy
Lihi Zelnik-Manor
172
50
0
21 Oct 2021
Noisy Annotation Refinement for Object Detection
British Machine Vision Conference (BMVC), 2021
Jiafeng Mao
Qing Yu
Yoko Yamakata
Kiyoharu Aizawa
NoLa
261
13
0
20 Oct 2021
Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Sharat Agarwal
Sumanyu Muku
Saket Anand
Chetan Arora
161
15
0
20 Oct 2021
EBJR: Energy-Based Joint Reasoning for Adaptive Inference
British Machine Vision Conference (BMVC), 2021
Mohammad Akbari
Amin Banitalebi-Dehkordi
Yong Zhang
BDL, MQ
156
7
0
20 Oct 2021
The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color
Cory Paik
Stéphane Aroca-Ouellette
Alessandro Roncone
Katharina Kann
169
38
0
15 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
1.0K
1,464
0
13 Oct 2021
Aura: Privacy-preserving Augmentation to Improve Test Set Diversity in Speech Enhancement
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Xavier Gitiaux
Aditya Khant
Ebrahim Beyrami
Chandan K. A. Reddy
J. Gupchup
Ross Cutler
165
0
0
08 Oct 2021
Inferring Offensiveness In Images From Natural Language Supervision
P. Schramowski
Kristian Kersting
90
2
0
08 Oct 2021
FooDI-ML: a large multi-language dataset of food, drinks and groceries images and descriptions
David Amat Olóndriz
Ponç Puigdevall
A. S. Palau
VLM
199
11
0
05 Oct 2021
PASS: An ImageNet replacement for self-supervised pretraining without humans
Yuki M. Asano
Christian Rupprecht
Andrew Zisserman
Andrea Vedaldi
VLM, SSL
214
62
0
27 Sep 2021
PETA: Photo Albums Event Recognition using Transformers Attention
International Conference on Pattern Recognition (ICPR), 2021
Tamar Glaser
Emanuel Ben-Baruch
Gilad Sharir
Nadav Zamir
Asaf Noy
Lihi Zelnik-Manor
ViT
132
2
0
26 Sep 2021
Visual Scene Graphs for Audio Source Separation
IEEE International Conference on Computer Vision (ICCV), 2021
Moitreya Chatterjee
Jonathan Le Roux
Narendra Ahuja
A. Cherian
221
41
0
24 Sep 2021
Discovering and Validating AI Errors With Crowdsourced Failure Reports
Ángel Alexander Cabrera
Abraham J. Druck
Jason I. Hong
Adam Perer
HAI
174
63
0
23 Sep 2021
Pix2seq: A Language Modeling Framework for Object Detection
International Conference on Learning Representations (ICLR), 2021
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM, ViT, VLM
655
408
0
22 Sep 2021
Deep Joint Source-Channel Coding for Multi-Task Network
Mengyang Wang
Zhicong Zhang
Jiahui Li
Mengyao Ma
Xiaopeng Fan
229
33
0
13 Sep 2021
COSMic: A Coherence-Aware Generation Metric for Image Descriptions
Mert Inan
P. Sharma
Baber Khalid
Radu Soricut
Matthew Stone
Malihe Alikhani
EGVM
154
14
0
11 Sep 2021
Panoptic Narrative Grounding
IEEE International Conference on Computer Vision (ICCV), 2021
Cristina González
Nicolás Ayobi
Isabela Hernández
José Hernández
Jordi Pont-Tuset
Pablo Arbeláez
251
28
0
10 Sep 2021
Learning to Generate Scene Graph from Natural Language Supervision
Yiwu Zhong
Jing Shi
Jianwei Yang
Chenliang Xu
Yin Li
SSL
261
85
0
06 Sep 2021
Identification of Driver Phone Usage Violations via State-of-the-Art Object Detection with Tracking
S. Carrell
Amir Atapour-Abarghouei
136
5
0
05 Sep 2021
Evaluating the Single-Shot MultiBox Detector and YOLO Deep Learning Models for the Detection of Tomatoes in a Greenhouse
S. Magalhães
Luís Castro
Germano Moreira
F. Santos
Mário Cunha
Jorge Dias
A. Moreira
102
142
0
02 Sep 2021
EKTVQA: Generalized use of External Knowledge to empower Scene Text in Text-VQA
IEEE Access, 2021
Arka Ujjal Dey
Ernest Valveny
Gaurav Harit
346
3
0
22 Aug 2021
DVM-CAR: A large-scale automotive dataset for visual marketing research and applications
JingMin Huang
Bowei Chen
Lan Luo
Shigang Yue
I. Ounis
149
20
0
10 Aug 2021
Pre-trained Models for Sonar Images
Matias Valdenegro-Toro
Alan Preciado-Grijalva
Bilal Wehbe
VLM
94
20
0
02 Aug 2021
United We Learn Better: Harvesting Learning Improvements From Class Hierarchies Across Tasks
Sindi Shkodrani
Yu Wang
M. Manfredi
N. Baka
112
4
0
28 Jul 2021
Spatial-Temporal Transformer for Dynamic Scene Graph Generation
IEEE International Conference on Computer Vision (ICCV), 2021
Yuren Cong
Wentong Liao
H. Ackermann
Bodo Rosenhahn
M. Yang
ViT
306
150
0
26 Jul 2021
Fed-ensemble: Improving Generalization through Model Ensembling in Federated Learning
IEEE Transactions on Automation Science and Engineering (T-ASE), 2021
Naichen Shi
Fan Lai
Raed Al Kontar
Mosharaf Chowdhury
FedML
178
41
0
21 Jul 2021
Multi-Label Generalized Zero Shot Learning for the Classification of Disease in Chest Radiographs
Nasir Hayat
Hazem Lashen
Farah E. Shamout
245
25
0
14 Jul 2021
Exploiting Image Translations via Ensemble Self-Supervised Learning for Unsupervised Domain Adaptation
Fabrizio J. Piva
Gijs Dubbelman
127
14
0
13 Jul 2021
EasyCom: An Augmented Reality Dataset to Support Algorithms for Easy Communication in Noisy Environments
Jacob Donley
V. Tourbabin
Jung-Suk Lee
Mark Broyles
Hao Jiang
Jie Shen
Maja Pantic
V. Ithapu
Ravish Mehra
182
83
0
09 Jul 2021
PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior for Joint Image-Text Modeling
Xiaoxue Zang
Lijuan Liu
Maria Wang
Yang Song
Hao Zhang
Jindong Chen
VLM
253
65
0
06 Jul 2021
MSE Loss with Outlying Label for Imbalanced Classification
S. Kato
Kazuhiro Hotta
108
15
0
06 Jul 2021
Web-Scale Generic Object Detection at Microsoft Bing
S. Chen
Saurajit Mukherjee
Unmesh Phadke
Tingting Wang
Junwon Park
Ravi Theja Yada
ObjD, VLM
180
0
0
05 Jul 2021
CBNet: A Composite Backbone Network Architecture for Object Detection
Tingting Liang
Xiao Chu
Yudong Liu
Yongtao Wang
Zhi Tang
Wei Chu
Jingdong Chen
Haibin Ling
ObjD
521
204
0
01 Jul 2021
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
299
41
0
01 Jul 2021
Making Images Real Again: A Comprehensive Survey on Deep Image Composition
Li Niu
Wenyan Cong
Liu Liu
Yan Hong
Bo Zhang
Jing Liang
Liqing Zhang
VLM, DiffM, CoGe
529
96
0
28 Jun 2021