ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2104.12763
  4. Cited By
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
v1v2 (latest)

MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding

IEEE International Conference on Computer Vision (ICCV), 2021
26 April 2021
Aishwarya Kamath
Mannat Singh
Yann LeCun
Gabriel Synnaeve
Ishan Misra
Nicolas Carion
    ObjDVLM
ArXiv (abs)PDFHTMLGithub (1008★)

Papers citing "MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding"

21 / 671 papers shown
Title
UFO: A UniFied TransfOrmer for Vision-Language Representation Learning
UFO: A UniFied TransfOrmer for Vision-Language Representation Learning
Jianfeng Wang
Xiaowei Hu
Zhe Gan
Zhengyuan Yang
Xiyang Dai
Zicheng Liu
Yumao Lu
Lijuan Wang
ViT
130
60
0
19 Nov 2021
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual
  Concepts
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual ConceptsInternational Conference on Machine Learning (ICML), 2021
Yan Zeng
Xinsong Zhang
Hang Li
VLMCLIP
264
347
0
16 Nov 2021
A Survey of Visual Transformers
A Survey of Visual TransformersIEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Peng Wang
Jianping Fan
Zhiqiang He
3DGSViT
358
447
0
11 Nov 2021
An Empirical Study of Training End-to-End Vision-and-Language
  Transformers
An Empirical Study of Training End-to-End Vision-and-Language TransformersComputer Vision and Pattern Recognition (CVPR), 2021
Zi-Yi Dou
Yichong Xu
Zhe Gan
Jianfeng Wang
Shuohang Wang
...
Pengchuan Zhang
Lu Yuan
Nanyun Peng
Zicheng Liu
Michael Zeng
VLM
221
425
0
03 Nov 2021
Beyond Accuracy: A Consolidated Tool for Visual Question Answering
  Benchmarking
Beyond Accuracy: A Consolidated Tool for Visual Question Answering BenchmarkingConference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Dirk Vath
Pascal Tilli
Ngoc Thang Vu
125
4
0
11 Oct 2021
CLIPort: What and Where Pathways for Robotic Manipulation
CLIPort: What and Where Pathways for Robotic ManipulationConference on Robot Learning (CoRL), 2021
Mohit Shridhar
Lucas Manuelli
Dieter Fox
LM&Ro
266
791
0
24 Sep 2021
xGQA: Cross-Lingual Visual Question Answering
xGQA: Cross-Lingual Visual Question Answering
Jonas Pfeiffer
Gregor Geigle
Aishwarya Kamath
Jan-Martin O. Steitz
Stefan Roth
Ivan Vulić
Iryna Gurevych
317
76
0
13 Sep 2021
Negative Sample Matters: A Renaissance of Metric Learning for Temporal
  Grounding
Negative Sample Matters: A Renaissance of Metric Learning for Temporal GroundingAAAI Conference on Artificial Intelligence (AAAI), 2021
Zhenzhi Wang
Limin Wang
Tao Wu
Tianhao Li
Gangshan Wu
AI4TS
250
150
0
10 Sep 2021
Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in
  Multimodal Transformers
Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal TransformersConference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Stella Frank
Emanuele Bugliarello
Desmond Elliott
137
90
0
09 Sep 2021
TxT: Crossmodal End-to-End Learning with Transformers
TxT: Crossmodal End-to-End Learning with TransformersGerman Conference on Pattern Recognition (DAGM), 2021
Jan-Martin O. Steitz
Jonas Pfeiffer
Iryna Gurevych
Stefan Roth
LRM
89
2
0
09 Sep 2021
SORNet: Spatial Object-Centric Representations for Sequential
  Manipulation
SORNet: Spatial Object-Centric Representations for Sequential ManipulationConference on Robot Learning (CoRL), 2021
Wentao Yuan
Chris Paxton
Karthik Desingh
Dieter Fox
3DPC
433
76
0
08 Sep 2021
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
Hanbo Zhang
Yunfan Lu
Cunjun Yu
David Hsu
Xuguang Lan
Nanning Zheng
LM&Ro
183
70
0
25 Aug 2021
QVHighlights: Detecting Moments and Highlights in Videos via Natural
  Language Queries
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
Jie Lei
Tamara L. Berg
Joey Tianyi Zhou
ViT
279
86
0
20 Jul 2021
LanguageRefer: Spatial-Language Model for 3D Visual Grounding
LanguageRefer: Spatial-Language Model for 3D Visual GroundingConference on Robot Learning (CoRL), 2021
Junha Roh
Karthik Desingh
Ali Farhadi
Dieter Fox
210
110
0
07 Jul 2021
Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal
  Video Grounding
Augmented 2D-TAN: A Two-stage Approach for Human-centric Spatio-Temporal Video Grounding
Chaolei Tan
Zihang Lin
Jianfang Hu
Xiang Li
Weishi Zheng
172
12
0
20 Jun 2021
How Modular Should Neural Module Networks Be for Systematic
  Generalization?
How Modular Should Neural Module Networks Be for Systematic Generalization?
Vanessa D’Amario
Tomotake Sasaki
Xavier Boix
137
18
0
15 Jun 2021
Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object
  Localization
Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object Localization
Ludan Ruan
Jieting Chen
Yuqing Song
Shizhe Chen
Qin Jin
80
0
0
11 Jun 2021
Referring Transformer: A One-step Approach to Multi-task Visual
  Grounding
Referring Transformer: A One-step Approach to Multi-task Visual GroundingNeural Information Processing Systems (NeurIPS), 2021
Muchen Li
Leonid Sigal
ObjD
220
236
0
06 Jun 2021
VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language
  Matching
VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching
Chenchi Zhang
Wenbo Ma
Jun Xiao
Hanwang Zhang
Jian Shao
Yueting Zhuang
Long Chen
193
5
0
12 May 2021
Towards General Purpose Vision Systems
Towards General Purpose Vision SystemsComputer Vision and Pattern Recognition (CVPR), 2021
Tanmay Gupta
Amita Kamath
Aniruddha Kembhavi
Derek Hoiem
222
55
0
01 Apr 2021
Transformers in Vision: A Survey
Transformers in Vision: A SurveyACM Computing Surveys (CSUR), 2021
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
773
3,041
0
04 Jan 2021
Previous
123...121314