2203.16518
Collaborative Transformers for Grounded Situation Recognition
30 March 2022
Junhyeong Cho
Youngseok Yoon
Suha Kwak
ViT
Papers citing
"Collaborative Transformers for Grounded Situation Recognition"
16 / 16 papers shown
Title
Relation-R1: Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relational Comprehension
Lin Li
Wei Chen
Jiahui Li
L. Chen
LRM
33
1
0
20 Apr 2025
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
77
0
0
20 Jan 2025
Situational Scene Graph for Structured Human-centric Situation Understanding
Chinthani Sugandhika
Chen Li
Deepu Rajan
Basura Fernando
45
1
0
30 Oct 2024
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Dhruv Verma
Debaditya Roy
Basura Fernando
19
1
0
30 Jul 2024
ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation
Yangyi Chen
Xingyao Wang
Manling Li
Derek Hoiem
Heng Ji
17
5
0
22 Nov 2023
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling
Yu Zhao
Hao Fei
Yixin Cao
Bobo Li
Meishan Zhang
Jianguo Wei
M. Zhang
Tat-Seng Chua
17
13
0
09 Aug 2023
Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
R. Liu
Jiaming Zhang
Kunyu Peng
Junwei Zheng
Ke Cao
Yufan Chen
Kailun Yang
Rainer Stiefelhagen
13
7
0
15 Jul 2023
ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition
Debaditya Roy
Dhruv Verma
Basura Fernando
VLM
CLIP
8
4
0
02 Jul 2023
Training Multimedia Event Extraction With Generated Images and Captions
Zilin Du
Yunxin Li
Xu Guo
Yidan Sun
Boyang Albert Li
DiffM
13
7
0
15 Jun 2023
Learning Human-Human Interactions in Images from Weak Textual Supervision
Morris Alper
Hadar Averbuch-Elor
VLM
37
2
0
27 Apr 2023
Video Event Extraction via Tracking Visual States of Arguments
Guang Yang
Manling Li
Jiajie Zhang
Xudong Lin
Shih-Fu Chang
Heng Ji
22
9
0
03 Nov 2022
Grounded Video Situation Recognition
Zeeshan Khan
C. V. Jawahar
Makarand Tapaswi
14
13
0
19 Oct 2022
Ambiguous Images With Human Judgments for Robust Visual Event Classification
Kate Sanders
Reno Kriz
Anqi Liu
Benjamin Van Durme
53
12
0
06 Oct 2022
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement
Zhi-Qi Cheng
Qianwen Dai
Siyao Li
Teruko Mitamura
Alexander G. Hauptmann
11
34
0
18 Aug 2022
Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers
Junhyeong Cho
Youwang Kim
Tae-Hyun Oh
ViT
11
88
0
27 Jul 2022
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
108
188
0
19 Mar 2020