ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.16518
  4. Cited By
Collaborative Transformers for Grounded Situation Recognition

Collaborative Transformers for Grounded Situation Recognition

30 March 2022
Junhyeong Cho
Youngseok Yoon
Suha Kwak
    ViT
ArXivPDFHTML

Papers citing "Collaborative Transformers for Grounded Situation Recognition"

16 / 16 papers shown
Title
Relation-R1: Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relational Comprehension
Relation-R1: Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relational Comprehension
Lin Li
Wei Chen
Jiahui Li
L. Chen
LRM
33
1
0
20 Apr 2025
Dynamic Scene Understanding from Vision-Language Representations
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
77
0
0
20 Jan 2025
Situational Scene Graph for Structured Human-centric Situation Understanding
Situational Scene Graph for Structured Human-centric Situation Understanding
Chinthani Sugandhika
Chen Li
Deepu Rajan
Basura Fernando
45
1
0
30 Oct 2024
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos
Dhruv Verma
Debaditya Roy
Basura Fernando
19
1
0
30 Jul 2024
ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided
  Code-Vision Representation
ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation
Yangyi Chen
Xingyao Wang
Manling Li
Derek Hoiem
Heng Ji
17
5
0
22 Nov 2023
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic
  Role Labeling
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling
Yu Zhao
Hao Fei
Yixin Cao
Bobo Li
Meishan Zhang
Jianguo Wei
M. Zhang
Tat-Seng Chua
17
13
0
09 Aug 2023
Open Scene Understanding: Grounded Situation Recognition Meets Segment
  Anything for Helping People with Visual Impairments
Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
R. Liu
Jiaming Zhang
Kunyu Peng
Junwei Zheng
Ke Cao
Yufan Chen
Kailun Yang
Rainer Stiefelhagen
13
7
0
15 Jul 2023
ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in
  Situation Recognition
ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition
Debaditya Roy
Dhruv Verma
Basura Fernando
VLM
CLIP
8
4
0
02 Jul 2023
Training Multimedia Event Extraction With Generated Images and Captions
Training Multimedia Event Extraction With Generated Images and Captions
Zilin Du
Yunxin Li
Xu Guo
Yidan Sun
Boyang Albert Li
DiffM
13
7
0
15 Jun 2023
Learning Human-Human Interactions in Images from Weak Textual
  Supervision
Learning Human-Human Interactions in Images from Weak Textual Supervision
Morris Alper
Hadar Averbuch-Elor
VLM
37
2
0
27 Apr 2023
Video Event Extraction via Tracking Visual States of Arguments
Video Event Extraction via Tracking Visual States of Arguments
Guang Yang
Manling Li
Jiajie Zhang
Xudong Lin
Shih-Fu Chang
Heng Ji
22
9
0
03 Nov 2022
Grounded Video Situation Recognition
Grounded Video Situation Recognition
Zeeshan Khan
C. V. Jawahar
Makarand Tapaswi
14
13
0
19 Oct 2022
Ambiguous Images With Human Judgments for Robust Visual Event
  Classification
Ambiguous Images With Human Judgments for Robust Visual Event Classification
Kate Sanders
Reno Kriz
Anqi Liu
Benjamin Van Durme
53
12
0
06 Oct 2022
GSRFormer: Grounded Situation Recognition Transformer with Alternate
  Semantic Attention Refinement
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement
Zhi-Qi Cheng
Qianwen Dai
Siyao Li
Teruko Mitamura
Alexander G. Hauptmann
11
34
0
18 Aug 2022
Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery
  with Transformers
Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers
Junhyeong Cho
Youwang Kim
Tae-Hyun Oh
ViT
11
88
0
27 Jul 2022
Normalized and Geometry-Aware Self-Attention Network for Image
  Captioning
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
108
188
0
19 Mar 2020
1