Locality-Aware Zero-Shot Human-Object Interaction Detection

26 May 2025
Sanghyun Kim
Deunsol Jung
Minsu Cho
    VLM

Papers citing "Locality-Aware Zero-Shot Human-Object Interaction Detection"

50 / 50 papers shown
Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection
Ting Lei
Shaofeng Yin
Yuxin Peng
Yang Liu
VLM
62
6
0
05 Aug 2024
UnionDet: Union-Level Detector Towards Real-Time Human-Object Interaction Detection
Bumsoo Kim
Taeho Choi
Jaewoo Kang
Hyunwoo J. Kim
ObjD
92
146
0
19 Dec 2023
Neural-Logic Human-Object Interaction Detection
Liulei Li
Jianan Wei
Wenguan Wang
Yi Yang
68
17
0
16 Nov 2023
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
Size Wu
Wenwei Zhang
Lumin Xu
Sheng Jin
Xiangtai Li
Wentao Liu
Chen Change Loy
CLIP
VLM
47
71
0
02 Oct 2023
Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels
Bo Wan
Tinne Tuytelaars
VLM
75
4
0
10 Sep 2023
Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory
Ting Lei
Fabian Caba
Qingchao Chen
Hailin Jin
Yuxin Peng
Yang Liu
VLM
62
18
0
07 Sep 2023
CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No
Hualiang Wang
Yi Li
Huifeng Yao
Xuelong Li
VLM
OODD
84
103
0
23 Aug 2023
Relational Context Learning for Human-Object Interaction Detection
Sanghyun Kim
Deunsol Jung
Minsu Cho
68
38
0
11 Apr 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Shan Ning
Longtian Qiu
Yongfei Liu
Xuming He
VLM
63
43
0
28 Mar 2023
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
Ziqi Zhou
Bowen Zhang
Yinjie Lei
Lingqiao Liu
Yifan Liu
VLM
47
171
0
07 Dec 2022
RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection
Hangjie Yuan
Jianwen Jiang
Samuel Albanie
Tao Feng
Ziyuan Huang
Dong Ni
Mingqian Tang
VLM
54
52
0
05 Sep 2022
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
VLM
68
162
0
25 Aug 2022
Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection
Y. Zhang
Yingwei Pan
Ting Yao
Rui Huang
Tao Mei
C. Chen
ViT
58
69
0
13 Jun 2022
Vision Transformer Adapter for Dense Predictions
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
59
552
0
17 May 2022
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT
AI4TS
63
265
0
14 Apr 2022
GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection
Yue Liao
Aixi Zhang
Miao Lu
Yongliang Wang
Xiaobo Li
Si Liu
VLM
40
126
0
26 Mar 2022
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows
Danyang Tu
Xiongkuo Min
Huiyu Duan
G. Guo
Guangtao Zhai
Wei Shen
ViT
56
25
0
20 Mar 2022
RegionCLIP: Region-based Language-Image Pretraining
Yiwu Zhong
Jianwei Yang
Pengchuan Zhang
Chunyuan Li
Noel Codella
...
Luowei Zhou
Xiyang Dai
Lu Yuan
Yin Li
Jianfeng Gao
VLM
CLIP
95
568
0
16 Dec 2021
Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer
Frederic Z. Zhang
Dylan Campbell
Stephen Gould
ViT
32
104
0
03 Dec 2021
Mining the Benefits of Two-stage and One-stage HOI Detection
Aixi Zhang
Yue Liao
Si Liu
Miao Lu
Yongliang Wang
Chen Gao
Xiaobo Li
57
146
0
11 Aug 2021
HOTR: End-to-End Human-Object Interaction Detection with Transformers
Bumsoo Kim
Junhyun Lee
Jaewoo Kang
Eun-Sol Kim
Hyunwoo J. Kim
ViT
66
254
0
28 Apr 2021
Affordance Transfer Learning for Human-Object Interaction Detection
Zhi Hou
Baosheng Yu
Yu Qiao
Xiaojiang Peng
Dacheng Tao
49
105
0
07 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
288
21,051
0
25 Mar 2021
Detecting Human-Object Interaction via Fabricated Compositional Learning
Zhi Hou
B. Yu
Yu Qiao
Xiaojiang Peng
Dacheng Tao
86
97
0
15 Mar 2021
Reformulating HOI Detection as Adaptive Set Prediction
Mingfei Chen
Yue Liao
Si Liu
Zhiyuan Chen
Fei Wang
Chen Qian
57
143
0
10 Mar 2021
QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information
Masato Tamura
Hiroki Ohashi
Tomoaki Yoshinaga
58
209
0
09 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
666
28,659
0
26 Feb 2021
Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Sangwoong Yoon
Woo-Young Kang
Sungwook Jeon
SeongEun Lee
C. Han
Jonghun Park
Eun-Sol Kim
3DH
61
42
0
29 Dec 2020
Spatially Conditioned Graphs for Detecting Human-Object Interactions
Frederic Z. Zhang
Dylan Campbell
Stephen Gould
44
126
0
11 Dec 2020
HOI Analysis: Integrating and Decomposing Human-Object Interaction
Yong-Lu Li
Xinpeng Liu
Xiaoqian Wu
Yizhuo Li
Cewu Lu
30
122
0
30 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
312
40,217
0
22 Oct 2020
Contextual Heterogeneous Graph Network for Human-Object Interaction Detection
Hai Wang
Weishi Zheng
Yingbiao Ling
44
88
0
20 Oct 2020
DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection
Haoshu Fang
Yichen Xie
Dian Shao
Cewu Lu
32
57
0
02 Oct 2020
DRG: Dual Relation Graph for Human-Object Interaction Detection
Chen Gao
Jiarui Xu
Yuliang Zou
Jia-Bin Huang
58
206
0
26 Aug 2020
ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection
Ye Liu
Junsong Yuan
Chang Wen Chen
132
81
0
14 Aug 2020
Visual Compositional Learning for Human-Object Interaction Detection
Zhi Hou
Xiaojiang Peng
Yu Qiao
Dacheng Tao
VLM
66
182
0
24 Jul 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
275
12,847
0
26 May 2020
VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
Oytun Ulutan
A S M Iftekhar
B. S. Manjunath
89
207
0
11 Mar 2020
PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
Yue Liao
Si Liu
Fei Wang
Yanjie Chen
Chen Qian
Jiashi Feng
99
266
0
30 Dec 2019
Image Captioning: Transforming Objects into Words
Simão Herdade
Armin Kappeler
K. Boakye
Joao Soares
ViT
74
466
0
14 Jun 2019
Stand-Alone Self-Attention in Vision Models
Prajit Ramachandran
Niki Parmar
Ashish Vaswani
Irwan Bello
Anselm Levskaya
Jonathon Shlens
VLM
SLR
ViT
57
1,208
0
13 Jun 2019
Exploring Visual Relationship for Image Captioning
Ting Yao
Yingwei Pan
Yehao Li
Tao Mei
62
830
0
19 Sep 2018
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
Chen Gao
Yuliang Zou
Jia-Bin Huang
40
295
0
30 Aug 2018
Learning Human-Object Interactions by Graph Parsing Neural Networks
Siyuan Qi
Wenguan Wang
Baoxiong Jia
Jianbing Shen
Song-Chun Zhu
GNN
66
535
0
23 Aug 2018
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
298
27,018
0
20 Mar 2017
Learning to Detect Human-Object Interactions
Yu-Wei Chao
Yunfan Liu
Michael Xieyang Liu
Huayi Zeng
Jia Deng
53
504
0
17 Feb 2017
Layer Normalization
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
237
10,412
0
21 Jul 2016
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren
Kaiming He
Ross B. Girshick
Jian Sun
AIMat
ObjD
393
61,900
0
04 Jun 2015
Visual Semantic Role Labeling
Saurabh Gupta
Jitendra Malik
47
405
0
17 May 2015
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin
Michael Maire
Serge J. Belongie
Lubomir Bourdev
Ross B. Girshick
James Hays
Pietro Perona
Deva Ramanan
C. L. Zitnick
Piotr Dollár
ObjD
234
43,290
0
01 May 2014