2008.08072
Cited By
AssembleNet++: Assembling Modality Representations via Attention Connections
18 August 2020
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
Papers citing
"AssembleNet++: Assembling Modality Representations via Attention Connections"
21 / 21 papers shown
Vision Language Models for Dynamic Human Activity Recognition in Healthcare Settings
Abderrazek Abid
Thanh-Cong Ho
Fakhri Karray
VLM
39
0
0
24 Oct 2025
Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
Xiaodan Hu
Chuhang Zou
Suchen Wang
Jaechul Kim
Narendra Ahuja
LRM
106
0
0
20 Jun 2025
Salient Temporal Encoding for Dynamic Scene Graph Generation
Zhihao Zhu
156
0
0
15 Mar 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
215
0
0
11 Feb 2025
Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models
Web Search and Data Mining (WSDM), 2024
Kexin Zhang
Fuyuan Lyu
Xing Tang
Dugang Liu
Chen Ma
Kaize Ding
Xiuqiang He
Xue Liu
173
5
0
24 Nov 2024
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
Computer Vision and Pattern Recognition (CVPR), 2022
A. Piergiovanni
Weicheng Kuo
A. Angelova
ViT
146
66
0
06 Dec 2022
Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling
British Machine Vision Conference (BMVC), 2022
Hsin-Ying Lee
Hung-Ting Su
Bing-Chen Tsai
Tsung-Han Wu
Jia-Fong Yeh
Winston H. Hsu
153
2
0
08 Oct 2022
ViA: View-invariant Skeleton Action Representation Learning via Motion Retargeting
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
159
22
0
31 Aug 2022
Gate-Shift-Fuse for Video Action Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
221
31
0
16 Mar 2022
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
Lezhi Li
Xinyu Gong
Junru Wu
Humphrey Shi
Zhicheng Yan
Zhangyang Wang
VGen
125
1
0
09 Dec 2021
4D-Net for Learned Multi-Modal Alignment
A. Piergiovanni
Vincent Casser
Michael S. Ryoo
A. Angelova
3DPC
203
66
0
02 Sep 2021
Searching for Two-Stream Models in Multivariate Space for Video Recognition
IEEE International Conference on Computer Vision (ICCV), 2021
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
117
9
0
30 Aug 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
British Machine Vision Conference (BMVC), 2021
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
144
55
0
19 Jul 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
368
146
0
21 Jun 2021
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Srijan Das
Rui Dai
Di Yang
Francois Bremond
ViT
150
76
0
17 May 2021
Visionary: Vision architecture discovery for robot learning
IEEE International Conference on Robotics and Automation (ICRA), 2021
Iretiayo Akinola
A. Angelova
Yao Lu
Yevgen Chebotar
Dmitry Kalashnikov
Jacob Varley
Julian Ibarz
Michael S. Ryoo
124
10
0
26 Mar 2021
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
176
202
0
11 Dec 2020
Selective Spatio-Temporal Aggregation Based Pose Refinement System: Towards Understanding Human Activities in Real-World Videos
Di Yang
Rui Dai
Yaohui Wang
Rupayan Mallick
Luca Minciullo
Gianpiero Francesca
Francois Bremond
136
16
0
10 Nov 2020
Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition
Zitong Yu
Benjia Zhou
Jun Wan
Pichao Wang
Zhaodong Sun
Xin Liu
Stan Z. Li
Guoying Zhao
3DPC
153
108
0
21 Aug 2020
Self-supervising Action Recognition by Statistical Moment and Subspace Descriptors
ACM Multimedia (ACM MM), 2020
Lei Wang
Piotr Koniusz
151
55
0
14 Jan 2020
Tiny Video Networks
Applied AI Letters (AA), 2019
A. Piergiovanni
A. Angelova
Michael S. Ryoo
267
49
0
15 Oct 2019