AssembleNet++: Assembling Modality Representations via Attention Connections

18 August 2020
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
arXiv: 2008.08072

Papers citing "AssembleNet++: Assembling Modality Representations via Attention Connections"

21 / 21 papers shown
Vision Language Models for Dynamic Human Activity Recognition in Healthcare Settings
Abderrazek Abid
Thanh-Cong Ho
Fakhri Karray
VLM
39
0
0
24 Oct 2025
Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
Xiaodan Hu
Chuhang Zou
Suchen Wang
Jaechul Kim
Narendra Ahuja
LRM
106
0
0
20 Jun 2025
Salient Temporal Encoding for Dynamic Scene Graph Generation
Zhihao Zhu
156
0
0
15 Mar 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
215
0
0
11 Feb 2025
Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models
Web Search and Data Mining (WSDM), 2024
Kexin Zhang
Fuyuan Lyu
Xing Tang
Dugang Liu
Chen Ma
Kaize Ding
Xiuqiang He
Xue Liu
173
5
0
24 Nov 2024
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
Computer Vision and Pattern Recognition (CVPR), 2022
A. Piergiovanni
Weicheng Kuo
A. Angelova
ViT
146
66
0
06 Dec 2022
Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling
British Machine Vision Conference (BMVC), 2022
Hsin-Ying Lee
Hung-Ting Su
Bing-Chen Tsai
Tsung-Han Wu
Jia-Fong Yeh
Winston H. Hsu
153
2
0
08 Oct 2022
ViA: View-invariant Skeleton Action Representation Learning via Motion Retargeting
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
159
22
0
31 Aug 2022
Gate-Shift-Fuse for Video Action Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
221
31
0
16 Mar 2022
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search
Lezhi Li
Xinyu Gong
Junru Wu
Humphrey Shi
Zhicheng Yan
Zhangyang Wang
VGen
125
1
0
09 Dec 2021
4D-Net for Learned Multi-Modal Alignment
A. Piergiovanni
Vincent Casser
Michael S. Ryoo
A. Angelova
3DPC
203
66
0
02 Sep 2021
Searching for Two-Stream Models in Multivariate Space for Video Recognition
IEEE International Conference on Computer Vision (ICCV), 2021
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
117
9
0
30 Aug 2021
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
British Machine Vision Conference (BMVC), 2021
Di Yang
Yaohui Wang
A. Dantcheva
Lorenzo Garattoni
Gianpiero Francesca
Francois Bremond
144
55
0
19 Jul 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
368
146
0
21 Jun 2021
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Srijan Das
Rui Dai
Di Yang
Francois Bremond
ViT
150
76
0
17 May 2021
Visionary: Vision architecture discovery for robot learning
IEEE International Conference on Robotics and Automation (ICRA), 2021
Iretiayo Akinola
A. Angelova
Yao Lu
Yevgen Chebotar
Dmitry Kalashnikov
Jacob Varley
Julian Ibarz
Michael S. Ryoo
124
10
0
26 Mar 2021
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM, AI4TS
176
202
0
11 Dec 2020
Selective Spatio-Temporal Aggregation Based Pose Refinement System: Towards Understanding Human Activities in Real-World Videos
Di Yang
Rui Dai
Yaohui Wang
Rupayan Mallick
Luca Minciullo
Gianpiero Francesca
Francois Bremond
136
16
0
10 Nov 2020
Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition
Zitong Yu
Benjia Zhou
Jun Wan
Pichao Wang
Zhaodong Sun
Xin Liu
Stan Z. Li
Guoying Zhao
3DPC
153
108
0
21 Aug 2020
Self-supervising Action Recognition by Statistical Moment and Subspace Descriptors
ACM Multimedia (ACM MM), 2020
Lei Wang
Piotr Koniusz
151
55
0
14 Jan 2020
Tiny Video Networks
Applied AI Letters (AA), 2019
A. Piergiovanni
A. Angelova
Michael S. Ryoo
267
49
0
15 Oct 2019