2008.01232
Cited By
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
3 August 2020
M. E. Kalfaoglu
Sinan Kalkan
Aydin Alatan
3DPC
Papers citing
"Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition"
50 / 57 papers shown
MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
59
0
0
06 Feb 2025
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
73
0
0
24 Nov 2024
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
Manuel Benavent-Lledo
David Mulero-Pérez
David Ortiz-Perez
José García Rodríguez
Antonis Argyros
24
0
0
28 Oct 2024
MPT-PAR: Mix-Parameters Transformer for Panoramic Activity Recognition
Wenqing Gan
Yaoyu Li
Jian Li
Zhangang Lin
ViT
30
0
0
01 Aug 2024
Pose-guided multi-task video transformer for driver action recognition
Ricardo Pizarro
Roberto Valle
L. Bergasa
J. M. Buenaposada
Luis Baumela
ViT
32
0
0
18 Jul 2024
From CNNs to Transformers in Multimodal Human Action Recognition: A Survey
Muhammad Bilal Shaikh
Syed Mohammed Shamsul Islam
Douglas Chai
Naveed Akhtar
35
9
0
22 May 2024
Continuous Sign Language Recognition with Adapted Conformer via Unsupervised Pretraining
Neena Aloysius
M. Geetha
Prema Nedungadi
SLR
19
2
0
20 May 2024
TransNet: A Transfer Learning-Based Network for Human Action Recognition
Khaled Alomar
Xiaohao Cai
19
1
0
13 Sep 2023
Predicting Routine Object Usage for Proactive Robot Assistance
Maithili Patel
Aswin Prakash
Sonia Chernova
AI4TS
29
8
0
12 Sep 2023
IndGIC: Supervised Action Recognition under Low Illumination
Jing-Teng Zeng
27
1
0
29 Aug 2023
Actor-agnostic Multi-label Action Recognition with Multi-modal Query
Anindya Mondal
Sauradip Nag
J. Prada
Xiatian Zhu
Anjan Dutta
21
9
0
20 Jul 2023
UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning
Heqing Zou
Meng Shen
Chen Chen
Yuchen Hu
D. Rajan
Chng Eng Siong
SSL
32
15
0
16 May 2023
Physical Adversarial Attacks for Surveillance: A Survey
Kien Nguyen Thanh
Tharindu Fernando
Clinton Fookes
S. Sridharan
AAML
25
8
0
01 May 2023
Weakly Supervised Detection of Baby Cry
Weijun Tan
Qi Yao
Jingfeng Liu
8
1
0
19 Apr 2023
Multimodal Data Integration for Oncology in the Era of Deep Neural Networks: A Review
Asim Waqas
Aakash Tripathi
Ravichandran Ramachandran
Paul Stewart
Ghulam Rasool
AI4CE
32
31
0
11 Mar 2023
Capsules as viewpoint learners for human pose estimation
Nicola Garau
Nicola Conci
3DH
14
0
0
13 Feb 2023
Triple-stream Deep Metric Learning of Great Ape Behavioural Actions
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
19
14
0
06 Jan 2023
Transformers in Action Recognition: A Review on Temporal Modeling
Elham Shabaninia
Hossein Nezamabadi-pour
Fatemeh Shafizadegan
ViT
21
8
0
29 Dec 2022
Simultaneous Multiple Object Detection and Pose Estimation using 3D Model Infusion with Monocular Vision
Cong Li
Shijie Sun
Xiangyu Song
Huansheng Song
Naveed Akhtar
Ajmal Saeed Mian
3DPC
22
1
0
21 Nov 2022
A Unified Multimodal De- and Re-coupling Framework for RGB-D Motion Recognition
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
29
17
0
16 Nov 2022
Overlooked Video Classification in Weakly Supervised Video Anomaly Detection
Weijun Tan
Qi Yao
Jingfeng Liu
AI4TS
19
10
0
13 Oct 2022
Vision Transformers for Action Recognition: A Survey
Anwaar Ulhaq
Naveed Akhtar
Ganna Pogrebna
Ajmal Saeed Mian
ViT
19
44
0
13 Sep 2022
Robotic Detection of a Human-Comprehensible Gestural Language for Underwater Multi-Human-Robot Collaboration
Sadman Sakib Enan
Michael Fulton
Junaed Sattar
31
8
0
12 Jul 2022
Two-Stage COVID19 Classification Using BERT Features
Weijun Tan
Qi Yao
Jingfeng Liu
14
9
0
29 Jun 2022
Detection of Fights in Videos: A Comparison Study of Anomaly Detection and Action Recognition
Weijun Tan
Jingfeng Liu
11
8
0
23 May 2022
Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos
Arnav Chakravarthy
Zhiyuan Fang
Yezhou Yang
21
2
0
28 Apr 2022
3D Convolutional Networks for Action Recognition: Application to Sport Gesture Recognition
Pierre-Etienne Martin
J. Benois-Pineau
Renaud Péteri
A. Zemmari
J. Morlier
13
5
0
13 Apr 2022
Going Deeper into Recognizing Actions in Dark Environments: A Comprehensive Benchmark Study
Yuecong Xu
Jianfei Yang
Haozhi Cao
Jianxiong Yin
Zhenghua Chen
Xiaoli Li
Zhengguo Li
Qiaoqiao Xu
35
2
0
19 Feb 2022
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
103
0
16 Jan 2022
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition
Benjia Zhou
Pichao Wang
Jun Wan
Yanyan Liang
Fan Wang
Du Zhang
Zhen Lei
Hao Li
Rong Jin
28
29
0
16 Dec 2021
Evaluating Transformers for Lightweight Action Recognition
Raivo Koot
Markus Hennerbichler
Haiping Lu
ViT
22
8
0
18 Nov 2021
Co-segmentation Inspired Attention Module for Video-based Computer Vision Tasks
Arulkumar Subramaniam
Jayesh Vaidya
Muhammed Ameen
Athira M. Nambiar
Anurag Mittal
14
7
0
14 Nov 2021
Sparse Adversarial Video Attacks with Spatial Transformations
Ronghui Mu
Wenjie Ruan
Leandro Soriano Marcolino
Q. Ni
AAML
17
18
0
10 Nov 2021
Unsupervised View-Invariant Human Posture Representation
Faegheh Sardari
Bjorn Ommer
Majid Mirmehdi
3DH
23
3
0
17 Sep 2021
Deep Learning for Fitness
N. Mahendran
3DH
14
4
0
03 Sep 2021
Multi-Modal Zero-Shot Sign Language Recognition
R. Rastgoo
Kourosh Kiani
Sergio Escalera
Mohammad Sabokrou
SLR
11
5
0
02 Sep 2021
LIGAR: Lightweight General-purpose Action Recognition
Evgeny Izutov
10
3
0
30 Aug 2021
ZS-SLR: Zero-Shot Sign Language Recognition from RGB-D Videos
R. Rastgoo
Kourosh Kiani
Sergio Escalera
SLR
19
10
0
23 Aug 2021
DECA: Deep viewpoint-Equivariant human pose estimation using Capsule Autoencoders
Nicola Garau
N. Bisagno
Piotr Bródka
Nicola Conci
11
27
0
19 Aug 2021
Boosting Salient Object Detection with Transformer-based Asymmetric Bilateral U-Net
Yu Qiu
Yun-Hai Liu
Le Zhang
Jing Xu
ViT
19
30
0
17 Aug 2021
Temporal Action Localization Using Gated Recurrent Units
Hassan Keshvari Khojasteh
Hoda Mohammadzade
H. Behroozi
13
3
0
07 Aug 2021
Federated Action Recognition on Heterogeneous Embedded Devices
Pranjali Jain
Shreyas Goenka
S. Bagchi
Biplab Banerjee
Somali Chaterji
FedML
43
7
0
18 Jul 2021
Training for temporal sparsity in deep neural networks, application in video processing
Amirreza Yousefzadeh
Manolis Sifalakis
14
3
0
15 Jul 2021
Delta Sampling R-BERT for limited data and low-light action recognition
Sanchit Hira
Ritwik Das
Abhinav Modi
D. Pakhomov
75
17
0
12 Jul 2021
Ultrasound Video Transformers for Cardiac Ejection Fraction Estimation
Hadrien Reynaud
Athanasios Vlontzos
Benjamin Hou
A. Beqiri
Paul Leeson
Bernhard Kainz
MedIm
ViT
28
53
0
02 Jul 2021
A 3D CNN Network with BERT For Automatic COVID-19 Diagnosis From CT-Scan Images
Weijun Tan
Jingfeng Liu
3DPC
MedIm
20
17
0
28 Jun 2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
Hao Tan
Jie Lei
Thomas Wolf
Mohit Bansal
31
65
0
21 Jun 2021
What Makes Multi-modal Learning Better than Single (Provably)
Yu Huang
Chenzhuang Du
Zihui Xue
Xuanyao Chen
Hang Zhao
Longbo Huang
23
247
0
08 Jun 2021
Personalizing Pre-trained Models
Mina Khan
P. Srivatsa
Advait Rane
Shriram Chenniappa
A. Hazariwala
Pattie Maes
VLM
39
5
0
02 Jun 2021
An Image is Worth 16x16 Words, What is a Video Worth?
Gilad Sharir
Asaf Noy
Lihi Zelnik-Manor
ViT
19
120
0
25 Mar 2021