Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
Computer Vision and Pattern Recognition (CVPR), 2022
ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network
IEEE Access, 2022
Searching for Two-Stream Models in Multivariate Space for Video Recognition
IEEE International Conference on Computer Vision (ICCV), 2021