arXiv:2007.09049
Learning to Discretely Compose Reasoning Module Networks for Video Captioning
International Joint Conference on Artificial Intelligence (IJCAI), 2020
17 July 2020
Ganchao Tan
Daqing Liu
Meng Wang
Zhengjun Zha
LRM
Github (79★)
Papers citing "Learning to Discretely Compose Reasoning Module Networks for Video Captioning"
21 / 21 papers shown
Reasoning is All You Need for Video Generalization: A Counterfactual Benchmark with Sub-question Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Qiji Zhou
Yifan Gong
Guangsheng Bao
Hongjie Qiu
Jinqiang Li
Xiangrong Zhu
Huajian Zhang
Yue Zhang
LRM
336
3
0
12 Mar 2025
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Neural Information Processing Systems (NeurIPS), 2024
Wei Wu
Kecheng Zheng
Shuailei Ma
Fan Lu
Yuxin Guo
Yifei Zhang
Wei Chen
Qingpei Guo
Yujun Shen
Zheng-Jun Zha
VLM
533
28
0
07 Oct 2024
Sentiment-oriented Transformer-based Variational Autoencoder Network for Live Video Commenting
Fengyi Fu
Shancheng Fang
Weidong Chen
Zhendong Mao
ViT
VGen
210
9
0
19 Apr 2024
JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups
Simindokht Jahangard
Zhixi Cai
Shiki Wen
Hamid Rezatofighi
209
19
0
06 Apr 2024
Video Captioning with Aggregated Features Based on Dual Graphs and Gated Fusion
Yutao Jin
Yinan Han
Jing Wang
196
2
0
13 Aug 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
712
262
0
12 Jun 2023
ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst
Zijia Zhao
Longteng Guo
Tongtian Yue
Si-Qing Chen
Shuai Shao
Xinxin Zhu
Zehuan Yuan
Jing Liu
MLLM
385
78
0
25 May 2023
TCR: Short Video Title Generation and Cover Selection with Attention Refinement
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2023
Yu
Jiuding Yang
Weidong Guo
Hui Liu
Yu-Syuan Xu
Di Niu
177
5
0
25 Apr 2023
A Review of Deep Learning for Video Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Moloud Abdar
Meenakshi Kollati
Swaraja Kuraparthi
Farhad Pourpanah
Daniel J. McDuff
...
Shuicheng Yan
Abduallah A. Mohamed
Abbas Khosravi
Xiaoshi Zhong
Fatih Porikli
3DV
273
48
0
22 Apr 2023
Spatial-Aware Token for Weakly Supervised Object Localization
IEEE International Conference on Computer Vision (ICCV), 2023
Ping Wu
Wei Zhai
Yang Cao
Jiebo Luo
Zhengjun Zha
WSOL
356
17
0
18 Mar 2023
Grounding 3D Object Affordance from 2D Interactions in Images
IEEE International Conference on Computer Vision (ICCV), 2023
Yuhang Yang
Wei Zhai
Hongcheng Luo
Yang Cao
Jiebo Luo
Zhengjun Zha
378
67
0
18 Mar 2023
Visual Commonsense-aware Representation Network for Video Captioning
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Pengpeng Zeng
Haonan Zhang
Lianli Gao
Xiangpeng Li
Jin Qian
Hengtao Shen
195
25
0
17 Nov 2022
Robustness Analysis of Video-Language Models Against Visual and Language Perturbations
Neural Information Processing Systems (NeurIPS), 2022
Madeline Chantry Schiappa
Shruti Vyas
Hamid Palangi
Yogesh S Rawat
Vibhav Vineet
VLM
657
32
0
05 Jul 2022
Support-set based Multi-modal Representation Enhancement for Video Captioning
IEEE International Conference on Multimedia and Expo (ICME), 2022
Xiaoya Chen
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Hengtao Shen
155
5
0
19 May 2022
Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos
Arnav Chakravarthy
Zhiyuan Fang
Yezhou Yang
184
2
0
28 Apr 2022
Video Captioning: a comparative review of where we are and which could be the route
Computer Vision and Image Understanding (CVIU), 2022
Daniela Moctezuma
Tania A. Ramirez-delreal
Guillermo Ruiz
Othón González-Chávez
291
17
0
12 Apr 2022
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
Yaya Shi
Xu Yang
Haiyang Xu
Chunfen Yuan
Bing Li
Weiming Hu
Zhengjun Zha
298
44
0
17 Nov 2021
Visual-aware Attention Dual-stream Decoder for Video Captioning
Zhixin Sun
Zhuo Zhou
Shuqin Chen
Lin Li
Luo Zhong
226
4
0
16 Oct 2021
Discriminative Latent Semantic Graph for Video Captioning
ACM Multimedia (ACM MM), 2021
Yang Bai
Junyan Wang
Yang Long
Bingzhang Hu
Yang Song
Maurice Pagnucco
Yu Guan
315
33
0
08 Aug 2021
Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language
Hassan Akbari
Hamid Palangi
Jianwei Yang
Sudha Rao
Asli Celikyilmaz
Roland Fernandez
P. Smolensky
Jianfeng Gao
Shih-Fu Chang
259
3
0
18 Nov 2020
Dense Relational Image Captioning via Multi-task Triple-Stream Networks
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
397
39
0
08 Oct 2020