ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1801.03150
  4. Cited By
Moments in Time Dataset: one million videos for event understanding

Moments in Time Dataset: one million videos for event understanding

9 January 2018
Mathew Monfort
A. Andonian
Bolei Zhou
K. Ramakrishnan
Sarah Adel Bargal
Tom Yan
L. Brown
Quanfu Fan
Dan Gutfreund
Carl Vondrick
A. Oliva
ArXivPDFHTML

Papers citing "Moments in Time Dataset: one million videos for event understanding"

50 / 268 papers shown
Title
Co-STAR: Collaborative Curriculum Self-Training with Adaptive Regularization for Source-Free Video Domain Adaptation
Co-STAR: Collaborative Curriculum Self-Training with Adaptive Regularization for Source-Free Video Domain Adaptation
Amirhossein Dadashzadeh
Parsa Esmati
Majid Mirmehdi
TTA
VLM
48
0
0
15 Apr 2025
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding
Yangliu Hu
Zikai Song
Na Feng
Yawei Luo
Junqing Yu
Yi-Ping Phoebe Chen
Wei Yang
33
0
0
10 Apr 2025
OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition
OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition
Shihao Cheng
Jinlu Zhang
Yue Liu
Zhigang Tu
VLM
39
0
0
30 Mar 2025
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Chengan Che
Chao Wang
Tom Vercauteren
Sophia Tsoka
Luis C. García-Peraza-Herrera
MedIm
43
0
0
25 Mar 2025
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks
Nina Shvetsova
Arsha Nagrani
Bernt Schiele
Hilde Kuehne
Christian Rupprecht
42
0
0
24 Mar 2025
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding
Chongjun Tu
Lin Zhang
Pengtao Chen
Peng Ye
Xianfang Zeng
W. Cheng
Gang Yu
Tao Chen
79
0
0
19 Mar 2025
Quantum EigenGame for excited state calculation
Quantum EigenGame for excited state calculation
David Quiroga
Jason Han
Anastasios Kyrillidis
48
0
0
17 Mar 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
42
0
0
11 Feb 2025
GFG -- Gender-Fair Generation: A CALAMITA Challenge
GFG -- Gender-Fair Generation: A CALAMITA Challenge
Simona Frenda
Andrea Piergentili
Beatrice Savoldi
Marco Madeddu
Martina Rosola
Silvia Casola
Chiara Ferrando
V. Patti
Matteo Negri
L. Bentivogli
37
2
0
31 Dec 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled
  Rule-based Annotations
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
39
3
0
17 Oct 2024
Towards Synthetic Data Generation for Improved Pain Recognition in
  Videos under Patient Constraints
Towards Synthetic Data Generation for Improved Pain Recognition in Videos under Patient Constraints
Jonas Nasimzada
Jens Kleesiek
Ken Herrmann
Alina Roitberg
C. Seibold
13
0
0
24 Sep 2024
Fine-grained length controllable video captioning with ordinal
  embeddings
Fine-grained length controllable video captioning with ordinal embeddings
Tomoya Nitta
Takumi Fukuzawa
Toru Tamaki
40
0
0
27 Aug 2024
Fairness and Bias Mitigation in Computer Vision: A Survey
Fairness and Bias Mitigation in Computer Vision: A Survey
Sepehr Dehdashtian
Ruozhen He
Yi Li
Guha Balakrishnan
Nuno Vasconcelos
Vicente Ordonez
Vishnu Naresh Boddeti
29
4
0
05 Aug 2024
Consent in Crisis: The Rapid Decline of the AI Data Commons
Consent in Crisis: The Rapid Decline of the AI Data Commons
Shayne Longpre
Robert Mahari
Ariel N. Lee
Campbell Lund
Hamidah Oderinwale
...
Hanlin Li
Daphne Ippolito
Sara Hooker
Jad Kabbara
Sandy Pentland
69
35
0
20 Jul 2024
A Survey of Video Datasets for Grounded Event Understanding
A Survey of Video Datasets for Grounded Event Understanding
Kate Sanders
Benjamin Van Durme
32
4
0
14 Jun 2024
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
Junke Wang
Yi-Xin Jiang
Zehuan Yuan
Binyue Peng
Zuxuan Wu
Yu-Gang Jiang
ViT
VGen
78
36
0
13 Jun 2024
Video-Language Understanding: A Survey from Model Architecture, Model
  Training, and Data Perspectives
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Thong Nguyen
Yi Bin
Junbin Xiao
Leigang Qu
Yicong Li
Jay Zhangjie Wu
Cong-Duy Nguyen
See-Kiong Ng
Luu Anh Tuan
VLM
41
9
1
09 Jun 2024
Hierarchical Action Recognition: A Contrastive Video-Language Approach
  with Hierarchical Interactions
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions
Rui Zhang
Shuailong Li
Junxiao Xue
Feng Lin
Qing Zhang
Xiao Ma
Xiaoran Yan
29
0
0
28 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
H. Kalviainen
49
3
0
21 May 2024
Incorporating Clinical Guidelines through Adapting Multi-modal Large
  Language Model for Prostate Cancer PI-RADS Scoring
Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring
Tiantian Zhang
Manxi Lin
Hongda Guo
Xiaofan Zhang
Ka Fung Peter Chiu
Aasa Feragen
Qi Dou
34
1
0
14 May 2024
360+x: A Panoptic Multi-modal Scene Understanding Dataset
360+x: A Panoptic Multi-modal Scene Understanding Dataset
Hao Chen
Yuqi Hou
Chenyuan Qu
Irene Testini
Xiaohan Hong
Jianbo Jiao
29
6
0
01 Apr 2024
InternVideo2: Scaling Video Foundation Models for Multimodal Video
  Understanding
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding
Yi Wang
Kunchang Li
Xinhao Li
Jiashuo Yu
Yinan He
...
Hongjie Zhang
Yifei Huang
Yu Qiao
Yali Wang
Limin Wang
34
44
0
22 Mar 2024
Transfer Learning for Cross-dataset Isolated Sign Language Recognition
  in Under-Resourced Datasets
Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets
A. Kındıroglu
Ozgur Kara
Ogulcan Özdemir
L. Akarun
SLR
29
3
0
21 Mar 2024
Sora as an AGI World Model? A Complete Survey on Text-to-Video
  Generation
Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation
Joseph Cho
Fachrina Dewi Puspitasari
Sheng Zheng
Jingyao Zheng
Lik-Hang Lee
Tae-Ho Kim
Choong Seon Hong
Chaoning Zhang
EGVM
VGen
36
40
0
08 Mar 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
29
29
0
20 Feb 2024
Can Text-to-image Model Assist Multi-modal Learning for Visual
  Recognition with Visual Modality Missing?
Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?
Tiantian Feng
Daniel Yang
Digbalay Bose
Shrikanth Narayanan
32
4
0
14 Feb 2024
Advancing Human Action Recognition with Foundation Models trained on
  Unlabeled Public Videos
Advancing Human Action Recognition with Foundation Models trained on Unlabeled Public Videos
Yang Qian
Yinan Sun
A. Kargarandehkordi
Parnian Azizian
O. Mutlu
Saimourya Surabhi
Pingyi Chen
Zain Jabbar
Dennis Paul Wall
Peter Washington
OffRL
21
1
0
14 Feb 2024
Computer Vision for Primate Behavior Analysis in the Wild
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
F. Worgotter
Alexander S. Ecker
28
3
0
29 Jan 2024
GTAutoAct: An Automatic Datasets Generation Framework Based on Game
  Engine Redevelopment for Action Recognition
GTAutoAct: An Automatic Datasets Generation Framework Based on Game Engine Redevelopment for Action Recognition
Xingyu Song
Zhan Li
Shi Chen
K. Demachi
27
1
0
24 Jan 2024
Multi-modal News Understanding with Professionally Labelled Videos
  (ReutersViLNews)
Multi-modal News Understanding with Professionally Labelled Videos (ReutersViLNews)
Shih-Han Chou
Matthew Kowal
Yasmin Niknam
Diana Moyano
Shayaan Mehdi
...
Cheng Zhang
Ian Knopke
S. Kocak
Leonid Sigal
Yalda Mohsenzadeh
33
1
0
23 Jan 2024
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot
  Action Recognition
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
Jiaming Zhou
Junwei Liang
Kun-Yu Lin
Jinrui Yang
Wei-Shi Zheng
VLM
16
8
0
22 Jan 2024
ActAnywhere: Subject-Aware Video Background Generation
ActAnywhere: Subject-Aware Video Background Generation
Boxiao Pan
Zhan Xu
Chun-Hao Paul Huang
Krishna Kumar Singh
Yang Zhou
Leonidas J. Guibas
Jimei Yang
VGen
DiffM
24
3
0
19 Jan 2024
Open-Vocabulary Video Relation Extraction
Open-Vocabulary Video Relation Extraction
Wentao Tian
Zheng Wang
Yu Fu
Jingjing Chen
Lechao Cheng
23
2
0
25 Dec 2023
Video Recognition in Portrait Mode
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
28
3
0
21 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
37
4
0
05 Dec 2023
Sequential Modeling Enables Scalable Learning for Large Vision Models
Sequential Modeling Enables Scalable Learning for Large Vision Models
Yutong Bai
Xinyang Geng
K. Mangalam
Amir Bar
Alan Yuille
Trevor Darrell
Jitendra Malik
Alexei A. Efros
MLLM
VLM
22
152
0
01 Dec 2023
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
...
Jilan Xu
Guo Chen
Ping Luo
Limin Wang
Yu Qiao
VLM
MLLM
56
398
0
28 Nov 2023
Learning Human Action Recognition Representations Without Real Humans
Learning Human Action Recognition Representations Without Real Humans
Howard Zhong
Samarth Mishra
Donghyun Kim
SouYoung Jin
Rameswar Panda
Hildegard Kuehne
Leonid Karlinsky
Venkatesh Saligrama
Aude Oliva
Rogerio Feris
24
3
0
10 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
21
64
0
07 Nov 2023
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab
Jieming Cui
Ziren Gong
Baoxiong Jia
Siyuan Huang
Zilong Zheng
Jianzhu Ma
Yixin Zhu
25
3
0
01 Nov 2023
How Physics and Background Attributes Impact Video Transformers in
  Robotic Manipulation: A Case Study on Planar Pushing
How Physics and Background Attributes Impact Video Transformers in Robotic Manipulation: A Case Study on Planar Pushing
Shutong Jin
Ruiyu Wang
Muhammad Zahid
Florian T. Pokorny
26
1
0
03 Oct 2023
CPR-Coach: Recognizing Composite Error Actions based on Single-class
  Training
CPR-Coach: Recognizing Composite Error Actions based on Single-class Training
Shunli Wang
Qing Yu
Shuai Wang
Dingkang Yang
Liuzhen Su
Xiao Zhao
Haopeng Kuang
Pei Zhang
Peng Zhai
Lihua Zhang
29
3
0
21 Sep 2023
SOAR: Scene-debiasing Open-set Action Recognition
SOAR: Scene-debiasing Open-set Action Recognition
Yuanhao Zhai
Ziyi Liu
Zhenyu Wu
Yi Wu
Chunluan Zhou
David Doermann
Junsong Yuan
Gang Hua
13
11
0
03 Sep 2023
MM-AU:Towards Multimodal Understanding of Advertisement Videos
MM-AU:Towards Multimodal Understanding of Advertisement Videos
Digbalay Bose
Rajat Hebbar
Tiantian Feng
Krishna Somandepalli
Anfeng Xu
Shrikanth Narayanan
25
5
0
27 Aug 2023
Audiovisual Moments in Time: A Large-Scale Annotated Dataset of
  Audiovisual Actions
Audiovisual Moments in Time: A Large-Scale Annotated Dataset of Audiovisual Actions
Michael Joannou
P. Rotshtein
U. Noppeney
13
0
0
18 Aug 2023
The Unreasonable Effectiveness of Large Language-Vision Models for
  Source-free Video Domain Adaptation
The Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation
Giacomo Zara
Alessandro Conti
Subhankar Roy
Stéphane Lathuilière
Paolo Rota
Elisa Ricci
25
11
0
17 Aug 2023
Optical Flow boosts Unsupervised Localization and Segmentation
Optical Flow boosts Unsupervised Localization and Segmentation
Xinyu Zhang
Abdeslam Boularias
18
5
0
25 Jul 2023
Similarity Min-Max: Zero-Shot Day-Night Domain Adaptation
Similarity Min-Max: Zero-Shot Day-Night Domain Adaptation
Run Luo
Wenjing Wang
Wenhan Yang
Jiaying Liu
VLM
46
11
0
17 Jul 2023
A Survey of Deep Learning in Sports Applications: Perception,
  Comprehension, and Decision
A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision
Zhonghan Zhao
Wenhao Chai
Shengyu Hao
Wenhao Hu
Guanhong Wang
Shidong Cao
Min-Gyoo Song
Jenq-Neng Hwang
Gaoang Wang
27
17
0
07 Jul 2023
VideoGLUE: Video General Understanding Evaluation of Foundation Models
VideoGLUE: Video General Understanding Evaluation of Foundation Models
Liangzhe Yuan
N. B. Gundavarapu
Long Zhao
Hao Zhou
Yin Cui
...
Florian Schroff
Hartwig Adam
Ming Yang
Ting Liu
Boqing Gong
ELM
32
9
0
06 Jul 2023
123456
Next