ResearchTrend.AI

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

27 February 2024
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
Ruoxi Chen
Zhengqing Yuan
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
    VLM
    VGen
    EGVM

Papers citing "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models"

30 / 30 papers shown
A Rusty Link in the AI Supply Chain: Detecting Evil Configurations in Model Repositories
Ziqi Ding
Qian Fu
Junchen Ding
Gelei Deng
Yi Liu
Yuekang Li
17
9
0
02 May 2025
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations for Synthetic Videos
Zongxia Li
Xiyang Wu
Yubin Qin
Guangyao Shi
Hongyang Du
Dinesh Manocha
Tianyi Zhou
Jordan Boyd-Graber
MLLM
33
63
0
02 May 2025
Simple Visual Artifact Detection in Sora-Generated Videos
Misora Sugiyama
Hirokatsu Kataoka
EGVM
41
22
0
30 Apr 2025
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang
Junli Cao
Vidit Goel
Guocheng Qian
Sergei Korolev
Demetri Terzopoulos
Konstantinos N. Plataniotis
Sergey Tulyakov
Jian Ren
VGen
113
115
0
16 Dec 2024
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input
Zhen Lv
Yangqi Long
Congzhentao Huang
Cao Li
Chengfei Lv
Hao Ren
Dian Zheng
DiffM
VGen
MDE
106
70
0
18 Nov 2024
LT3SD: Latent Trees for 3D Scene Diffusion
Quan Meng
Lei Li
Matthias Nießner
Angela Dai
74
56
0
12 Sep 2024
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Xing-ming Guo
Fangxu Yu
Huan Zhang
Lianhui Qin
Bin Hu
AAML
77
22
0
13 Feb 2024
I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench
Yuan Li
Yue Huang
Yuli Lin
Siyuan Wu
Yao Wan
Lichao Sun
LLMAG
ELM
22
1
0
31 Jan 2024
Lumiere: A Space-Time Diffusion Model for Video Generation
Omer Bar-Tal
Hila Chefer
Omer Tov
Charles Herrmann
Roni Paiss
...
T. Michaeli
Oliver Wang
Deqing Sun
Tali Dekel
Inbar Mosseri
VGen
82
90
0
23 Jan 2024
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Haoxin Chen
Yong Zhang
Xiaodong Cun
Menghan Xia
Xintao Wang
Chao-Liang Weng
Ying Shan
VGen
DiffM
98
75
0
17 Jan 2024
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
M. Steyvers
Yuan Yao
Haoye Zhang
Taiwen He
Yifeng Han
...
Xinyue Hu
Zhiyuan Liu
Hai-Tao Zheng
Maosong Sun
Tat-Seng Chua
MLLM
VLM
104
56
0
01 Dec 2023
AlignBench: Benchmarking Chinese Alignment of Large Language Models
Xiao Liu
Xuanyu Lei
Sheng-Ping Wang
Yue Huang
Zhuoer Feng
...
Hongning Wang
Jing Zhang
Minlie Huang
Yuxiao Dong
Jie Tang
ELM
LM&MA
ALM
95
16
0
30 Nov 2023
Adversarial Diffusion Distillation
Axel Sauer
Dominik Lorenz
A. Blattmann
Robin Rombach
122
121
0
28 Nov 2023
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
...
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
142
321
0
25 Nov 2023
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models
Weifeng Chen
Yatai Ji
Jie Wu
Hefeng Wu
Pan Xie
Jiashi Li
Xin Xia
Xuefeng Xiao
Liang Lin
VGen
100
90
0
23 May 2023
MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
DiffM
104
74
0
25 Mar 2023
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation
Zhengxiong Luo
Dayou Chen
Yingya Zhang
Yan Huang
Liangsheng Wang
Yujun Shen
Deli Zhao
Jinren Zhou
Tien-Ping Tan
DiffM
VGen
111
200
0
15 Mar 2023
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro
DiffM
59
100
0
05 Oct 2022
Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis
Long Zhuo
Guangcong Wang
Shikai Li
Wayne Wu
Ziwei Liu
VGen
31
11
0
11 Jul 2022
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
69
95
0
23 Jun 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
235
328
0
29 May 2022
Flexible Diffusion Modeling of Long Videos
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian Weilbach
Frank D. Wood
DiffM
BDL
VGen
161
207
0
23 May 2022
Planning with Diffusion for Flexible Behavior Synthesis
Michael Janner
Yilun Du
J. Tenenbaum
Sergey Levine
DiffM
180
364
0
20 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
298
7,763
0
04 Mar 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
252
5,353
0
11 Nov 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,436
0
15 Oct 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
270
2,978
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
285
2,730
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
267
1,486
0
09 Feb 2021
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
211
9,999
0
18 May 2015