Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1709.07871
Cited By
FiLM: Visual Reasoning with a General Conditioning Layer
22 September 2017
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt
AIMat
OffRL
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"FiLM: Visual Reasoning with a General Conditioning Layer"
50 / 1,304 papers shown
Title
HyperCLIP: Adapting Vision-Language models with Hypernetworks
Victor Akinwande
Mohammad Sadegh Norouzzadeh
Devin Willmott
Anna Bair
Madan Ravi Ganesh
J. Zico Kolter
CLIP
VLM
84
0
0
21 Dec 2024
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Ho Kei Cheng
Masato Ishii
Akio Hayakawa
Takashi Shibuya
A. Schwing
Yuki Mitsufuji
VGen
126
12
0
19 Dec 2024
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
Xiu Yuan
Tongzhou Mu
Stone Tao
Yunhao Fang
Mengke Zhang
H. Su
OffRL
66
0
0
18 Dec 2024
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
Moritz Reuss
Jyothish Pari
Pulkit Agrawal
Rudolf Lioutikov
DiffM
MoE
74
5
0
17 Dec 2024
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
Renqiu Xia
M. Li
Hancheng Ye
Wenjie Wu
Hongbin Zhou
...
Conghui He
Botian Shi
Tao Chen
Junchi Yan
Bo Zhang
82
7
0
16 Dec 2024
Fast and Robust Visuomotor Riemannian Flow Matching Policy
Haoran Ding
Noémie Jaquier
Jan Peters
Leonel Rozo
77
2
0
14 Dec 2024
VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation
Saksham Singh Kushwaha
Yapeng Tian
DiffM
VGen
76
2
0
14 Dec 2024
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
Pengcheng Guo
Xuankai Chang
Hang Lv
Shinji Watanabe
Lei Xie
61
0
0
07 Dec 2024
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Yen-Ju Lu
Jing Liu
Thomas Thebaud
Laureano Moro Velázquez
Ariya Rastrow
Najim Dehak
Jesus Villalba
69
1
0
05 Dec 2024
TASR: Timestep-Aware Diffusion Model for Image Super-Resolution
Qinwei Lin
Xiaopeng Sun
Yu Gao
Yujie Zhong
Dengjie Li
Zheng Zhao
Haoqian Wang
69
0
0
04 Dec 2024
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
Junjie Wen
Minjie Zhu
Y. X. Zhu
Zhibin Tang
Jinming Li
...
Chengmeng Li
Xiaoyu Liu
Yaxin Peng
Chaomin Shen
Feifei Feng
85
13
0
04 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-jun Qi
DiffM
72
4
0
02 Dec 2024
Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures
Alain Riou
Antonin Gagnere
Gaëtan Hadjeres
Stefan Lattner
Geoffroy Peeters
86
0
0
29 Nov 2024
Unpacking the Individual Components of Diffusion Policy
Xiu Yuan
77
0
0
27 Nov 2024
MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers
Ruoxi Zhu
Zhengzhong Tu
Jiaming Liu
A. Bovik
Yibo Fan
ViT
65
7
0
26 Nov 2024
Multi-Resolution Generative Modeling of Human Motion from Limited Data
David Eduardo Moreno-Villamarín
A. Hilsmann
Peter Eisert
DiffM
3DH
81
0
0
25 Nov 2024
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors
Soumava Paul
Prakhar Kaushik
Alan L. Yuille
3DGS
DiffM
129
0
0
24 Nov 2024
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM
LRM
66
2
0
20 Nov 2024
Generating 3D-Consistent Videos from Unposed Internet Photos
Gene Chou
Kai Zhang
Sai Bi
Hao Tan
Zexiang Xu
Fujun Luan
Bharath Hariharan
Noah Snavely
3DGS
VGen
75
3
0
20 Nov 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir
Naznin Haque
Md. Saiful Islam
Marium-E. Jannat
CoGe
29
1
0
17 Nov 2024
TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition
Jeonghyeok Do
Munchurl Kim
44
1
0
16 Nov 2024
NeuralDEM -- Real-time Simulation of Industrial Particulate Flows
Benedikt Alkin
Tobias Kronlachner
Samuele Papa
Stefan Pirker
Thomas Lichtenegger
Johannes Brandstetter
PINN
AI4CE
38
1
1
14 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
58
1
0
12 Nov 2024
Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement
Longbiao Cheng
Ashutosh Pandey
Buye Xu
T. Delbruck
V. Ithapu
Shih-Chii Liu
35
0
0
04 Nov 2024
Music Foundation Model as Generic Booster for Music Downstream Tasks
Weihsiang Liao
Yuhta Takida
Yukara Ikemiya
Zhi-Wei Zhong
Chieh-Hsin Lai
...
Stefan Uhlich
Taketo Akama
Woosung Choi
Yuichiro Koyama
Yuki Mitsufuji
51
0
0
02 Nov 2024
Is Multiple Object Tracking a Matter of Specialization?
G. Mancusi
Mattia Bernardi
Aniello Panariello
Angelo Porrello
Rita Cucchiara
Simone Calderara
MoMe
29
1
0
01 Nov 2024
EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching
Xinwang Chen
Ning Liu
Y. X. Zhu
Feifei Feng
Jian Tang
34
2
0
31 Oct 2024
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference
Changwoo Lee
Soo Min Kwon
Qing Qu
Hun-Seok Kim
25
0
0
28 Oct 2024
Enhancing Lie Detection Accuracy: A Comparative Study of Classic ML, CNN, and GCN Models using Audio-Visual Features
Abdelrahman Abdelwahab
Abdelrahman Abdelwahab
Ayaan Vaswani
Advait Bharathulwar
Arnav Kommaraju
21
1
0
26 Oct 2024
GHIL-Glue: Hierarchical Control with Filtered Subgoal Images
Kyle Hatch
Ashwin Balakrishna
Oier Mees
Suraj Nair
Seohong Park
...
Masha Itkina
Benjamin Eysenbach
Sergey Levine
Thomas Kollar
Benjamin Burchfiel
52
2
0
26 Oct 2024
Considerations for Distribution Shift Robustness of Diagnostic Models in Healthcare
Arno Blaas
Adam Goliñski
Andrew C. Miller
Luca Zappella
J. Jacobsen
Christina Heinze-Deml
OOD
18
0
0
25 Oct 2024
Diffusion for Multi-Embodiment Grasping
Roman Freiberg
Alexander Qualmann
Ngo Anh Vien
Gerhard Neumann
24
3
0
24 Oct 2024
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
Yuang Ai
Xiaoqiang Zhou
Huaibo Huang
Xiaotian Han
Zhengyu Chen
Quanzeng You
Hongxia Yang
42
8
0
24 Oct 2024
Unified Microphone Conversion: Many-to-Many Device Mapping via Feature-wise Linear Modulation
Myeonghoon Ryu
Hongseok Oh
Suji Lee
Han Park
18
0
0
23 Oct 2024
Composing Diffusion Policies for Few-shot Learning of Movement Trajectories
Omkar Patil
Anant Sah
N. Gopalan
DiffM
15
1
0
22 Oct 2024
Allegro: Open the Black Box of Commercial-Level Video Generation Model
Yuan Zhou
Qiuyue Wang
Yuxuan Cai
Huan Yang
VGen
VLM
77
25
0
20 Oct 2024
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration
Yuang Ai
Huaibo Huang
Ran He
30
2
0
20 Oct 2024
FoMo: A Foundation Model for Mobile Traffic Forecasting with Diffusion Model
Haoye Chai
Shiyuan Zhang
Xiaoqian Qi
Yong Li
25
0
0
20 Oct 2024
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation
Shangning Xia
Hongjie Fang
Hao-Shu Fang
Cewu Lu
CML
29
5
0
19 Oct 2024
Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation
Sung-Wook Lee
Yen-Ling Kuo
Yen-Ling Kuo
21
4
0
18 Oct 2024
GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning
Shrishti Saha Shetu
Emanuël A. P. Habets
Andreas Brendel
21
1
0
17 Oct 2024
The Latent Road to Atoms: Backmapping Coarse-grained Protein Structures with Latent Diffusion
Xu Han
Yuancheng Sun
Kai Chen
Kang Liu
Qiwei Ye
DiffM
AI4CE
19
0
0
17 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
71
11
0
17 Oct 2024
UniCoN: Universal Conditional Networks for Multi-Age Embryonic Cartilage Segmentation with Sparsely Annotated Data
Nishchal Sapkota
Yejia Zhang
Zihao Zhao
Maria Gomez
Yuhan Hsi
...
Meng Wu
E. Jabs
J. Richtsmeier
S. M. Perrine
D. Z. Chen
AI4CE
23
0
0
16 Oct 2024
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation
Haechan Mark Bong
Ricardo de Azambuja
Giovanni Beltrame
VLM
31
0
0
16 Oct 2024
Mind the Gap Between Prototypes and Images in Cross-domain Finetuning
Hongduan Tian
Feng Liu
Zhanke Zhou
Tongliang Liu
Chengqi Zhang
Bo Han
VLM
24
1
0
16 Oct 2024
Parametric model reduction of mean-field and stochastic systems via higher-order action matching
Jules Berman
Tobias Blickhan
Benjamin Peherstorfer
26
0
0
15 Oct 2024
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
Ayush Jain
Norio Kosaka
Xinhu Li
Kyung-Min Kim
Erdem Bıyık
Joseph J. Lim
OffRL
16
0
0
15 Oct 2024
On-the-fly Modulation for Balanced Multimodal Learning
Yake Wei
D. Hu
Henghui Du
Ji-Rong Wen
16
7
0
15 Oct 2024
The Ingredients for Robotic Diffusion Transformers
Sudeep Dasari
Oier Mees
Sebastian Zhao
M. K. Srirama
Sergey Levine
48
20
0
14 Oct 2024
Previous
1
2
3
4
5
6
...
25
26
27
Next