ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1709.07871
  4. Cited By
FiLM: Visual Reasoning with a General Conditioning Layer

FiLM: Visual Reasoning with a General Conditioning Layer

22 September 2017
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
    FAtt
    AIMat
    OffRL
    AI4CE
ArXivPDFHTML

Papers citing "FiLM: Visual Reasoning with a General Conditioning Layer"

50 / 1,304 papers shown
Title
ReCDAP: Relation-Based Conditional Diffusion with Attention Pooling for Few-Shot Knowledge Graph Completion
ReCDAP: Relation-Based Conditional Diffusion with Attention Pooling for Few-Shot Knowledge Graph Completion
Jeongho Kim
Chanyeong Heo
Jaehee Jung
21
0
0
12 May 2025
Efficient Robotic Policy Learning via Latent Space Backward Planning
Efficient Robotic Policy Learning via Latent Space Backward Planning
Dongxiu Liu
Haoyi Niu
Zhihao Wang
Jinliang Zheng
Yinan Zheng
Zhonghong Ou
Jianming Hu
Jianxiong Li
Xianyuan Zhan
18
0
0
11 May 2025
Automated Learning of Semantic Embedding Representations for Diffusion Models
Automated Learning of Semantic Embedding Representations for Diffusion Models
Limai Jiang
Yunpeng Cai
DiffM
23
0
0
09 May 2025
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks
V. Bhat
Yu-Hsiang Lan
P. Krishnamurthy
Ramesh Karri
Farshad Khorrami
45
0
0
09 May 2025
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
LM&Ro
VLM
94
0
0
08 May 2025
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava
Xiang Zhang
He Wen
Chenru Wen
Zhuowen Tu
DiffM
26
0
0
07 May 2025
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration
Shigeki Karita
Yuma Koizumi
Heiga Zen
Haruko Ishikawa
Robin Scheibler
M. Bacchiani
VLM
106
1
0
07 May 2025
StableMotion: Training Motion Cleanup Models with Unpaired Corrupted Data
StableMotion: Training Motion Cleanup Models with Unpaired Corrupted Data
Yuxuan Mu
Hung Yu Ling
Yi Shi
Ismael Baira Ojeda
Pengcheng Xi
Chang Shu
F. Zinno
Xue Bin Peng
45
0
0
06 May 2025
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
Bernardo Torres
Geoffroy Peeters
G. Richard
41
0
0
06 May 2025
DPNet: Dynamic Pooling Network for Tiny Object Detection
DPNet: Dynamic Pooling Network for Tiny Object Detection
Luqi Gong
Haotian Chen
Y. Chen
Tianliang Yao
Chao Li
Shuai Zhao
Guangjie Han
ObjD
87
0
0
05 May 2025
Voice Cloning: Comprehensive Survey
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
32
0
0
01 May 2025
J-PARSE: Jacobian-based Projection Algorithm for Resolving Singularities Effectively in Inverse Kinematic Control of Serial Manipulators
J-PARSE: Jacobian-based Projection Algorithm for Resolving Singularities Effectively in Inverse Kinematic Control of Serial Manipulators
Shivani Guptasarma
Matthew Strong
HongHao Zhen
Monroe Kennedy III
26
0
0
01 May 2025
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Haifeng Huang
Xinyi Chen
Y. Chen
H. Li
Xiaoshen Han
Z. Wang
Tai Wang
Jiangmiao Pang
Zhou Zhao
LM&Ro
75
0
0
30 Apr 2025
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Haoran Geng
Feishi Wang
Songlin Wei
Y. Li
Bangjun Wang
...
Hao Dong
Siyuan Huang
Yue Wang
Jitendra Malik
Pieter Abbeel
75
3
0
26 Apr 2025
Salient Region-Guided Spacecraft Image Arbitrary-Scale Super-Resolution Network
Salient Region-Guided Spacecraft Image Arbitrary-Scale Super-Resolution Network
J. Yang
Hu Gao
Ying Zhang
Depeng Dang
17
0
0
25 Apr 2025
CIVIL: Causal and Intuitive Visual Imitation Learning
CIVIL: Causal and Intuitive Visual Imitation Learning
Yinlong Dai
Robert Ramirez Sanchez
Ryan Jeronimus
Shahabedin Sagheb
Cara M. Nunez
Heramb Nemlekar
Dylan P. Losey
61
0
0
24 Apr 2025
Multimodal Perception for Goal-oriented Navigation: A Survey
Multimodal Perception for Goal-oriented Navigation: A Survey
I-Tak Ieong
Hao Tang
LM&Ro
LRM
29
0
0
22 Apr 2025
SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation
SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation
Jingkai Xu
Xiangli Nie
35
0
0
22 Apr 2025
FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
Kuanting Wu
Kei Ota
Asako Kanezaki
DiffM
VGen
41
0
0
20 Apr 2025
Simplifying Graph Transformers
Simplifying Graph Transformers
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
Philip H. S. Torr
Mark J. Coates
26
0
0
17 Apr 2025
Towards Forceful Robotic Foundation Models: a Literature Survey
Towards Forceful Robotic Foundation Models: a Literature Survey
William Xie
N. Correll
OffRL
56
0
0
16 Apr 2025
Autoregressive Distillation of Diffusion Transformers
Autoregressive Distillation of Diffusion Transformers
Yeongmin Kim
Sotiris Anagnostidis
Yuming Du
Edgar Schönfeld
Jonas Kohler
Markos Georgopoulos
Albert Pumarola
Ali K. Thabet
A. Sanakoyeu
26
0
0
15 Apr 2025
Efficient Task-specific Conditional Diffusion Policies: Shortcut Model Acceleration and SO(3) Optimization
Efficient Task-specific Conditional Diffusion Policies: Shortcut Model Acceleration and SO(3) Optimization
Haiyong Yu
Yanqiong Jin
Yonghao He
Wei Sui
27
0
0
14 Apr 2025
Prior Does Matter: Visual Navigation via Denoising Diffusion Bridge Models
Prior Does Matter: Visual Navigation via Denoising Diffusion Bridge Models
Hao Ren
Yiming Zeng
Zetong Bi
Zhaoliang Wan
Junlong Huang
Hui Cheng
78
1
0
14 Apr 2025
Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion
Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion
Na Li
Chuke Wang
Yu Gu
Zhifeng Li
54
0
0
11 Apr 2025
Diffusion Models for Robotic Manipulation: A Survey
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
51
1
0
11 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
66
0
0
09 Apr 2025
Robust Fusion Controller: Degradation-aware Image Fusion with Fine-grained Language Instructions
Robust Fusion Controller: Degradation-aware Image Fusion with Fine-grained Language Instructions
Hao Zhang
Yanping Zha
Qingwei Zhuang
Z. Shao
Jiayi Ma
22
0
0
08 Apr 2025
Diff-SSL-G-Comp: Towards a Large-Scale and Diverse Dataset for Virtual Analog Modeling
Diff-SSL-G-Comp: Towards a Large-Scale and Diverse Dataset for Virtual Analog Modeling
Yicheng Gu
Runsong Zhang
Lauri Juvela
Z. Wu
DiffM
76
0
0
06 Apr 2025
PRISM: Probabilistic Representation for Integrated Shape Modeling and Generation
PRISM: Probabilistic Representation for Integrated Shape Modeling and Generation
Lei Cheng
Mahdi Saleh
Qing Cheng
Lu Sang
Hongli Xu
Daniel Cremers
F. Tombari
23
0
0
06 Apr 2025
Compressing 3D Gaussian Splatting by Noise-Substituted Vector Quantization
Compressing 3D Gaussian Splatting by Noise-Substituted Vector Quantization
Haishan Wang
Mohammad Hassan Vali
Arno Solin
3DGS
54
0
0
03 Apr 2025
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
Yash Kulthe
Andrew Gilbert
John Collomosse
38
0
0
03 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao W. Wang
Songruoyao Wu
Jiaxing Yu
K. Zhang
MGen
VGen
65
1
0
01 Apr 2025
Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation
Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation
Abhiram Maddukuri
Z. L. Jiang
L. Chen
Soroush Nasiriany
Yuqi Xie
...
Scott Reed
Ken Goldberg
Ajay Mandlekar
Linxi Fan
Yuke Zhu
59
1
0
31 Mar 2025
Language-Guided Trajectory Traversal in Disentangled Stable Diffusion Latent Space for Factorized Medical Image Generation
Language-Guided Trajectory Traversal in Disentangled Stable Diffusion Latent Space for Factorized Medical Image Generation
Zahra Tehraninasab
Amar Kumar
Tal Arbel
MedIm
54
0
0
30 Mar 2025
Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes
Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes
Binh Thien Nguyen
Masahiro Yasuda
Daiki Takeuchi
Daisuke Niizumi
Yasunori Ohishi
N. Harada
39
0
0
28 Mar 2025
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Rabiul Awal
Maximilian Seitzer
E. Gavves
Aishwarya Agrawal
OCL
VLM
82
2
0
27 Mar 2025
A multi-agentic framework for real-time, autonomous freeform metasurface design
A multi-agentic framework for real-time, autonomous freeform metasurface design
Robert Lupoiu
Yixuan Shao
Tianxiang Dai
Chenkai Mao
Kofi Edee
Jonathan A. Fan
AI4CE
73
0
0
26 Mar 2025
Unpaired Object-Level SAR-to-Optical Image Translation for Aircraft with Keypoints-Guided Diffusion Models
Unpaired Object-Level SAR-to-Optical Image Translation for Aircraft with Keypoints-Guided Diffusion Models
Ruixi You
Hecheng Jia
Feng Xu
DiffM
34
0
0
25 Mar 2025
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy
Zhi Hou
Tianyi Zhang
Yuwen Xiong
Haonan Duan
Hengjun Pu
...
Chengyang Zhao
X. Zhu
Yu Qiao
Jifeng Dai
Y. Chen
59
1
0
25 Mar 2025
RoboFlamingo-Plus: Fusion of Depth and RGB Perception with Vision-Language Models for Enhanced Robotic Manipulation
RoboFlamingo-Plus: Fusion of Depth and RGB Perception with Vision-Language Models for Enhanced Robotic Manipulation
Sheng Wang
VLM
76
2
0
25 Mar 2025
Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition
Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition
Chengxiang Huang
Yake Wei
Zequn Yang
D. Hu
42
0
0
24 Mar 2025
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model
Kangwei Liu
Junwu Liu
Yun Cao
Jinlin Guo
Xiaowei Yi
DiffM
41
0
0
24 Mar 2025
LightLoc: Learning Outdoor LiDAR Localization at Light Speed
LightLoc: Learning Outdoor LiDAR Localization at Light Speed
W. J. Li
Chen Liu
Shangshu Yu
Dunqiang Liu
Yin Zhou
Siqi Shen
Chenglu Wen
Cheng-Yu Wang
41
0
0
22 Mar 2025
PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning
PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning
Yan Zhang
Yao Feng
Alpár Cseke
Nitin Saini
Nathan Bajandas
Nicolas Heron
M. Black
DiffM
VGen
62
0
0
21 Mar 2025
DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation
DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation
Jiangran Lyu
Ziming Li
Xuesong Shi
Chaoyi Xu
Yizhou Wang
He Wang
47
0
0
21 Mar 2025
Real-Time Diffusion Policies for Games: Enhancing Consistency Policies with Q-Ensembles
Real-Time Diffusion Policies for Games: Enhancing Consistency Policies with Q-Ensembles
Ruoqi Zhang
Ziwei Luo
Jens Sjölund
Per Mattsson
Linus Gisslén
Alessandro Sestini
42
0
0
21 Mar 2025
Diffusion-augmented Graph Contrastive Learning for Collaborative Filter
Diffusion-augmented Graph Contrastive Learning for Collaborative Filter
Fan Huang
Wei Wang
DiffM
62
0
0
20 Mar 2025
SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
Hongda Liu
Longguang Wang
Ye Zhang
Ziru Yu
Yulan Guo
Mamba
70
0
0
20 Mar 2025
Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering
Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering
Thanh-Son Nguyen
Hong Yang
Tzeh Yuan Neoh
Hao Zhang
Ee Yeo Keat
Basura Fernando
NAI
54
0
0
19 Mar 2025
1234...252627
Next