Study and develop models that can generalize to unseen compositions of known concepts.
| Title | Venue | Authors |
|---|---|---|
| The Spatial Blindspot of Vision-Language Models | | Nahid Alam Leema Krishna Murali Siddhant Bharadwaj Patrick Liu Timothy Chung Drishti Sharma Akshata A Kranthi Kiran Wesley Tam Bala Krishna S Vegesna |
| CtD: Composition through Decomposition in Emergent Communication | International Conference on Learning Representations (ICLR), 2026 | Boaz Carmeli Ron Meir Yonatan Belinkov |
| VULCA-Bench: A Multicultural Vision-Language Benchmark for Evaluating Cultural Understanding | | Haorui Yu Ramon Ruiz-Dolz Diji Yang Hang He Fengrui Zhang Qiufeng Yi |
| LitVISTA: A Benchmark for Narrative Orchestration in Literary Text | | Mingzhe Lu Yiwen Wang Yanbing Liu Qi You Chong Liu ...Haoyu Dong Wenyu Zhang Jiarui Zhang Yue Hu Yunpeng Li |
| Boosting Latent Diffusion Models via Disentangled Representation Alignment | | John Page Xuesong Niu Kai Wu Kun Gai |
| V-FAT: Benchmarking Visual Fidelity Against Text-bias | | Ziteng Wang Yujie He Guanliang Li Siqi Yang Jiaqi Xiong Songxiang Liu |
| Eye-Q: A Multilingual Benchmark for Visual Word Puzzle Solving and Image-to-Phrase Reasoning | | Ali Najar Alireza Mirrokni Arshia Izadyari Sadegh Mohammadian Amir Homayoon Sharifizade Asal Meskin Mobin Bagherian Ehsaneddin Asgari |
| Exploring Compositionality in Vision Transformers using Wavelet Representations | | Akshad Shyam Purushottamdas Pranav K Nayak Divya Mehul Rajparia Deekshith Patel Yashmitha Gogineni Konda Reddy Mopuri Sumohana S. Channappayya |
| Same or Not? Enhancing Visual Perception in Vision-Language Models | | Damiano Marsili Aditya Mehta Ryan Y. Lin Georgia Gkioxari |
| VisRes Bench: On Evaluating the Visual Reasoning Capabilities of VLMs | | Brigitta Malagurski Törtei Yasser Dahou Ngoc Dung Huynh Wamiq Reyaz Para Phúc H. Lê Khac Ankit Singh Sofian Chaybouti Sanath Narayan |
| VL4Gaze: Unleashing Vision-Language Models for Gaze Following | | Shijing Wang Chaoqun Cui Yaping Huang Hyung Jin Chang Yihua Cheng |
| Self-Attention with State-Object Weighted Combination for Compositional Zero Shot Learning | | Cheng-Hong Chang Pei-Hsuan Tsai |
| TextEditBench: Evaluating Reasoning-aware Text Editing Beyond Rendering | | Rui Gui Yang Wan Haochen Han Dongxing Mao Fangming Liu Min Li Alex Jinpeng Wang |
| DeX-Portrait: Disentangled and Expressive Portrait Animation via Explicit and Latent Motion Representations | | Yuxiang Shi Zhe Li Yanwen Wang Hao Zhu Xun Cao Ligang Liu |
| From Isolation to Entanglement: When Do Interpretability Methods Identify and Disentangle Known Concepts? | | Aaron Mueller Andrew Lee Shruti Joshi Ekdeep Singh Lubana Dhanya Sridhar Patrik Reizinger |
| FactorPortrait: Controllable Portrait Animation via Disentangled Expression, Pose, and Viewpoint | | Jiapeng Tang Kai Li Chengxiang Yin Liuhao Ge Fei Jiang ...Matthias Nießner Christian Häne Timur Bagautdinov Egor Zakharov Peihong Guo |
| Infinity and Beyond: Compositional Alignment in VAR and Diffusion T2I Models | | Hossein Shahabadi Niki Sepasian Arash Marioriyad Ali Sharifi-Zarchi Mahdieh Soleymani Baghshah |
| Disentangled and Distilled Encoder for Out-of-Distribution Reasoning with Rademacher Guarantees | | Zahra Rahiminasab Michael Yuhas Arvind Easwaran |
| Learning by Analogy: A Causal Framework for Composition Generalization | | Lingjing Kong Shaoan Xie Yang Jiao Yetian Chen Yanhui Guo Simone Shao Yan Gao Guangyi Chen Kun Zhang |
| VisualActBench: Can VLMs See and Act like a Human? | | Daoan Zhang Pai Liu Xiaofei Zhou Yuan Ge Guangchen Lan Jing Bi Christopher Brinton Ehsan Hoque Jiebo Luo |
| Composing Concepts from Images and Videos via Concept-prompt Binding | | Xianghao Kong Zeyu Zhang Yuwei Guo Zhuoran Zhao Songchun Zhang Anyi Rao |
| AgentComp: From Agentic Reasoning to Compositional Mastery in Text-to-Image Models | | Arman Zarei Jiacheng Pan Matthew Gwilliam Soheil Feizi Zhenheng Yang |
| MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition | | Xinyu Wei Kangrui Cen Hongyang Wei Zhen Guo Bairui Li Zeqing Wang Jinrui Zhang Lei Zhang |
| Relational Visual Similarity | | Thao Nguyen Sicheng Mo Krishna Kumar Singh Yilin Wang Jing Shi Nicholas Kolkin Eli Shechtman Yong Jae Lee Yuheng Li |
| VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors | | Wenbo Lyu Yingjun Du Jinglin Zhao Xianton Zhen Ling Shao |
| Inferring Compositional 4D Scenes without Ever Seeing One | | Ahmet Berke Gokmen Ajad Chhatkuli Luc Van Gool Danda Pani Paudel |
| ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images | | Yunfei Zhang Yizhuo He Yuanxun Shao Zhengtao Yao Haoyan Xu Junhao Dong Zhen Yao Zhikang Dong |
| Exact Learning of Weighted Graphs Using Composite Queries | International Workshop on Combinatorial Algorithms (IWOCA), 2025 | |