Emerging Properties in Self-Supervised Vision Transformers

29 April 2021

Papers citing "Emerging Properties in Self-Supervised Vision Transformers"

seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models Hafez Ghaemi Eilif Muller Shahab Bakhtiari 4 0 0 06 May 2025
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization Wenchuan Wang Mengqi Huang Yijing Tu Zhendong Mao VGen 18 0 0 04 May 2025
Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study Ali Mammadov Loic Le Folgoc Julien Adam Anne Buronfosse Gilles Hayem Guillaume Hocquet Pietro Gori SSL 6 0 0 02 May 2025
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment Edson Araujo Andrew Rouditchenko Yuan Gong Saurabhchand Bhati Samuel Thomas Brian Kingsbury Leonid Karlinsky Rogerio Feris James Glass 2 0 0 02 May 2025
VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models Mohammadreza Teymoorianfard Shiqing Ma Amir Houmansadr WIGM 23 0 0 02 May 2025
InstructAttribute: Fine-grained Object Attributes editing with Instruction Xingxi Yin Jingfeng Zhang Zhi Li Y. Li Y. Zhang DiffM 37 0 0 01 May 2025
Online Federation For Mixtures of Proprietary Agents with Black-Box Encoders Xuwei Yang Fatemeh Tavakoli D. B. Emerson Anastasis Kratsios FedML 42 0 0 30 Apr 2025
Recursive KL Divergence Optimization: A Dynamic Framework for Representation Learning Anthony D Martin 36 0 0 30 Apr 2025
Adept: Annotation-Denoising Auxiliary Tasks with Discrete Cosine Transform Map and Keypoint for Human-Centric Pretraining Weizhen He Yunfeng Yan Shixiang Tang Yiheng Deng Yangyang Zhong Pengxin Luo Donglian Qi VLM 61 1 0 29 Apr 2025
SVD Based Least Squares for X-Ray Pneumonia Classification Using Deep Features Mete Erdogan Sebnem Demirtas 32 0 0 29 Apr 2025
PRISM: Projection-based Reward Integration for Scene-Aware Real-to-Sim-to-Real Transfer with Few Demonstrations Haowen Sun H. Wang Chengzhong Ma Shaolong Zhang Jiawei Ye Xingyu Chen Xuguang Lan OffRL 24 0 0 29 Apr 2025
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer Zechuan Zhang Ji Xie Yu Lu Zongxin Yang Y. Yang DiffM 65 0 0 29 Apr 2025
Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video Sonia Joseph Praneet Suresh Lorenz Hufe Edward Stevinson Robert Graham Yash Vadi Danilo Bzdok Sebastian Lapuschkin Lee Sharkey Blake A. Richards 36 34 0 28 Apr 2025
LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields Zhengqin Li Dilin Wang Ka Chen Zhaoyang Lv Thu Nguyen-Phuoc ... Yufeng Zhu Carl S. Marshall Yufeng Ren Richard A. Newcombe Zhao Dong 3DV 68 100 0 28 Apr 2025
Enhancing breast cancer detection on screening mammogram using self-supervised learning and a hybrid deep model of Swin Transformer and Convolutional Neural Network Han Chen Anne L. Martel 31 31 0 28 Apr 2025
Do You Know the Way? Human-in-the-Loop Understanding for Fast Traversability Estimation in Mobile Robotics Andre Schreiber Katherine Rose Driggs-Campbell 34 36 0 28 Apr 2025
CompleteMe: Reference-based Human Image Completion Yu-Ju Tsai Brian L. Price Qing Liu Luis Figueroa D. Pakhomov Zhihong Ding Scott D. Cohen Ming Yang 3DH 33 41 0 28 Apr 2025
Taming the Randomness: Towards Label-Preserving Cropping in Contrastive Learning Mohamed Hassan Mohammad Wasil Sebastian Houben 33 0 0 28 Apr 2025
MERA: Multimodal and Multiscale Self-Explanatory Model with Considerably Reduced Annotation for Lung Nodule Diagnosis Jiahao Lu Chong Yin Silvia Ingala Kenny Erleben M. Nielsen S. Darkner 41 54 0 27 Apr 2025
CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis Alexander Baumann Leonardo Ayala S. Jan Sellner Alexander Studier-Fischer Berkin Özdemir Lena Maier-Hein Slobodan Ilic 29 37 0 27 Apr 2025
OpenFusion++: An Open-vocabulary Real-time Scene Understanding System Xiaofeng Jin Matteo Frosi Matteo Matteucci 27 26 0 27 Apr 2025
Platonic Grounding for Efficient Multimodal Language Models Moulik Choraria Xinbo Wu Akhil Bhimaraju Nitesh Sekhar Yue Wu Xu Zhang Prateek Singhal L. Varshney 44 52 0 27 Apr 2025
Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation Shahad Albastaki Anabia Sohail I. I. Ganapathi B. Alawode Asim Khan Sajid Javed N. Werghi Mohammed Bennamoun Arif Mahmood 43 96 0 26 Apr 2025
Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models Patrick Müller Alexander Braun M. Keuper 42 0 0 25 Apr 2025
CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss D. Pitawela Gustavo Carneiro Hsiang-Ting Chen 17 0 0 22 Apr 2025
ForesightNav: Learning Scene Imagination for Efficient Exploration Hardik Shah Jiaxu Xing Nico Messikommer Boyang Sun Marc Pollefeys Davide Scaramuzza 48 0 0 22 Apr 2025
Search is All You Need for Few-shot Anomaly Detection Qishan Wang Jia Guo Shuyong Gao H. Wang Li Xiong J. Hu Hanqi Guo Wenqiang Zhang 26 0 0 16 Apr 2025
Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes Huijie Liu Bingcan Wang Jie Hu Xiaoming Wei Guoliang Kang 33 0 0 14 Apr 2025
Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation Thomas Kerdreux A. Tuel Quentin Febvre A. Mouche Bertrand Chapron 45 0 0 09 Apr 2025
VideoGen-Eval: Agent-based System for Video Generation Evaluation Yuhang Yang Ke Fan S. Hongxiang Li Ailing Zeng FeiLin Han Wei-dong Zhai W. Liu Yang Cao Zheng-jun Zha EGVM VGen 55 0 0 30 Mar 2025
Leveraging Motion Information for Better Self-Supervised Video Correspondence Learning Zihan Zhou Changrui Dai Aibo Song Xiaolin Fang VOS 41 0 0 15 Mar 2025
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches Luca Ciampi Ali Azmoudeh Elif Ecem Akbaba Erdi Sarıtaş Ziya Ata Yazıcı H. K. Ekenel Giuseppe Amato Fabrizio Falchi 70 0 0 31 Jan 2025
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation Sungnyun Kim Sungwoo Cho Sangmin Bae Kangwook Jang Se-Young Yun SSL 53 60 0 23 Jan 2025
Wonderland: Navigating 3D Scenes from a Single Image Hanwen Liang Junli Cao Vidit Goel Guocheng Qian Sergei Korolev Demetri Terzopoulos Konstantinos N. Plataniotis Sergey Tulyakov Jian Ren VGen 109 11 0 16 Dec 2024
ColorEdit: Training-free Image-Guided Color editing with diffusion model Xingxi Yin Zhi Li Jingfeng Zhang Chenglin Li Yin Zhang DiffM 39 0 0 15 Nov 2024
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Cong Wei Zheyang Xiong Weiming Ren Xinrun Du Ge Zhang Wenhu Chen 45 17 0 11 Nov 2024
Variable Bitrate Residual Vector Quantization for Audio Coding Yunkee Chae Woosung Choi Yuhta Takida Junghyun Koo Yukara Ikemiya ... K. Cheuk Marco A. Martínez Ramírez Kyogu Lee Wei-Hsiang Liao Yuki Mitsufuji 40 0 0 08 Oct 2024
S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling Minh-Triet Tran Adrian de Luis Haitao Liao Ying Huang Roy McCann Alan Mantooth Jack Cothren Ngan Le 47 0 0 07 May 2024
Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games Lukas Schäfer Logan Jones Anssi Kanervisto Yuhan Cao Tabish Rashid Raluca Georgescu David Bignell Siddhartha Sen Andrea Trevino Gavito Sam Devlin 58 3 0 04 Dec 2023
MENTOR: Human Perception-Guided Pretraining for Increased Generalization Colton R. Crum Adam Czajka 33 1 0 30 Oct 2023
An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration Hiroki Naganuma Ryuichiro Hataya Kotaro Yoshida Ioannis Mitliagkas OODD 56 1 0 17 Jul 2023
Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination Methods Mohammad Alkhalefi Georgios Leontidis Min Zhong SSL 54 2 0 28 Jun 2023
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration Zhiqiang Shen Zechun Liu Jie Qin Lei Huang Kwang-Ting Cheng Marios Savvides UQCV SSL MQ 218 19 0 17 Feb 2021
SEED: Self-supervised Distillation For Visual Representation Zhiyuan Fang Jianfeng Wang Lijuan Wang Lei Zhang Yezhou Yang Zicheng Liu SSL 213 166 0 12 Jan 2021
BYOL works even without batch statistics Pierre Harvey Richemond Jean-Bastien Grill Florent Altché Corentin Tallec Florian Strub ... Samuel L. Smith Soham De Razvan Pascanu Bilal Piot Michal Valko SSL 211 104 0 20 Oct 2020
Meta Pseudo Labels Hieu H. Pham Zihang Dai Qizhe Xie Minh-Thang Luong Quoc V. Le VLM 230 583 0 23 Mar 2020
Improved Baselines with Momentum Contrastive Learning Xinlei Chen Haoqi Fan Ross B. Girshick Kaiming He SSL 213 3,029 0 09 Mar 2020
Boosting Self-Supervised Learning via Knowledge Transfer M. Noroozi Ananth Vinjimoor Paolo Favaro Hamed Pirsiavash SSL 186 282 0 01 May 2018
Large scale distributed neural network training through online distillation Rohan Anil Gabriel Pereyra Alexandre Passos Róbert Ormándi George E. Dahl Geoffrey E. Hinton FedML 241 381 0 09 Apr 2018
OpenNMT: Open-Source Toolkit for Neural Machine Translation Guillaume Klein Yoon Kim Yuntian Deng Jean Senellart Alexander M. Rush 229 1,863 0 10 Jan 2017