
What matters when building vision-language models? Neural Information Processing Systems (NeurIPS), 2024 |
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model. Computer Vision and Pattern Recognition (CVPR), 2024 |
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training. International Conference on Machine Learning (ICML), 2023 |
Win-Win: Training High-Resolution Vision Transformers from Two Windows. International Conference on Learning Representations (ICLR), 2023 |
Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023 |
DropCompute: simple and more robust distributed synchronous training via compute variance reduction. Neural Information Processing Systems (NeurIPS), 2023 |