Would Deep Generative Models Amplify Bias in Future Models? | Computer Vision and Pattern Recognition (CVPR), 2024 |
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | European Conference on Computer Vision (ECCV), 2024 |
A Survey on Quality Metrics for Text-to-Image Generation | IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024 |
GiT: Towards Generalist Vision Transformer through Universal Language Interface | European Conference on Computer Vision (ECCV), 2024 |
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric | Computer Vision and Pattern Recognition (CVPR), 2024 |
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models | European Conference on Computer Vision (ECCV), 2024 |