ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2303.14465
  4. Cited By
Equivariant Similarity for Vision-Language Foundation Models

Equivariant Similarity for Vision-Language Foundation Models

25 March 2023
Tan Wang
Kevin Qinghong Lin
Linjie Li
Chung-Ching Lin
Zhengyuan Yang
Hanwang Zhang
Zicheng Liu
Lijuan Wang
    CoGe
ArXivPDFHTML

Papers citing "Equivariant Similarity for Vision-Language Foundation Models"

12 / 12 papers shown
Title
Multi-Modal Language Models as Text-to-Image Model Evaluators
Multi-Modal Language Models as Text-to-Image Model Evaluators
Jiahui Chen
Candace Ross
Reyhane Askari Hemmat
Koustuv Sinha
Melissa Hall
M. Drozdzal
Adriana Romero-Soriano
EGVM
60
0
0
01 May 2025
VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models
VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models
Dahun Kim
A. Piergiovanni
Ganesh Mallya
A. Angelova
CoGe
34
0
0
04 Apr 2025
Progressive Compositionality in Text-to-Image Generative Models
Progressive Compositionality in Text-to-Image Generative Models
Xu Han
Linghao Jin
Xiaofeng Liu
Paul Pu Liang
CoGe
93
2
0
22 Oct 2024
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
Xiangru Zhu
Penglei Sun
Yaoxian Song
Yanghua Xiao
Zhixu Li
Chengyu Wang
Jun Huang
Bei Yang
Xiaoxiao Xu
EGVM
78
1
0
14 Oct 2024
RankCLIP: Ranking-Consistent Language-Image Pretraining
RankCLIP: Ranking-Consistent Language-Image Pretraining
Yiming Zhang
Zhuokai Zhao
Zhaorun Chen
Zhili Feng
Zenghui Ding
Yining Sun
SSL
VLM
34
7
0
15 Apr 2024
Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language
  Understanding
Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding
Wujian Peng
Sicheng Xie
Zuyao You
Shiyi Lan
Zuxuan Wu
VLM
CoGe
MLLM
21
16
0
30 Nov 2023
CyCLIP: Cyclic Contrastive Language-Image Pretraining
CyCLIP: Cyclic Contrastive Language-Image Pretraining
Shashank Goel
Hritik Bansal
S. Bhatia
Ryan A. Rossi
Vishwa Vinay
Aditya Grover
CLIP
VLM
166
131
0
28 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
M. Bronstein
Joan Bruna
Taco S. Cohen
Petar Velivcković
GNN
163
1,095
0
27 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021
Generic Event Boundary Detection: A Benchmark for Event Segmentation
Generic Event Boundary Detection: A Benchmark for Event Segmentation
Mike Zheng Shou
Stan Weixian Lei
Weiyao Wang
Deepti Ghadiyaram
Matt Feiszli
VOS
74
76
0
26 Jan 2021
Self-supervised Co-training for Video Representation Learning
Self-supervised Co-training for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
198
371
0
19 Oct 2020
1