2308.12898
Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?
24 August 2023
Fei Wang
Liang Ding
Jun Rao
Ye Liu
Li Shen
Changxing Ding
Papers citing "Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?"
23 / 23 papers shown
Title
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
67
0
0
03 Mar 2025
Distilled Transformers with Locally Enhanced Global Representations for Face Forgery Detection
Yaning Zhang
Qiufu Li
Zitong Yu
L. Shen
ViT
40
3
0
31 Dec 2024
Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models
Fei Wang
Li Shen
Liang Ding
Chao Xue
Ye Liu
Changxing Ding
23
0
0
13 Oct 2024
Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations
Minoh Jeong
Min Namgung
Zae Myung Kim
Dongyeop Kang
Yao-Yi Chiang
Alfred Hero
23
0
0
02 Oct 2024
Unified Lexical Representation for Interpretable Visual-Language Alignment
Yifan Li
Yikai Wang
Yanwei Fu
Dongyu Ru
Zheng-Wei Zhang
Tong He
VLM
27
3
0
25 Jul 2024
HEMM: Holistic Evaluation of Multimodal Foundation Models
Paul Pu Liang
Akshay Goindani
Talha Chafekar
Leena Mathur
Haofei Yu
Ruslan Salakhutdinov
Louis-Philippe Morency
36
10
0
03 Jul 2024
SUGARCREPE++ Dataset: Vision-Language Model Sensitivity to Semantic and Lexical Alterations
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
CoGe
30
8
0
17 Jun 2024
VISLA Benchmark: Evaluating Embedding Sensitivity to Semantic and Lexical Alterations
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
VLM
CoGe
35
0
0
25 Apr 2024
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
Xintong Wang
Jingheng Pan
Liang Ding
Christian Biemann
MLLM
26
52
0
27 Mar 2024
Can 3D Vision-Language Models Truly Understand Natural Language?
Weipeng Deng
Jihan Yang
Runyu Ding
Jiahui Liu
Yijiang Li
Xiaojuan Qi
Edith Ngai
26
4
0
21 Mar 2024
WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge
Wenbin Wang
Liang Ding
Li Shen
Yong Luo
Han Hu
Dacheng Tao
30
11
0
12 Jan 2024
Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models
Vishaal Udandarao
Max F. Burg
Samuel Albanie
Matthias Bethge
VLM
24
6
0
12 Oct 2023
Towards Making the Most of ChatGPT for Machine Translation
Keqin Peng
Liang Ding
Qihuang Zhong
Li Shen
Xuebo Liu
Min Zhang
Y. Ouyang
Dacheng Tao
LRM
81
203
0
24 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Dynamic Contrastive Distillation for Image-Text Retrieval
Jun Rao
Liang Ding
Shuhan Qi
Meng Fang
Yang Liu
Liqiong Shen
Dacheng Tao
VLM
32
30
0
04 Jul 2022
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
29
26
0
30 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Does Vision-and-Language Pretraining Improve Lexical Grounding?
Tian Yun
Chen Sun
Ellie Pavlick
VLM
CoGe
32
30
0
21 Sep 2021
UnNatural Language Inference
Koustuv Sinha
Prasanna Parthasarathi
Joelle Pineau
Adina Williams
203
80
0
30 Dec 2020
Out of Order: How Important Is The Sequential Order of Words in a Sentence in Natural Language Understanding Tasks?
Thang M. Pham
Trung Bui
Long Mai
Anh Totti Nguyen
195
122
0
30 Dec 2020
Understanding and Improving Lexical Choice in Non-Autoregressive Translation
Liang Ding
Longyue Wang
Xuebo Liu
Derek F. Wong
Dacheng Tao
Zhaopeng Tu
91
76
0
29 Dec 2020
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
199
876
0
03 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018