ResearchTrend.AI
Learning Transferable Visual Models From Natural Language Supervision


26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
    CLIP
    VLM

Papers citing "Learning Transferable Visual Models From Natural Language Supervision"

50 / 8,869 papers shown
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering
Zhou Yu
Xuecheng Ouyang
Zhenwei Shao
Mei Wang
Jun Yu
MLLM
89
11
0
03 Mar 2023
Computational Language Acquisition with Theory of Mind
Andy Liu
Hao Zhu
Emmy Liu
Yonatan Bisk
Graham Neubig
LLMAG
AI4CE
16
18
0
02 Mar 2023
MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation
Zongtao He
Liuyi Wang
Shu Li
Qingqing Yan
Chengju Liu
Qi Chen
10
7
0
02 Mar 2023
Token Contrast for Weakly-Supervised Semantic Segmentation
Lixiang Ru
Heliang Zheng
Yibing Zhan
Bo Du
ViT
35
86
0
02 Mar 2023
X&Fuse: Fusing Visual Information in Text-to-Image Generation
Yuval Kirstain
Omer Levy
Adam Polyak
DiffM
19
5
0
02 Mar 2023
Coarse-to-Fine Covid-19 Segmentation via Vision-Language Alignment
Dandan Shan
Zihan Li
Wentao Chen
Qingde Li
Jie Tian
Qingqi Hong
6
8
0
01 Mar 2023
Collage Diffusion
Vishnu Sarukkai
Linden Li
Arden Ma
Christopher Ré
Kayvon Fatahalian
DiffM
15
23
0
01 Mar 2023
Convolutional Visual Prompt for Robust Visual Perception
Yun-Yun Tsai
Chengzhi Mao
Junfeng Yang
VLM
VPVLM
29
13
0
01 Mar 2023
Applying Plain Transformers to Real-World Point Clouds
Lanxiao Li
M. Heizmann
3DPC
ViT
16
3
0
28 Feb 2023
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain
Sravanti Addepalli
P. Sahu
Priyam Dey
R. Venkatesh Babu
MoMe
OOD
30
20
0
28 Feb 2023
Linear Spaces of Meanings: Compositional Structures in Vision-Language Models
Matthew Trager
Pramuditha Perera
L. Zancato
Alessandro Achille
Parminder Bhatia
Stefano Soatto
CoGe
19
30
0
28 Feb 2023
Task-Oriented Grasp Prediction with Visual-Language Inputs
Chao Tang
Dehao Huang
Lingxiao Meng
Weiyu Liu
Hong Zhang
15
33
0
28 Feb 2023
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking
Chang-Shu Liu
Yinpeng Dong
Wenzhao Xiang
X. Yang
Hang Su
Junyi Zhu
YueFeng Chen
Yuan He
H. Xue
Shibao Zheng
OOD
VLM
AAML
15
72
0
28 Feb 2023
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang
Arsha Nagrani
Paul Hongsuck Seo
Antoine Miech
Jordi Pont-Tuset
Ivan Laptev
Josef Sivic
Cordelia Schmid
AI4TS
VLM
23
220
0
27 Feb 2023
Knowledge-enhanced Visual-Language Pre-training on Chest Radiology Images
Xiaoman Zhang
Chaoyi Wu
Ya-Qin Zhang
Yanfeng Wang
Weidi Xie
MedIm
19
119
0
27 Feb 2023
The Role of Pre-training Data in Transfer Learning
R. Entezari
Mitchell Wortsman
O. Saukh
M. Shariatnia
Hanie Sedghi
Ludwig Schmidt
38
20
0
27 Feb 2023
TOT: Topology-Aware Optimal Transport For Multimodal Hate Detection
Linhao Zhang
Li Jin
Xian Sun
Guangluan Xu
Zequn Zhang
Xiaoyu Li
Nayu Liu
Qing Liu
Shiyao Yan
28
7
0
27 Feb 2023
LMSeg: Language-guided Multi-dataset Segmentation
Qiang-feng Zhou
Yuang Liu
Chaohui Yu
Jingliang Li
Zhibin Wang
Fan Wang
VLM
13
18
0
27 Feb 2023
Saliency Guided Contrastive Learning on Scene Images
Meilin Chen
Yizhou Wang
Shixiang Tang
Feng Zhu
Haiyang Yang
Lei Bai
Rui Zhao
Donglian Qi
Wanli Ouyang
SSL
13
1
0
22 Feb 2023
ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms
Minzhou Pan
Yi Zeng
Lingjuan Lyu
X. Lin
R. Jia
AAML
13
35
0
22 Feb 2023
Steerable Equivariant Representation Learning
Sangnie Bhardwaj
Willie McClinton
Tongzhou Wang
Guillaume Lajoie
Chen Sun
Phillip Isola
Dilip Krishnan
OOD
LLMSV
26
5
0
22 Feb 2023
Focusing On Targets For Improving Weakly Supervised Visual Grounding
V. Pham
Nao Mishima
ObjD
19
1
0
22 Feb 2023
Connecting Vision and Language with Video Localized Narratives
P. Voigtlaender
Soravit Changpinyo
Jordi Pont-Tuset
Radu Soricut
V. Ferrari
VGen
31
21
0
22 Feb 2023
Poisoning Web-Scale Training Datasets is Practical
Nicholas Carlini
Matthew Jagielski
Christopher A. Choquette-Choo
Daniel Paleka
Will Pearce
Hyrum S. Anderson
Andreas Terzis
Kurt Thomas
Florian Tramèr
SILM
24
181
0
20 Feb 2023
CISum: Learning Cross-modality Interaction to Enhance Multimodal Semantic Coverage for Multimodal Summarization
Litian Zhang
Xiaoming Zhang
Ziming Guo
Zhipeng Liu
19
7
0
20 Feb 2023
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen
Y. Qu
Michael Backes
Yang Zhang
22
31
0
20 Feb 2023
Affect-Conditioned Image Generation
F. Ibarrola
R. Lulham
Kazjon Grace
DiffM
27
3
0
20 Feb 2023
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training
Weihong Zhong
Mao Zheng
Duyu Tang
Xuan Luo
Heng Gong
Xiaocheng Feng
Bing Qin
25
8
0
20 Feb 2023
A Picture May Be Worth a Thousand Lives: An Interpretable Artificial Intelligence Strategy for Predictions of Suicide Risk from Social Media Images
Yael Badian
Yaakov Ophir
Refael Tikochinski
Nitay Calderon
A. Klomek
Roi Reichart
17
4
0
19 Feb 2023
Few-shot Multimodal Multitask Multilingual Learning
Aman Chadha
Vinija Jain
34
0
0
19 Feb 2023
RePrompt: Automatic Prompt Editing to Refine AI-Generative Art Towards Precise Expressions
Yunlong Wang
Shuyuan Shen
Brian Y. Lim
28
88
0
19 Feb 2023
Redes Generativas Adversarias (GAN) Fundamentos Teóricos y Aplicaciones
J. D. L. Torre
GAN
21
1
0
18 Feb 2023
VLN-Trans: Translator for the Vision and Language Navigation Agent
Yue Zhang
Parisa Kordjamshidi
25
16
0
18 Feb 2023
Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts
Zhihong Chen
Shizhe Diao
Benyou Wang
Guanbin Li
Xiang Wan
MedIm
17
29
0
17 Feb 2023
Rejecting Cognitivism: Computational Phenomenology for Deep Learning
P. Beckmann
G. Köstner
Ines Hipólito
14
4
0
16 Feb 2023
Text-driven Visual Synthesis with Latent Diffusion Prior
Tingbo Liao
Songwei Ge
Yiran Xu
Yao-Chih Lee
Badour Albahar
Jia-Bin Huang
DiffM
28
6
0
16 Feb 2023
Auditing large language models: a three-layered approach
Jakob Mokander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
29
193
0
16 Feb 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
22
29
0
16 Feb 2023
MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
Omer Bar-Tal
Lior Yariv
Y. Lipman
Tali Dekel
45
364
1
16 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
8
7
0
16 Feb 2023
Learning to Substitute Ingredients in Recipes
Bahare Fatemi
Quentin Duval
Rohit Girdhar
M. Drozdzal
Adriana Romero Soriano
18
7
0
15 Feb 2023
From paintbrush to pixel: A review of deep neural networks in AI-generated art
Anne-Sofie Maerten
Derya Soydaner
30
22
0
14 Feb 2023
A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified?
Kathleen C. Fraser
S. Kiritchenko
I. Nejadgholi
DiffM
24
36
0
14 Feb 2023
Universal Guidance for Diffusion Models
Arpit Bansal
Hong-Min Chu
Avi Schwarzschild
Soumyadip Sengupta
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM
37
242
0
14 Feb 2023
Guiding Pretraining in Reinforcement Learning with Large Language Models
Yuqing Du
Olivia Watkins
Zihan Wang
Cédric Colas
Trevor Darrell
Pieter Abbeel
Abhishek Gupta
Jacob Andreas
LM&Ro
16
171
0
13 Feb 2023
Symbolic Discovery of Optimization Algorithms
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
28
348
0
13 Feb 2023
Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions
Henrik Voigt
J. Hombeck
M. Meuschke
K. Lawonn
Sina Zarrieß
VLM
20
1
0
13 Feb 2023
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
24
49
0
13 Feb 2023
A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
J. Allingham
Jie Jessie Ren
Michael W. Dusenberry
Xiuye Gu
Yin Cui
Dustin Tran
J. Liu
Balaji Lakshminarayanan
LLMAG
VLM
19
32
0
13 Feb 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
29
3
0
13 Feb 2023