2410.07073
Cited By
Pixtral 12B
9 October 2024
Pravesh Agrawal
Szymon Antoniak
Emma Bou Hanna
Baptiste Bout
Devendra Singh Chaplot
Jessica Chudnovsky
Diogo Costa
Baudouin De Monicault
Saurabh Garg
Théophile Gervet
Soham Ghosh
Amélie Héliou
Paul Jacob
Albert Q. Jiang
Kartik Khandelwal
Timothée Lacroix
Guillaume Lample
Diego de Las Casas
Thibaut Lavril
Teven Le Scao
Andy Lo
William Marshall
Louis Martin
A. Mensch
Pavankumar Muddireddy
Valera Nemychnikova
Marie Pellat
Patrick von Platen
Nikhil Raghuraman
Baptiste Rozière
Alexandre Sablayrolles
Lucile Saulnier
Romain Sauvestre
Wendy Shang
Roman Soletskyi
Lawrence Stewart
Pierre Stock
Joachim Studnia
Sandeep Subramanian
Sagar Vaze
Thomas Wang
Sophia Yang
VLM
MLLM
Papers citing "Pixtral 12B"
11 / 11 papers shown
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Ruiqi Wang
Hao Zhang
VLM
52
0
0
03 May 2025
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control
Nvidia
Hassan Abu Alhaija
Jose M. Alvarez
Maciej Bala
Tiffany Cai
...
Yuchong Ye
Xiaodong Yang
X. Yang
Xiaohui Zeng
Yu Zeng
VGen
90
1
0
18 Mar 2025
Aligning Multimodal LLM with Human Preference: A Survey
Tao Yu
Y. Zhang
Chaoyou Fu
Junkang Wu
Jinda Lu
...
Qingsong Wen
Z. Zhang
Yan Huang
Liang Wang
T. Tan
76
2
0
18 Mar 2025
TLAC: Two-stage LMM Augmented CLIP for Zero-Shot Classification
Ans Munir
Faisal Z. Qureshi
M. H. Khan
Mohsen Ali
VLM
70
0
0
15 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
77
0
0
11 Mar 2025
Scientific Reasoning: Assessment of Multimodal Generative LLMs
Florian Dreyer
Ekaterina Kolos
Daria Matiash
ReLM
LRM
59
0
0
03 Mar 2025
Chimera: Improving Generalist Model with Domain-Specific Experts
Tianshuo Peng
M. Li
Hongbin Zhou
Renqiu Xia
Renrui Zhang
...
Aojun Zhou
Botian Shi
Tao Chen
Bo Zhang
Xiangyu Yue
84
4
0
08 Dec 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
105
6
0
27 Nov 2024
GIFT: A Framework for Global Interpretable Faithful Textual Explanations of Vision Classifiers
Éloi Zablocki
Valentin Gerard
Amaia Cardiel
Eric Gaussier
Matthieu Cord
Eduardo Valle
69
0
0
23 Nov 2024
Teaching VLMs to Localize Specific Objects from In-context Examples
Sivan Doveh
Nimrod Shabtay
Wei Lin
Eli Schwartz
Hilde Kuehne
...
Leonid Karlinsky
James Glass
Assaf Arbelle
S. Ullman
Muhammad Jehanzeb Mirza
VLM
96
1
0
20 Nov 2024
3DArticCyclists: Generating Synthetic Articulated 8D Pose-Controllable Cyclist Data for Computer Vision Applications
Eduardo R. Corral-Soto
Yang Liu
Tongtong Cao
Y. Ren
Liu Bingbing
44
4
0
14 Oct 2024