ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2026 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.16860
  4. Cited By
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

24 June 2024
Shengbang Tong
Ellis L Brown
Penghao Wu
Sanghyun Woo
Manoj Middepogu
Sai Charitha Akula
Jihan Yang
Shusheng Yang
Adithya Iyer
Xichen Pan
Austin Wang
Rob Fergus
Yann LeCun
Saining Xie
    3DVMLLM
ArXiv (abs)PDFHTMLHuggingFace (61 upvotes)

Papers citing "Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs"

13 / 413 papers shown
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for
  Interleaved Image-Text Generation
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation
Ethan Chern
Jiadi Su
Yan Ma
Pengfei Liu
MLLM
235
73
0
08 Jul 2024
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded TextInternational Conference on Learning Representations (ICLR), 2024
Tianyu Zhang
Suyuchen Wang
Lu Li
Ge Zhang
Perouz Taslakian
Sai Rajeswar
Jie Fu
Bang Liu
Yoshua Bengio
265
5
0
10 Jun 2024
Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models
Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models
Sangmin Woo
Donguk Kim
Jaehyuk Jang
Yubin Choi
Changick Kim
328
31
0
28 May 2024
A Survey of Multimodal Large Language Model from A Data-centric
  Perspective
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
383
64
0
26 May 2024
Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination
Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination
Dingchen Yang
Bowen Cao
Guang Chen
Changjun Jiang
237
13
0
21 Mar 2024
A Comprehensive Study of Multimodal Large Language Models for Image
  Quality Assessment
A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment
Tianhe Wu
Kede Ma
Jie Liang
Yujiu Yang
Lei Zhang
218
48
0
16 Mar 2024
Yi: Open Foundation Models by 01.AI
Yi: Open Foundation Models by 01.AI
01. AI
Alex Young
01.AI Alex Young
Bei Chen
Chao Li
...
Yue Wang
Yuxuan Cai
Zhenyu Gu
Zhiyuan Liu
Zonghong Dai
OSLMLRM
833
766
0
07 Mar 2024
ImgTrojan: Jailbreaking Vision-Language Models with ONE Image
ImgTrojan: Jailbreaking Vision-Language Models with ONE Image
Xijia Tao
Shuai Zhong
Lei Li
Qi Liu
Lingpeng Kong
393
46
0
05 Mar 2024
What Is Missing in Multilingual Visual Reasoning and How to Fix It
What Is Missing in Multilingual Visual Reasoning and How to Fix It
Yueqi Song
Simran Khanuja
Graham Neubig
VLMLRM
609
8
0
03 Mar 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Jiaming Song
Yu Qiao
Shiyang Feng
MLLM
512
139
0
08 Feb 2024
Dense Video Object Captioning from Disjoint Supervision
Dense Video Object Captioning from Disjoint SupervisionInternational Conference on Learning Representations (ICLR), 2023
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
287
7
0
20 Jun 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
WizardCoder: Empowering Code Large Language Models with Evol-InstructInternational Conference on Learning Representations (ICLR), 2023
Ziyang Luo
Can Xu
Lu Wang
Qingfeng Sun
Xiubo Geng
Wenxiang Hu
Chongyang Tao
Jing Ma
Qingwei Lin
Daxin Jiang
ELMSyDaALM
723
857
0
14 Jun 2023
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App ScreenshotsNorth American Chapter of the Association for Computational Linguistics (NAACL), 2022
Yu-Chung Hsiao
Fedir Zubach
Maria Wang
Jindong Chen
Victor Carbune
Jason Lin
Maria Wang
Yun Zhu
Jindong Chen
RALM
965
46
0
16 Sep 2022
Previous
123456789