arXiv: 2401.15071

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

26 January 2024

Wanli Ouyang

Yu Qiao

Papers citing "From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities"

14 / 14 papers shown

Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities

325

0

0

02 Dec 2025

Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models

235

6

0

07 Jul 2025

Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection

328

18

0

03 Jul 2025

Dynamic Chain-of-Thought: Towards Adaptive Deep Reasoning

1.2K

5

0

07 Feb 2025

Wormhole Memory: A Rubik's Cube for Cross-Dialogue Retrieval

1.3K

0

0

24 Jan 2025

Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering

326

14

0

31 Dec 2024

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

...

Toshihiko Yamasaki

Kiyoharu Aizawa

521

39

0

31 Jul 2024

Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks

320

0

0

25 Jul 2024

CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

Neural Information Processing Systems (NeurIPS), 2024

...

Zongyuan Ge

Gang Li

Huaxiu Yao

320

74

0

10 Jun 2024

A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence

386

9

0

24 May 2024

Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2024

Shengbang Tong

...

435

169

0

16 May 2024

Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective

602

41

0

27 Mar 2024

Assessment of Multimodal Large Language Models in Alignment with Human Values

Yu Qiao

280

37

0

26 Mar 2024

Review of Generative AI Methods in Cybersecurity

William J. Buchanan

Madjid G Tehrani

Leandros A. Maglaras

518

46

0

13 Mar 2024
