ResearchTrend.AI
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

28 April 2023
Peng Gao
Jiaming Han
Renrui Zhang
Ziyi Lin
Shijie Geng
Aojun Zhou
Wei Zhang
Pan Lu
Conghui He
Xiangyu Yue
Hongsheng Li
Yu Qiao
    MLLM

Papers citing "LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model"

50 / 87 papers shown
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving
Zongchuang Zhao
Haoyu Fu
Dingkang Liang
Xin Zhou
Dingyuan Zhang
Hongwei Xie
Bing Wang
Xiang Bai
MLLM
VLM
39
0
0
13 May 2025
Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing
Xianxun Zhu
Wei Zhou
Qika Lin
Hang Yang
Yuqing Wang
VLM
56
0
0
04 May 2025
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Ruiqi Wang
Hao Zhang
VLM
54
0
0
03 May 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
S. Zhang
64
6
0
13 Mar 2025
Adversarial Training for Multimodal Large Language Models against Jailbreak Attacks
Liming Lu
Shuchao Pang
Siyuan Liang
Haotian Zhu
Xiyu Zeng
Aishan Liu
Yunhuai Liu
Yongbin Zhou
AAML
49
1
0
05 Mar 2025
PaCA: Partial Connection Adaptation for Efficient Fine-Tuning
Sunghyeon Woo
Sol Namkung
Sunwoo Lee
Inho Jeong
Beomseok Kim
Dongsuk Jeon
33
0
0
28 Feb 2025
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
Taiyi Wang
Zhihao Wu
Jianheng Liu
Jianye Hao
J. Wang
Kun Shao
OffRL
34
13
0
24 Feb 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
77
8
0
21 Feb 2025
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima
Katrin Renz
Kashyap Chitta
L. Chen
Hanxue Zhang
Chengen Xie
Jens Beißwenger
Ping Luo
Andreas Geiger
Hongyang Li
84
160
0
17 Jan 2025
OneLLM: One Framework to Align All Modalities with Language
Jiaming Han
Kaixiong Gong
Yiyuan Zhang
Jiaqi Wang
Kaipeng Zhang
D. Lin
Yu Qiao
Peng Gao
Xiangyu Yue
MLLM
104
107
0
10 Jan 2025
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
Hongyan Zhi
Peihao Chen
Junyan Li
Shuailei Ma
Xinyu Sun
Tianhang Xiang
Yinjie Lei
Mingkui Tan
Chuang Gan
72
3
0
02 Dec 2024
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples
Baiqi Li
Zhiqiu Lin
Wenxuan Peng
Jean de Dieu Nyandwi
Daniel Jiang
Zixian Ma
Simran Khanuja
Ranjay Krishna
Graham Neubig
Deva Ramanan
AAML
CoGe
VLM
61
20
0
18 Oct 2024
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
83
19
0
15 Oct 2024
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines
Dongzhi Jiang
Renrui Zhang
Ziyu Guo
Yanmin Wu
Jiayi Lei
...
Guanglu Song
Peng Gao
Yu Liu
Chunyuan Li
Hongsheng Li
MLLM
27
16
0
19 Sep 2024
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions
Y. Wang
Hangting Chen
Dongchao Yang
Zhiyong Wu
Xixin Wu
DiffM
40
2
0
19 Sep 2024
IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities
Bin Wang
Chunyu Xie
Dawei Leng
Yuhui Yin
MLLM
45
1
0
23 Aug 2024
FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models
Xiaochen Wang
Jiaqi Wang
Houping Xiao
J. Chen
Fenglong Ma
MedIm
61
7
0
17 Aug 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
Ming-Kuan Wu
Xinyue Cai
Jiayi Ji
Jiale Li
Oucheng Huang
Gen Luo
Hao Fei
Xiaoshuai Sun
Rongrong Ji
MLLM
40
7
0
31 Jul 2024
ViLLa: Video Reasoning Segmentation with Large Language Model
Rongkun Zheng
Lu Qi
Xi Chen
Yi Wang
Kun Wang
Yu Qiao
Hengshuang Zhao
VOS
LRM
52
2
0
18 Jul 2024
GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing
Yisong Xiao
Aishan Liu
QianJia Cheng
Zhenfei Yin
Siyuan Liang
Jiapeng Li
Jing Shao
Xianglong Liu
Dacheng Tao
33
4
0
30 Jun 2024
Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video
Zhengbang Yang
Haotian Xia
Jingxi Li
Zezhi Chen
Zhuangdi Zhu
Weining Shen
ELM
LRM
35
1
0
21 Jun 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
Yongting Zhang
Lu Chen
Guodong Zheng
Yifeng Gao
Rui Zheng
...
Yu Qiao
Xuanjing Huang
Feng Zhao
Tao Gui
Jing Shao
VLM
75
23
0
17 Jun 2024
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning
Chung-Ming Chien
Andros Tjandra
Apoorv Vyas
Matt Le
Bowen Shi
Wei-Ning Hsu
32
0
0
10 Jun 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
59
31
0
07 Jun 2024
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
Zicheng Zhang
H. Wu
Chunyi Li
Yingjie Zhou
Wei Sun
Xiongkuo Min
Zijian Chen
Xiaohong Liu
Weisi Lin
Guangtao Zhai
EGVM
59
15
0
05 Jun 2024
Enhancing Large Vision Language Models with Self-Training on Image Comprehension
Yihe Deng
Pan Lu
Fan Yin
Ziniu Hu
Sheng Shen
James Y. Zou
Kai-Wei Chang
Wei Wang
SyDa
VLM
LRM
36
36
0
30 May 2024
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
Shenghuan Sun
Gregory M. Goldgof
Alexander Schubert
Zhiqing Sun
Thomas Hartvigsen
A. Butte
Ahmed Alaa
LM&MA
27
4
0
29 May 2024
Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation
Ya Lu
Jishnu Jaykumar
Yunhui Guo
Nicholas Ruozzi
Yu Xiang
VLM
ISeg
48
4
0
28 May 2024
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
Zejun Li
Ruipu Luo
Jiwen Zhang
Minghui Qiu
Zhongyu Wei
LRM
MLLM
60
7
0
27 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
Yunxin Li
Shenyuan Jiang
Baotian Hu
Longyue Wang
Wanqi Zhong
Wenhan Luo
Lin Ma
Min-Ling Zhang
MoE
34
28
0
18 May 2024
Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models
Yuchen Hu
Chen Chen
Chengwei Qin
Qiushi Zhu
E. Chng
Ruizhe Li
AuLLM
KELM
36
5
0
16 May 2024
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models
Prannay Kaul
Zhizhong Li
Hao-Yu Yang
Yonatan Dukler
Ashwin Swaminathan
C. Taylor
Stefano Soatto
HILM
49
15
0
08 May 2024
Hallucination of Multimodal Large Language Models: A Survey
Zechen Bai
Pichao Wang
Tianjun Xiao
Tong He
Zongbo Han
Zheng Zhang
Mike Zheng Shou
VLM
LRM
80
139
0
29 Apr 2024
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
An Yan
Zhengyuan Yang
Junda Wu
Wanrong Zhu
Jianwei Yang
...
K. Lin
Jianfeng Wang
Julian McAuley
Jianfeng Gao
Lijuan Wang
LRM
34
12
0
25 Apr 2024
Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training
Mengzhao Jia
Zhihan Zhang
W. Yu
Fangkai Jiao
Meng-Long Jiang
VLM
ReLM
LRM
48
7
0
22 Apr 2024
UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation
Siru Zhong
Xixuan Hao
Yibo Yan
Ying Zhang
Yangqiu Song
Yuxuan Liang
35
8
0
22 Apr 2024
Prioritized Semantic Learning for Zero-shot Instance Navigation
Xander Sun
Louis Lau
Hoyard Zhi
Ronghe Qiu
Junwei Liang
30
8
0
18 Mar 2024
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
Bingqian Lin
Yunshuang Nie
Ziming Wei
Jiaqi Chen
Shikui Ma
Jianhua Han
Hang Xu
Xiaojun Chang
Xiaodan Liang
LM&Ro
LRM
60
19
0
12 Mar 2024
XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution
Yunpeng Qu
Kun Yuan
Kai Zhao
Qizhi Xie
Jinhua Hao
Ming-hui Sun
Chao Zhou
27
16
0
08 Mar 2024
Embodied Understanding of Driving Scenarios
Yunsong Zhou
Linyan Huang
Qingwen Bu
Jia Zeng
Tianyu Li
Hang Qiu
Hongzi Zhu
Minyi Guo
Yu Qiao
Hongyang Li
LM&Ro
55
31
0
07 Mar 2024
Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction
Hao Li
Ying Chen
Yifei Chen
Wenxian Yang
Bowen Ding
Yuchen Han
Liansheng Wang
Rongshan Yu
31
15
0
29 Feb 2024
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
Sunghyeon Woo
Baeseong Park
Byeongwook Kim
Minjung Jo
S. Kwon
Dongsuk Jeon
Dongsoo Lee
57
2
0
27 Feb 2024
Visual Hallucinations of Multi-modal Large Language Models
Wen Huang
Hongbin Liu
Minxin Guo
Neil Zhenqiang Gong
MLLM
VLM
32
24
0
22 Feb 2024
BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models
Xueliang Zhao
Xinting Huang
Tingchen Fu
Qintong Li
Shansan Gong
Lemao Liu
Wei Bi
Lingpeng Kong
LRM
33
1
0
21 Feb 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
126
106
0
08 Feb 2024
LLM4SecHW: Leveraging Domain Specific Large Language Model for Hardware Debugging
Weimin Fu
Kaichen Yang
R. Dutta
Xiaolong Guo
Gang Qu
19
23
0
28 Jan 2024
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Yuchen Hu
Chen Chen
Chao-Han Huck Yang
Ruizhe Li
Chao Zhang
Pin-Yu Chen
Ensiong Chng
25
20
0
19 Jan 2024
Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
Shraman Pramanick
Guangxing Han
Rui Hou
Sayan Nag
Ser-Nam Lim
Nicolas Ballas
Qifan Wang
Rama Chellappa
Amjad Almahairi
VLM
MLLM
38
29
0
19 Dec 2023
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
Yiran Qin
Enshen Zhou
Qichang Liu
Zhen-fei Yin
Lu Sheng
Ruimao Zhang
Yu Qiao
Jing Shao
LM&Ro
20
39
0
12 Dec 2023