arXiv: 2405.04434
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

7 May 2024
DeepSeek-AI
Aixin Liu
Bei Feng
Bin Wang
Bingxuan Wang
Bo Liu
Chenggang Zhao
Chengqi Deng
Chong Ruan
Damai Dai
Daya Guo
Dejian Yang
Deli Chen
Dongjie Ji
Erhang Li
Fangyun Lin
Fuli Luo
Guangbo Hao
Guanting Chen
Guowei Li
Hai-Tao Zhang
Hanwei Xu
Hao-Yu Yang
Haowei Zhang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Li
Hui Qu
J. L. Cai
Jian Liang
Jianzhong Guo
Jiaqi Ni
Jiashi Li
Jin Chen
Jingyang Yuan
Junjie Qiu
Junxiao Song
Kai Dong
Kaige Gao
Kang Guan
Lean Wang
Lecong Zhang
Lei Xu
Leyi Xia
Liang Zhao
Liyue Zhang
Meng Li
Miaojun Wang
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Mingming Li
Ning Tian
Panpan Huang
Peiyi Wang
Peng Zhang
Qihao Zhu
Qinyu Chen
Qiushi Du
R. J. Chen
R. L. Jin
Ruiqi Ge
Ruizhe Pan
Runxin Xu
Ruyi Chen
S. S. Li
Shanghao Lu
Shangyan Zhou
Shanhuang Chen
Shaoqing Wu
Shengfeng Ye
Shirong Ma
Shiyu Wang
Shuang Zhou
Shuiping Yu
Shunfeng Zhou
Size Zheng
Tao Wang
Tian Pei
Tian Yuan
Tianyu Sun
W. L. Xiao
Wangding Zeng
Wei An
Wen Liu
Wenfeng Liang
Wenjun Gao
Wentao Zhang
X. Q. Li
Xiangyue Jin
Xianzu Wang
Xiao Bi
Xiaodong Liu
Xiaohan Wang
Xiaojin Shen
Xiaokang Chen
Xiaosha Chen
Xiaotao Nie
Xiaowen Sun
Xiaoxiang Wang
Xin Liu
Xin Xie
Xingkai Yu
Xinnan Song
Xinyi Zhou
Xinyu Yang
Xuan Lu
Xuecheng Su
Ying Wu
Y. K. Li
Y. X. Wei
Y. X. Zhu
Yanhong Xu
Yanping Huang
Yao Li
Yao-Min Zhao
Yaofeng Sun
Yaohui Li
Yaohui Wang
Yi Zheng
Yichao Zhang
Yiliang Xiong
Yilong Zhao
Ying He
Ying Tang
Yishi Piao
Yixin Dong
Yixuan Tan
Yiyuan Liu
Yongji Wang
Yongqiang Guo
Yuchen Zhu
Yuduan Wang
Yuheng Zou
Yukun Zha
Yunxian Ma
Yuting Yan
Yuxiang You
Yuxuan Liu
Z. Z. Ren
Zehui Ren
Zhangli Sha
Zhe Fu
Zhen Huang
Zhen Zhang
Zhenda Xie
Zhewen Hao
Zhihong Shao
Zhiniu Wen
Zhipeng Xu
Zhongyu Zhang
Zhuoshu Li
Zihan Wang
Zihui Gu
Zilin Li
Ziwei Xie
    MoE

Papers citing "DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model"

24 / 74 papers shown
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
Nan Xu
Xuezhe Ma
LRM
29
3
0
18 Oct 2024
EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference
Yulei Qian
Fengcun Li
Xiangyang Ji
Xiaoyu Zhao
Jianchao Tan
K. Zhang
Xunliang Cai
MoE
47
2
0
16 Oct 2024
Revisiting Benchmark and Assessment: An Agent-based Exploratory Dynamic Evaluation Framework for LLMs
Wanying Wang
Zeyu Ma
Pengfei Liu
Mingang Chen
LLMAG
45
1
0
15 Oct 2024
Round and Round We Go! What makes Rotary Positional Encodings useful?
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Razvan Pascanu
Petar Velickovic
54
16
0
08 Oct 2024
LongGenBench: Long-context Generation Benchmark
Xiang Liu
Peijie Dong
Xuming Hu
Xiaowen Chu
RALM
28
8
0
05 Oct 2024
ColaCare: Enhancing Electronic Health Record Modeling through Large Language Model-Driven Multi-Agent Collaboration
Zixiang Wang
Yinghao Zhu
Huiya Zhao
Xiaochen Zheng
Tianlong Wang
...
Yasha Wang
Ewen M. Harrison
Junyi Gao
Liantao Ma
46
1
0
03 Oct 2024
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng
Han Shi
Xian Liu
Xuefei Ning
Guohao Dai
Yu Wang
Zhenguo Li
Xihui Liu
48
10
0
02 Oct 2024
Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models
Peiyi Zhang
Yazhou Zhang
Bo Wang
Lu Rong
Jing Qin
AI4Ed
ELM
42
0
0
19 Sep 2024
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
Jin Jiang
Yuchen Yan
Yang Liu
Yonggang Jin
Shuai Peng
M. Zhang
Xunliang Cai
Yixin Cao
Liangcai Gao
Zhi Tang
LRM
32
3
0
19 Sep 2024
Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts
Teng Wang
Zhenqi He
Wing-Yin Yu
Xiaojin Fu
Xiongwei Han
LRM
41
5
0
17 Sep 2024
XG-NID: Dual-Modality Network Intrusion Detection using a Heterogeneous Graph Neural Network and Large Language Model
Yasir Ali Farrukh
S. Wali
I. Khan
Nathaniel D. Bastian
38
2
0
27 Aug 2024
Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach
Zhouyu Jiang
Mengshu Sun
Lei Liang
Zhiqiang Zhang
RALM
57
10
0
18 Jul 2024
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
Jiayi Yuan
Hongyi Liu
Shaochen Zhong
Yu-Neng Chuang
...
Hongye Jin
V. Chaudhary
Zhaozhuo Xu
Zirui Liu
Xia Hu
28
17
0
01 Jul 2024
Too Late to Train, Too Early To Use? A Study on Necessity and Viability of Low-Resource Bengali LLMs
Tamzeed Mahfuz
Satak Kumar Dey
Ruwad Naswan
Hasnaen Adil
Khondker Salman Sayeed
Haz Sameen Shahgir
16
0
0
29 Jun 2024
On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
Deng Cai
Huayang Li
Tingchen Fu
Siheng Li
Weiwen Xu
...
Leyang Cui
Yan Wang
Lemao Liu
Taro Watanabe
Shuming Shi
KELM
26
2
0
24 Jun 2024
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo
Minh Chien Vu
Jenny Chim
Han Hu
Wenhao Yu
...
David Lo
Daniel Fried
Xiaoning Du
H. D. Vries
Leandro von Werra
65
125
0
22 Jun 2024
UltraMedical: Building Specialized Generalists in Biomedicine
Kaiyan Zhang
Sihang Zeng
Ermo Hua
Ning Ding
Zhang-Ren Chen
...
Xuekai Zhu
Xingtai Lv
Hu Jinfang
Zhiyuan Liu
Bowen Zhou
LM&MA
39
19
0
06 Jun 2024
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension
Yijiong Yu
Huiqiang Jiang
Xufang Luo
Qianhui Wu
Chin-Yew Lin
Dongsheng Li
Yuqing Yang
Yongfeng Huang
L. Qiu
35
9
0
04 Jun 2024
COMAE: COMprehensive Attribute Exploration for Zero-shot Hashing
Yuqi Li
Qingqing Long
Yihang Zhou
Yuchen Yan
Xiao Luo
Zeyu Dong
Xuezhi Wang
Zhen Meng
Pengfei Wang
VLM
39
3
0
26 Feb 2024
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Keisuke Kamahori
Tian Tang
Yile Gu
Kan Zhu
Baris Kasikci
56
17
0
10 Feb 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
133
298
0
05 Jan 2024
AlignBench: Benchmarking Chinese Alignment of Large Language Models
Xiao Liu
Xuanyu Lei
Sheng-Ping Wang
Yue Huang
Zhuoer Feng
...
Hongning Wang
Jing Zhang
Minlie Huang
Yuxiao Dong
Jie Tang
ELM
LM&MA
ALM
114
41
0
30 Nov 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020