ResearchTrend.AI
ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

23 December 2021
Shuohuan Wang
Yu Sun
Yang Xiang
Zhihua Wu
Siyu Ding
Weibao Gong
Shi Feng
Junyuan Shang
Yanbin Zhao
Chao Pang
Jiaxiang Liu
Xuyi Chen
Yuxiang Lu
Weixin Liu
Xi Wang
Yangfan Bai
Qiuliang Chen
Li Zhao
Shiyong Li
Peng Sun
Dianhai Yu
Yanjun Ma
Hao Tian
Hua-Hong Wu
Tian Wu
Wei Zeng
Ge Li
Wen Gao
Haifeng Wang
    ELM

Papers citing "ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation"

50 / 50 papers shown
Title
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
46
13
0
06 Oct 2024
A Survey of Large Language Models for European Languages
Wazir Ali
S. Pyysalo
39
2
0
27 Aug 2024
Recent Advances in Multi-Choice Machine Reading Comprehension: A Survey on Methods and Datasets
Shima Foolad
Kourosh Kiani
R. Rastgoo
FaML
26
0
0
04 Aug 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
H. Song
SyDa
26
4
0
12 Jul 2024
PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry
Linqing Chen
Weilei Wang
Zilong Bai
Peng Xu
Yan Fang
...
Lisha Zhang
Fu Bian
Zhongkai Ye
Lidong Pei
Changyang Tu
AI4MH
LM&MA
34
2
0
26 Jun 2024
A Survey on Human Preference Learning for Large Language Models
Ruili Jiang
Kehai Chen
Xuefeng Bai
Zhixuan He
Juntao Li
Muyun Yang
Tiejun Zhao
Liqiang Nie
Min Zhang
39
8
0
17 Jun 2024
CRE-LLM: A Domain-Specific Chinese Relation Extraction Framework with Fine-tuned Large Language Model
Zhengpeng Shi
Haoran Luo
LRM
ALM
28
2
0
28 Apr 2024
SOS-1K: A Fine-grained Suicide Risk Classification Dataset for Chinese Social Media Analysis
Hongzhi Qi
Hanfei Liu
Jianqiang Li
Qing Zhao
Wei-dong Zhai
Dan Luo
Tianyu He
Shuo Liu
Bing Xiang Yang
Guanghui Fu
21
1
0
19 Apr 2024
A Two Dimensional Feature Engineering Method for Relation Extraction
Hao Wang
Yanping Chen
Weizhe Yang
Yongbin Qin
Ruizhang Huang
33
1
0
07 Apr 2024
Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation
Chuwen Wang
Shirong Zeng
Cheng Wang
LLMAG
LRM
19
2
0
27 Mar 2024
Pragmatic Competence Evaluation of Large Language Models for Korean
Dojun Park
Jiwoo Lee
Hyeyun Jeong
Seohyun Park
Sungeun Lee
ELM
30
2
0
19 Mar 2024
Who is leading in AI? An analysis of industry AI research
Ben Cottier
T. Besiroglu
David Owen
20
2
0
24 Nov 2023
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
13
25
0
24 Nov 2023
Improving Prompt Tuning with Learned Prompting Layers
Wei Zhu
Ming Tan
VLM
15
1
0
31 Oct 2023
Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models
Wenxuan Wang
Wenxiang Jiao
Jingyuan Huang
Ruyi Dai
Jen-tse Huang
Zhaopeng Tu
Michael R. Lyu
14
27
0
19 Oct 2023
CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model
Peng Di
Jianguo Li
Hang Yu
Wei Jiang
Wenting Cai
...
Zelin Zhao
Xunjin Zheng
Hailian Zhou
Lifu Zhu
Xianying Zhu
ELM
ALM
AI4CE
21
12
0
10 Oct 2023
Can LLM-Generated Misinformation Be Detected?
Canyu Chen
Kai Shu
DeLMO
27
144
0
25 Sep 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
LI DU
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
42
21
0
07 Sep 2023
Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media
Hongzhi Qi
Qing Zhao
Jianqiang Li
Changwei Song
Wei-dong Zhai
...
Y. Yu
Fan Wang
Huijing Zou
Bing Xiang Yang
Guanghui Fu
AI4MH
15
12
0
07 Sep 2023
Enhance Multi-domain Sentiment Analysis of Review Texts through Prompting Strategies
Yajing Wang
Zongwei Luo
LRM
6
5
0
05 Sep 2023
UniSA: Unified Generative Framework for Sentiment Analysis
Zaijing Li
Ting-En Lin
Yuchuan Wu
Meng Liu
Fengxiao Tang
Mingde Zhao
Yongbin Li
21
16
0
04 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Fengxiang Bie
Yibo Yang
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
S. Song
EGVM
15
15
0
02 Sep 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Saeed Mian
OffRL
43
499
0
12 Jul 2023
GPT-SW3: An Autoregressive Language Model for the Nordic Languages
Ariel Ekgren
Amaru Cuba Gyllensten
Felix Stollenwerk
Joey Öhman
T. Isbister
Evangelia Gogoulou
F. Carlsson
Alice Heiman
Judit Casademont
Magnus Sahlgren
10
13
0
22 May 2023
Lifting the Curse of Capacity Gap in Distilling Language Models
Chen Zhang
Yang Yang
Jiahao Liu
Jingang Wang
Yunsen Xian
Benyou Wang
Dawei Song
MoE
12
19
0
20 May 2023
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models
Chuang Liu
Renren Jin
Yuqi Ren
Linhao Yu
Tianyu Dong
...
Peiyi Zhang
Qingqing Lyu
Xiaowen Su
Qun Liu
Deyi Xiong
ELM
ALM
11
24
0
17 May 2023
Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era
Dong Zhang
13
2
0
04 May 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE
LRM
17
98
0
06 Apr 2023
PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
Xiaozhe Ren
Pingyi Zhou
Xinfan Meng
Xinjing Huang
Yadao Wang
...
Jiansheng Wei
Xin Jiang
Teng Su
Qun Liu
Jun Yao
ALM
MoE
53
59
0
20 Mar 2023
MetaAID 2.0: An Extensible Framework for Developing Metaverse Applications via Human-controllable Pre-trained Models
Hongyin Zhu
12
5
0
25 Feb 2023
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World
Hongpeng Lin
Ludan Ruan
Wenke Xia
Peiyu Liu
Jing Wen
...
Di Hu
Ruihua Song
Wayne Xin Zhao
Qin Jin
Zhiwu Lu
VGen
16
9
0
14 Jan 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
22
2,297
0
09 Nov 2022
PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation
Siqi Bao
H. He
Jun Xu
Hua Lu
Fan Wang
Hua-Hong Wu
Han Zhou
Wenquan Wu
Zheng-Yu Niu
Haifeng Wang
14
4
0
02 Nov 2022
Clip-Tuning: Towards Derivative-free Prompt Learning with a Mixture of Rewards
Yekun Chai
Shuohuan Wang
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
VLM
11
17
0
21 Oct 2022
Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts
Xiangyang Liu
Tianxiang Sun
Xuanjing Huang
Xipeng Qiu
VLM
15
27
0
20 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
240
1,070
0
05 Oct 2022
WeLM: A Well-Read Pre-trained Language Model for Chinese
Hui Su
Xiao Zhou
Houjin Yu
Xiaoyu Shen
Yuwen Chen
Zilin Zhu
Yang Yu
Jie Zhou
11
22
0
21 Sep 2022
A Theoretical View on Sparsely Activated Networks
Cenk Baykal
Nishanth Dikkala
Rina Panigrahy
Cyrus Rashtchian
Xin Wang
14
10
0
08 Aug 2022
Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters
Yang Xiang
Zhihua Wu
Weibao Gong
Siyu Ding
Xianjie Mo
...
Yue Yu
Ge Li
Yu Sun
Yanjun Ma
Dianhai Yu
6
4
0
19 May 2022
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training
Yuxian Gu
Jiaxin Wen
Hao-Lun Sun
Yi Song
Pei Ke
...
Zheng Zhang
Jianzhu Yao
Lei Liu
Xiaoyan Zhu
Minlie Huang
16
55
0
17 Mar 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph
Irwan Bello
Sameer Kumar
Nan Du
Yanping Huang
J. Dean
Noam M. Shazeer
W. Fedus
MoE
11
97
0
17 Feb 2022
HeterPS: Distributed Deep Learning With Reinforcement Learning Based Scheduling in Heterogeneous Environments
Ji Liu
Zhihua Wu
Dianhai Yu
Yanjun Ma
Danlei Feng
Minxu Zhang
Xinxuan Wu
Xuefeng Yao
Dejing Dou
8
43
0
20 Nov 2021
M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin
An Yang
Jinze Bai
Chang Zhou
Le Jiang
...
Jie M. Zhang
Yong Li
Wei Lin
Jingren Zhou
Hongxia Yang
MoE
84
42
0
08 Oct 2021
PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation
Siqi Bao
H. He
Fan Wang
Hua-Hong Wu
Haifeng Wang
...
Xinxian Huang
Xin Tian
Xinchao Xu
Yingzhan Lin
Zhengyu Niu
VLM
ALM
13
56
0
20 Sep 2021
Tailor: Generating and Perturbing Text with Semantic Controls
Alexis Ross
Tongshuang Wu
Hao Peng
Matthew E. Peters
Matt Gardner
116
77
0
15 Jul 2021
ZEN 2.0: Continue Training and Adaption for N-gram Enhanced Text Encoders
Yan Song
Tong Zhang
Yonggang Wang
Kai-Fu Lee
24
42
0
04 May 2021
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
236
626
0
21 Apr 2021
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding
Junyuan Shang
Shuohuan Wang
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
60
52
0
31 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
251
1,382
0
21 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,791
0
17 Sep 2019