OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM, OSLM, AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,924 papers shown
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression
Neural Information Processing Systems (NeurIPS), 2023
Shuai Li
Zhao Song
Yu Xia
Tong Yu
Wanrong Zhu
201
49
0
26 Apr 2023
The Internal State of an LLM Knows When It's Lying
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
A. Azaria
Tom Michael Mitchell
HILM
649
492
0
26 Apr 2023
SCM: Enhancing Large Language Model with Self-Controlled Memory Framework
Bin Wang
Xinnian Liang
Jian Yang
Huijia Huang
Shuangzhi Wu
Peihao Wu
Lu Lu
Zejun Ma
Zhoujun Li
LLMAG, KELM, RALM
383
63
0
26 Apr 2023
Stable and low-precision training for large-scale vision-language models
Neural Information Processing Systems (NeurIPS), 2023
Mitchell Wortsman
Tim Dettmers
Luke Zettlemoyer
Ari S. Morcos
Ali Farhadi
Ludwig Schmidt
MQ, MLLM, VLM
330
71
0
25 Apr 2023
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
AAAI Conference on Artificial Intelligence (AAAI), 2023
Rongjie Huang
Mingze Li
Dongchao Yang
Jiatong Shi
Xuankai Chang
...
Jia-Bin Huang
Jinglin Liu
Yixiang Ren
Zhou Zhao
Shinji Watanabe
LM&MA, AuLLM
252
335
0
25 Apr 2023
PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques
Mohammed Sabry
Anya Belz
284
9
0
24 Apr 2023
Better Question-Answering Models on a Budget
Yudhanjaya Wijeratne
Ishan Marikar
ALM
66
0
0
24 Apr 2023
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
B. Liu
Yuqian Jiang
Xiaohan Zhang
Qian Liu
Shiqi Zhang
Joydeep Biswas
Peter Stone
LM&Ro, LLMAG
493
547
0
22 Apr 2023
Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism
Xin Chen
Hengheng Zhang
Xiaotao Gu
Kaifeng Bi
Lingxi Xie
Qi Tian
MoE
103
5
0
22 Apr 2023
ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Tianyang Zhong
Yaonai Wei
Li Yang
Zihao Wu
Zheng Liu
...
Xi Jiang
Jun-Feng Han
Hongtu Zhu
Tianming Liu
Tuo Zhang
LRM
192
34
0
21 Apr 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
International Conference on Learning Representations (ICLR), 2023
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM, MLLM
473
2,742
0
20 Apr 2023
Learning to Plan with Natural Language
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yiduo Guo
Yaobo Liang
Chenfei Wu
Wenshan Wu
Dongyan Zhao
Nan Duan
LLMAG, LRM
210
6
0
20 Apr 2023
Attention Scheme Inspired Softmax Regression
Yichuan Deng
Zhihang Li
Zhao Song
293
47
0
20 Apr 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Andrey Kravchenko
LRM
339
111
0
19 Apr 2023
A Theory on Adam Instability in Large-Scale Machine Learning
Igor Molybog
Peter Albert
Moya Chen
Zach DeVito
David Esiobu
...
Puxin Xu
Yuchen Zhang
Melanie Kambadur
Stephen Roller
Susan Zhang
AI4CE
200
47
0
19 Apr 2023
Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Xiuying Wei
Yunchen Zhang
Yuhang Li
Xiangguo Zhang
Yazhe Niu
Jian Ren
Zhengang Li
MQ
279
58
0
18 Apr 2023
Visual Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa, VLM, MLLM
1.2K
7,615
0
17 Apr 2023
LongForm: Effective Instruction Tuning with Reverse Instructions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Abdullatif Köksal
Timo Schick
Anna Korhonen
Hinrich Schütze
SyDa, ALM
279
48
0
17 Apr 2023
Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding
Ziang Xiao
Xingdi Yuan
Q. V. Liao
Rania Abdelghani
Pierre-Yves Oudeyer
203
222
0
17 Apr 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization
Natural Language Processing Journal (JNLP), 2023
Adrian de Wynter
Xun Wang
Alex Sokolov
Qilong Gu
Si-Qing Chen
ELM
335
42
0
17 Apr 2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Yunjie Ji
Yan Gong
Yong Deng
Yiping Peng
Qiang Niu
Baochang Ma
Xiangang Li
ALM, ELM
242
28
0
16 Apr 2023
On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Gengchen Mai
Weiming Huang
Jin Sun
Suhang Song
Deepak Mishra
...
Yingjie Hu
Chris Cundy
Ziyuan Li
Rui Zhu
Ni Lao
AI4CE
320
154
0
13 Apr 2023
ChatGPT Needs SPADE (Sustainability, PrivAcy, Digital divide, and Ethics) Evaluation: A Review
Cognitive Computation (Cogn. Comput.), 2023
Sunder Ali Khowaja
P. Khuwaja
Kapal Dev
Weizheng Wang
Lewis Nkenyereye
402
134
0
13 Apr 2023
Solving Tensor Low Cycle Rank Approximation
BigData Congress [Services Society] (BSS), 2023
Yichuan Deng
Yeqi Gao
Zhao Song
195
7
0
13 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue?
SIGDIAL Conferences (SIGDIAL), 2023
Vojtěch Hudeček
Ondřej Dušek
227
76
0
13 Apr 2023
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
Wanjun Zhong
Ruixiang Cui
Yiduo Guo
Yaobo Liang
Shuai Lu
Yanlin Wang
Amin Saied
Weizhu Chen
Nan Duan
ALM, ELM
380
731
0
13 Apr 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Neural Information Processing Systems (NeurIPS), 2023
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
581
754
0
12 Apr 2023
HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Jiaying Lu
Jiaming Shen
Bo Xiong
Wenjing Ma
Steffen Staab
Carl Yang
165
16
0
12 Apr 2023
ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Viet Dac Lai
Nghia Trung Ngo
Amir Pouran Ben Veyseh
Hieu Man
Franck Dernoncourt
Trung Bui
Thien Huu Nguyen
ELM, LM&MA
263
362
0
12 Apr 2023
User Adaptive Language Learning Chatbots with a Curriculum
International Conference on Artificial Intelligence in Education (AIED), 2023
Kun Qian
Ryan Shea
Yu Li
Luke K. Fryer
Zhou Yu
199
13
0
11 Apr 2023
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis
Wenhao Zhu
Hongyi Liu
Qingxiu Dong
Jingjing Xu
Shujian Huang
Lingpeng Kong
Jiajun Chen
Lei Li
LRM
373
229
0
10 Apr 2023
Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension
Yichuan Deng
Sridhar Mahadevan
Zhao Song
199
37
0
10 Apr 2023
OpenAGI: When LLM Meets Domain Experts
Neural Information Processing Systems (NeurIPS), 2023
Yingqiang Ge
Qingfeng Lan
Kai Mei
Jianchao Ji
Juntao Tan
Shuyuan Xu
Zelong Li
VLM, LRM
330
310
0
10 Apr 2023
Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study
Zengzhi Wang
Qiming Xie
Yi Feng
Zixiang Ding
Zinong Yang
Rui Xia
AI4MH, LLMAG
332
193
0
10 Apr 2023
A Preliminary Evaluation of ChatGPT for Zero-shot Dialogue Understanding
Wenbo Pan
Qiguang Chen
Xiao Xu
Wanxiang Che
Libo Qin
238
53
0
09 Apr 2023
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder
Z. Fu
W. Lam
Qian Yu
Anthony Man-Cho So
Shengding Hu
Zhiyuan Liu
Nigel Collier
AuLLM
175
61
0
08 Apr 2023
From Retrieval to Generation: Efficient and Effective Entity Set Expansion
International Conference on Information and Knowledge Management (CIKM), 2023
Shulin Huang
Shirong Ma
Yongqian Li
Hai-Tao Zheng
Yong Jiang
Haitao Zheng
Ying Shen
408
6
0
07 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa, ALM, LM&MA
493
752
0
06 Apr 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE, LRM
348
124
0
06 Apr 2023
Zero-Shot Next-Item Recommendation using Large Pretrained Language Models
Lei Wang
Ee-Peng Lim
LRM
165
83
0
06 Apr 2023
Conceptual structure coheres in human cognition but not in large language models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Siddharth Suresh
Kushin Mukherjee
Xizheng Yu
Wei-Chun Huang
Lisa Padua
Timothy T. Rogers
288
13
0
05 Apr 2023
Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks
ACM Transactions on Software Engineering and Methodology (TOSEM), 2023
Michael Weiss
Paolo Tonella
AI4CE
191
1
0
05 Apr 2023
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data
Vladislav Lialin
Stephen Rawls
David M. Chan
Shalini Ghosh
Anna Rumshisky
Wael Hamza
VLM, AI4TS
288
8
0
04 Apr 2023
Effective Theory of Transformers at Initialization
Emily Dinan
Sho Yaida
Susan Zhang
179
20
0
04 Apr 2023
LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zhiqiang Hu
Lei Wang
Yihuai Lan
Wanyu Xu
Ee-Peng Lim
Lidong Bing
Xing Xu
Soujanya Poria
Roy Ka-wei Lee
ALM
318
388
0
04 Apr 2023
Resources and Few-shot Learners for In-context Learning in Slavic Languages
Workshop on Balto-Slavic Natural Language Processing (BSNLP), 2023
Michal Štefánik
Marek Kadlčík
Piotr Gramacki
Petr Sojka
167
5
0
04 Apr 2023
Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks
International Conference on Learning Representations (ICLR), 2023
Yixuan Weng
Minjun Zhu
Fei Xia
Bin Li
Shizhu He
Kang Liu
Jun Zhao
505
12
0
04 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
International Conference on Machine Learning (ICML), 2023
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
397
1,641
0
03 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
593
113
0
03 Apr 2023
Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts?
Wen Shen
Lei Cheng
Yuxiao Yang
Mingjie Li
Quanshi Zhang
LRM
196
10
0
03 Apr 2023