2205.01068
Cited By
v1
v2
v3
v4 (latest)
OPT: Open Pre-trained Transformer Language Models
2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
Papers citing
"OPT: Open Pre-trained Transformer Language Models"
50 / 2,924 papers shown
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression
Neural Information Processing Systems (NeurIPS), 2023
Shuai Li
Zhao Song
Yu Xia
Tong Yu
Wanrong Zhu
201
49
0
26 Apr 2023
The Internal State of an LLM Knows When It's Lying
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
A. Azaria
Tom Michael Mitchell
HILM
649
492
0
26 Apr 2023
SCM: Enhancing Large Language Model with Self-Controlled Memory Framework
Bin Wang
Xinnian Liang
Jian Yang
Huijia Huang
Shuangzhi Wu
Peihao Wu
Lu Lu
Zejun Ma
Zhoujun Li
LLMAG
KELM
RALM
383
63
0
26 Apr 2023
Stable and low-precision training for large-scale vision-language models
Neural Information Processing Systems (NeurIPS), 2023
Mitchell Wortsman
Tim Dettmers
Luke Zettlemoyer
Ari S. Morcos
Ali Farhadi
Ludwig Schmidt
MQ
MLLM
VLM
330
71
0
25 Apr 2023
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
AAAI Conference on Artificial Intelligence (AAAI), 2023
Rongjie Huang
Mingze Li
Dongchao Yang
Jiatong Shi
Xuankai Chang
...
Jia-Bin Huang
Jinglin Liu
Yixiang Ren
Zhou Zhao
Shinji Watanabe
LM&MA
AuLLM
252
335
0
25 Apr 2023
PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques
Mohammed Sabry
Anya Belz
284
9
0
24 Apr 2023
Better Question-Answering Models on a Budget
Yudhanjaya Wijeratne
Ishan Marikar
ALM
66
0
0
24 Apr 2023
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
B. Liu
Yuqian Jiang
Xiaohan Zhang
Qian Liu
Shiqi Zhang
Joydeep Biswas
Peter Stone
LM&Ro
LLMAG
493
547
0
22 Apr 2023
Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism
Xin Chen
Hengheng Zhang
Xiaotao Gu
Kaifeng Bi
Lingxi Xie
Qi Tian
MoE
103
5
0
22 Apr 2023
ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Tianyang Zhong
Yaonai Wei
Li Yang
Zihao Wu
Zheng Liu
...
Xi Jiang
Jun-Feng Han
Hongtu Zhu
Tianming Liu
Tuo Zhang
LRM
192
34
0
21 Apr 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
International Conference on Learning Representations (ICLR), 2023
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM
MLLM
473
2,742
0
20 Apr 2023
Learning to Plan with Natural Language
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yiduo Guo
Yaobo Liang
Chenfei Wu
Wenshan Wu
Dongyan Zhao
Nan Duan
LLMAG
LRM
210
6
0
20 Apr 2023
Attention Scheme Inspired Softmax Regression
Yichuan Deng
Zhihang Li
Zhao Song
293
47
0
20 Apr 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Andrey Kravchenko
LRM
339
111
0
19 Apr 2023
A Theory on Adam Instability in Large-Scale Machine Learning
Igor Molybog
Peter Albert
Moya Chen
Zach DeVito
David Esiobu
...
Puxin Xu
Yuchen Zhang
Melanie Kambadur
Stephen Roller
Susan Zhang
AI4CE
200
47
0
19 Apr 2023
Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Xiuying Wei
Yunchen Zhang
Yuhang Li
Xiangguo Zhang
Yazhe Niu
Jian Ren
Zhengang Li
MQ
279
58
0
18 Apr 2023
Visual Instruction Tuning
Neural Information Processing Systems (NeurIPS), 2023
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
1.2K
7,615
0
17 Apr 2023
LongForm: Effective Instruction Tuning with Reverse Instructions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Abdullatif Köksal
Timo Schick
Anna Korhonen
Hinrich Schütze
SyDa
ALM
279
48
0
17 Apr 2023
Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding
Ziang Xiao
Xingdi Yuan
Q. V. Liao
Rania Abdelghani
Pierre-Yves Oudeyer
203
222
0
17 Apr 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization
Natural Language Processing Journal (JNLP), 2023
Adrian de Wynter
Xun Wang
Alex Sokolov
Qilong Gu
Si-Qing Chen
ELM
335
42
0
17 Apr 2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Yunjie Ji
Yan Gong
Yong Deng
Yiping Peng
Qiang Niu
Baochang Ma
Xiangang Li
ALM
ELM
242
28
0
16 Apr 2023
On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Gengchen Mai
Weiming Huang
Jin Sun
Suhang Song
Deepak Mishra
...
Yingjie Hu
Chris Cundy
Ziyuan Li
Rui Zhu
Ni Lao
AI4CE
320
154
0
13 Apr 2023
ChatGPT Needs SPADE (Sustainability, PrivAcy, Digital divide, and Ethics) Evaluation: A Review
Cognitive Computation (Cogn. Comput.), 2023
Sunder Ali Khowaja
P. Khuwaja
Kapal Dev
Weizheng Wang
Lewis Nkenyereye
402
134
0
13 Apr 2023
Solving Tensor Low Cycle Rank Approximation
BigData Congress [Services Society] (BSS), 2023
Yichuan Deng
Yeqi Gao
Zhao Song
195
7
0
13 Apr 2023
Are LLMs All You Need for Task-Oriented Dialogue?
SIGDIAL Conferences (SIGDIAL), 2023
Vojtěch Hudeček
Ondřej Dušek
227
76
0
13 Apr 2023
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
Wanjun Zhong
Ruixiang Cui
Yiduo Guo
Yaobo Liang
Shuai Lu
Yanlin Wang
Amin Saied
Weizhu Chen
Nan Duan
ALM
ELM
380
731
0
13 Apr 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Neural Information Processing Systems (NeurIPS), 2023
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
581
754
0
12 Apr 2023
HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Jiaying Lu
Jiaming Shen
Bo Xiong
Wenjing Ma
Steffen Staab
Carl Yang
165
16
0
12 Apr 2023
ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Viet Dac Lai
Nghia Trung Ngo
Amir Pouran Ben Veyseh
Hieu Man
Franck Dernoncourt
Trung Bui
Thien Huu Nguyen
ELM
LM&MA
263
362
0
12 Apr 2023
User Adaptive Language Learning Chatbots with a Curriculum
International Conference on Artificial Intelligence in Education (AIED), 2023
Kun Qian
Ryan Shea
Yu Li
Luke K. Fryer
Zhou Yu
199
13
0
11 Apr 2023
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis
Wenhao Zhu
Hongyi Liu
Qingxiu Dong
Jingjing Xu
Shujian Huang
Lingpeng Kong
Jiajun Chen
Lei Li
LRM
373
229
0
10 Apr 2023
Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension
Yichuan Deng
Sridhar Mahadevan
Zhao Song
199
37
0
10 Apr 2023
OpenAGI: When LLM Meets Domain Experts
Neural Information Processing Systems (NeurIPS), 2023
Yingqiang Ge
Qingfeng Lan
Kai Mei
Jianchao Ji
Juntao Tan
Shuyuan Xu
Zelong Li
VLM
LRM
330
310
0
10 Apr 2023
Is ChatGPT a Good Sentiment Analyzer? A Preliminary Study
Zengzhi Wang
Qiming Xie
Yi Feng
Zixiang Ding
Zinong Yang
Rui Xia
AI4MH
LLMAG
332
193
0
10 Apr 2023
A Preliminary Evaluation of ChatGPT for Zero-shot Dialogue Understanding
Wenbo Pan
Qiguang Chen
Xiao Xu
Wanxiang Che
Libo Qin
238
53
0
09 Apr 2023
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder
Z. Fu
W. Lam
Qian Yu
Anthony Man-Cho So
Shengding Hu
Zhiyuan Liu
Nigel Collier
AuLLM
175
61
0
08 Apr 2023
From Retrieval to Generation: Efficient and Effective Entity Set Expansion
International Conference on Information and Knowledge Management (CIKM), 2023
Shulin Huang
Shirong Ma
Yongqian Li
Hai-Tao Zheng
Yong Jiang
Haitao Zheng
Ying Shen
408
6
0
07 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
493
752
0
06 Apr 2023
Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Nolan Dey
Gurpreet Gosal
Zhiming (Charles) Chen
Hemant Khachane
William Marshall
Ribhu Pathria
Marvin Tom
Joel Hestness
MoE
LRM
348
124
0
06 Apr 2023
Zero-Shot Next-Item Recommendation using Large Pretrained Language Models
Lei Wang
Ee-Peng Lim
LRM
165
83
0
06 Apr 2023
Conceptual structure coheres in human cognition but not in large language models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Siddharth Suresh
Kushin Mukherjee
Xizheng Yu
Wei-Chun Huang
Lisa Padua
Timothy T. Rogers
288
13
0
05 Apr 2023
Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks
ACM Transactions on Software Engineering and Methodology (TOSEM), 2023
Michael Weiss
Paolo Tonella
AI4CE
191
1
0
05 Apr 2023
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data
Vladislav Lialin
Stephen Rawls
David M. Chan
Shalini Ghosh
Anna Rumshisky
Wael Hamza
VLM
AI4TS
288
8
0
04 Apr 2023
Effective Theory of Transformers at Initialization
Emily Dinan
Sho Yaida
Susan Zhang
179
20
0
04 Apr 2023
LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zhiqiang Hu
Lei Wang
Yihuai Lan
Wanyu Xu
Ee-Peng Lim
Lidong Bing
Xing Xu
Soujanya Poria
Roy Ka-wei Lee
ALM
318
388
0
04 Apr 2023
Resources and Few-shot Learners for In-context Learning in Slavic Languages
Workshop on Balto-Slavic Natural Language Processing (BSNLP), 2023
Michal Štefánik
Marek Kadlčík
Piotr Gramacki
Petr Sojka
167
5
0
04 Apr 2023
Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks
International Conference on Learning Representations (ICLR), 2023
Yixuan Weng
Minjun Zhu
Fei Xia
Bin Li
Shizhu He
Kang Liu
Jun Zhao
505
12
0
04 Apr 2023
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
International Conference on Machine Learning (ICML), 2023
Stella Biderman
Hailey Schoelkopf
Quentin G. Anthony
Herbie Bradley
Kyle O'Brien
...
USVSN Sai Prashanth
Edward Raff
Aviya Skowron
Lintang Sutawika
Oskar van der Wal
397
1,641
0
03 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
593
113
0
03 Apr 2023
Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts?
Wen Shen
Lei Cheng
Yuxiao Yang
Mingjie Li
Quanshi Zhang
LRM
196
10
0
03 Apr 2023