HellaSwag: Can a Machine Really Finish Your Sentence?

Annual Meeting of the Association for Computational Linguistics (ACL), 2019
19 May 2019
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
arXiv:1905.07830

Papers citing "HellaSwag: Can a Machine Really Finish Your Sentence?"

50 / 2,252 papers shown
Clues Before Answers: Generation-Enhanced Multiple-Choice QA
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Zixian Huang
Ao Wu
Jiaying Zhou
Yu Gu
Yue Zhao
Gong Cheng
148
29
0
30 Apr 2022
Prompt Consistency for Zero-Shot Task Generalization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Chunting Zhou
Junxian He
Xuezhe Ma
Taylor Berg-Kirkpatrick
Graham Neubig
VLM
358
86
0
29 Apr 2022
Learning to Split for Automatic Bias Detection
Yujia Bao
Regina Barzilay
215
21
0
28 Apr 2022
On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Roy Schwartz
Gabriel Stanovsky
188
31
0
27 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
368
949
0
14 Apr 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
901
3,458
0
12 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
International Conference on Machine Learning (ICML), 2022
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
282
215
0
12 Apr 2022
FoundationLayerNorm: Scaling BERT and GPT to 1,000 Layers
Dezhou Shen
AI4CE
62
1
0
09 Apr 2022
Checking HateCheck: a cross-functional analysis of behaviour-aware learning for hate speech detection
Pedro Henrique Luz de Araujo
Benjamin Roth
136
2
0
08 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Journal of Machine Learning Research (JMLR), 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
1.2K
7,418
0
05 Apr 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
792
2,613
0
29 Mar 2022
REx: Data-Free Residual Quantization Error Expansion
Neural Information Processing Systems (NeurIPS), 2022
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
338
9
0
28 Mar 2022
When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation
Findings (Findings), 2022
Ehsan Kamalloo
Mehdi Rezagholizadeh
A. Ghodsi
200
11
0
17 Mar 2022
Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Shuyan Zhou
Li Zhang
Yue Yang
Qing Lyu
Pengcheng Yin
Chris Callison-Burch
Graham Neubig
166
32
0
14 Mar 2022
Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models
Ning Ding
Yujia Qin
Guang Yang
Fu Wei
Zonghan Yang
...
Jianfei Chen
Yang Liu
Jie Tang
Juan Li
Maosong Sun
350
225
0
14 Mar 2022
Efficient Language Modeling with Sparse all-MLP
Ping Yu
Mikel Artetxe
Myle Ott
Sam Shleifer
Hongyu Gong
Ves Stoyanov
Xian Li
MoE
182
15
0
14 Mar 2022
CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Lutfi Kerem Senel
Timo Schick
Hinrich Schütze
ELM
ALM
131
6
0
11 Mar 2022
Training language models to follow instructions with human feedback
Neural Information Processing Systems (NeurIPS), 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
2.1K
17,490
0
04 Mar 2022
A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models
Da Yin
Li Dong
Hao Cheng
Xiaodong Liu
Kai-Wei Chang
Furu Wei
Jianfeng Gao
KELM
198
36
0
17 Feb 2022
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
Neural Information Processing Systems (NeurIPS), 2022
Wei Ping
Ming-Yu Liu
Chaowei Xiao
Peng Xu
M. Patwary
Mohammad Shoeybi
Yue Liu
Anima Anandkumar
Bryan Catanzaro
297
79
0
08 Feb 2022
Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey
AAAI Conference on Artificial Intelligence (AAAI), 2022
Prajjwal Bhargava
Vincent Ng
ReLM
LRM
327
74
0
28 Jan 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
Mohammad Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
427
810
0
28 Jan 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
634
250
0
16 Jan 2022
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
Alon Talmor
Ori Yoran
Ronan Le Bras
Chandrasekhar Bhagavatula
Yoav Goldberg
Yejin Choi
Jonathan Berant
ELM
293
167
0
14 Jan 2022
Efficient Large Scale Language Modeling with Mixtures of Experts
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
463
223
0
20 Dec 2021
Few-shot Learning with Multilingual Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Xi Lin
Todor Mihaylov
Mikel Artetxe
Tianlu Wang
Shuohui Chen
...
Luke Zettlemoyer
Zornitsa Kozareva
Mona T. Diab
Ves Stoyanov
Xian Li
BDL
ELM
LRM
353
354
0
20 Dec 2021
KGR^4: Retrieval, Retrospect, Refine and Rethink for Commonsense Generation
Xin Liu
Dayiheng Liu
Baosong Yang
Haibo Zhang
Junwei Ding
Wenqing Yao
Weihua Luo
Haiying Zhang
Jinsong Su
LRM
97
8
0
15 Dec 2021
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Nan Du
Yanping Huang
Andrew M. Dai
Simon Tong
Dmitry Lepikhin
...
Kun Zhang
Quoc V. Le
Yonghui Wu
Zhiwen Chen
Claire Cui
ALM
MoE
680
1,045
0
13 Dec 2021
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Yichong Xu
Chenguang Zhu
Shuohang Wang
Siqi Sun
Hao Cheng
Xiaodong Liu
Jianfeng Gao
Pengcheng He
Michael Zeng
Xuedong Huang
LRM
464
62
0
06 Dec 2021
MetaQA: Combining Expert Agents for Multi-Skill Question Answering
Haritz Puerto
Gözde Gül Sahin
Iryna Gurevych
LLMAG
454
27
0
03 Dec 2021
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
443
966
0
01 Dec 2021
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
V. Aribandi
Yi Tay
Tal Schuster
J. Rao
H. Zheng
...
Jianmo Ni
Jai Gupta
Kai Hui
Sebastian Ruder
Donald Metzler
MoE
297
230
0
22 Nov 2021
Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair
Jason Phang
Angelica Chen
William Huang
Samuel R. Bowman
AAML
166
14
0
16 Nov 2021
Uncertainty Calibration for Ensemble-Based Debiasing Methods
Ruibin Xiong
Yimeng Chen
Liang Pang
Xueqi Chen
Yanyan Lan
147
23
0
07 Nov 2021
A Systematic Investigation of Commonsense Knowledge in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Xiang Lorraine Li
A. Kuncoro
Jordan Hoffmann
Cyprien de Masson d'Autume
Phil Blunsom
Aida Nematzadeh
LRM
266
74
0
31 Oct 2021
MetaICL: Learning to Learn In Context
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Sewon Min
M. Lewis
Luke Zettlemoyer
Hannaneh Hajishirzi
LRM
672
573
0
29 Oct 2021
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
272
85
0
18 Oct 2021
Coherence boosting: When your pretrained language model is not paying enough attention
Nikolay Malkin
Zhen Wang
Nebojsa Jojic
RALM
209
42
0
15 Oct 2021
Jurassic is (almost) All You Need: Few-Shot Meaning-to-Text Generation for Open-Domain Dialogue
Lena Reed
Cecilia Li
Angela Ramirez
Liren Wu
M. Walker
221
8
0
15 Oct 2021
SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
Tu Vu
Brian Lester
Noah Constant
Rami Al-Rfou
Daniel Cer
VLM
LRM
458
314
0
15 Oct 2021
Can Machines Learn Morality? The Delphi Experiment
Liwei Jiang
Jena D. Hwang
Chandra Bhagavatula
Ronan Le Bras
Jenny T Liang
...
Yulia Tsvetkov
Oren Etzioni
Maarten Sap
Regina A. Rini
Yejin Choi
FaML
333
152
0
14 Oct 2021
Does Vision-and-Language Pretraining Improve Lexical Grounding?
Tian Yun
Chen Sun
Ellie Pavlick
VLM
CoGe
228
36
0
21 Sep 2021
Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
Jason Phang
Haokun Liu
Samuel R. Bowman
242
36
0
17 Sep 2021
Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Prasetya Ajie Utama
N. Moosavi
Victor Sanh
Iryna Gurevych
AAML
217
36
0
09 Sep 2021
CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
Yasumasa Onoe
Michael J.Q. Zhang
Eunsol Choi
Greg Durrett
HILM
252
94
0
03 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
1.6K
4,587
0
03 Sep 2021
An Empirical Exploration in Quality Filtering of Text Data
Leo Gao
126
12
0
02 Sep 2021
Rethinking Why Intermediate-Task Fine-Tuning Works
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Ting-Yun Chang
Chi-Jen Lu
LRM
201
32
0
26 Aug 2021
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
Neural Information Processing Systems (NeurIPS), 2021
Conglong Li
Minjia Zhang
Yuxiong He
326
50
0
13 Aug 2021
Goal-Oriented Script Construction
International Conference on Natural Language Generation (INLG), 2021
Qing Lyu
Li Zhang
Chris Callison-Burch
193
35
0
28 Jul 2021