2010.11982
Cited By
The Turking Test: Can Language Models Understand Instructions?
22 October 2020
Avia Efrat
Omer Levy
ELM
LRM
Papers citing
"The Turking Test: Can Language Models Understand Instructions?"
50 / 80 papers shown
Title
Safer Prompts: Reducing IP Risk in Visual Generative AI
Lena Reissinger
Yuanyuan Li
Anna-Carolina Haensch
Neeraj Sarna
33
0
0
06 May 2025
TAIJI: Textual Anchoring for Immunizing Jailbreak Images in Vision Language Models
Xiangyu Yin
Yi Qi
Jinwei Hu
Zhen Chen
Yi Dong
Xingyu Zhao
Xiaowei Huang
Wenjie Ruan
50
0
0
13 Mar 2025
A Comprehensive Evaluation of Cognitive Biases in LLMs
Simon Malberg
Roman Poletukhin
Carolin M. Schuster
Georg Groh
ELM
40
5
0
20 Oct 2024
End User Authoring of Personalized Content Classifiers: Comparing Example Labeling, Rule Writing, and LLM Prompting
Leijie Wang
Kathryn Yurechko
Pranati Dani
Quan Ze Chen
Amy X. Zhang
50
3
0
05 Sep 2024
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao
Tianshi Li
Weiyan Shi
Yanchen Liu
Diyi Yang
PILM
58
18
0
29 Aug 2024
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
Mihir Parmar
Nisarg Patel
Neeraj Varshney
Mutsumi Nakamura
Man Luo
Santosh Mashetty
Arindam Mitra
Chitta Baral
LRM
ReLM
ELM
38
25
0
23 Apr 2024
CodecLM: Aligning Language Models with Tailored Synthetic Data
Zifeng Wang
Chun-Liang Li
Vincent Perot
Long T. Le
Jin Miao
Zizhao Zhang
Chen-Yu Lee
Tomas Pfister
SyDa
ALM
31
18
0
08 Apr 2024
Language Models for Text Classification: Is In-Context Learning Enough?
A. Edwards
Jose Camacho-Collados
LRM
49
18
0
26 Mar 2024
Automated Data Curation for Robust Language Model Fine-Tuning
Jiuhai Chen
Jonas W. Mueller
ALM
42
20
0
19 Mar 2024
Tur[k]ingBench: A Challenge Benchmark for Web Agents
Kevin Xu
Yeganeh Kordi
Kate Sanders
Yizhong Wang
Adam Byerly
Jingyu Zhang
Benjamin Van Durme
Daniel Khashabi
LLMAG
75
6
0
18 Mar 2024
Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression
Xinze Li
Zhenghao Liu
Chenyan Xiong
Shi Yu
Yukun Yan
Shuo Wang
Ge Yu
VLM
43
4
0
25 Feb 2024
Large Language Models for Data Annotation: A Survey
Zhen Tan
Dawei Li
Song Wang
Alimohammad Beigi
Bohan Jiang
Amrita Bhattacharjee
Mansooreh Karami
Wenlin Yao
Lu Cheng
Huan Liu
SyDa
56
53
0
21 Feb 2024
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
S. Hayati
Taehee Jung
Tristan Bodding-Long
Sudipta Kar
A. Sethy
Joo-Kyung Kim
Dongyeop Kang
ALM
LRM
38
6
0
18 Feb 2024
How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?
Ehsan Doostmohammadi
Oskar Holmstrom
Marco Kuhlmann
40
8
0
16 Feb 2024
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
Hainiu Xu
Runcong Zhao
Lixing Zhu
Bin Liang
Yulan He
84
21
0
08 Feb 2024
WSC+: Enhancing The Winograd Schema Challenge Using Tree-of-Experts
Pardis Sadat Zahraei
Ali Emami
27
6
0
31 Jan 2024
Taxonomy-based CheckList for Large Language Model Evaluation
Damin Zhang
27
0
0
15 Dec 2023
Training-free Zero-shot Composed Image Retrieval with Local Concept Reranking
Shitong Sun
Fanghua Ye
Shaogang Gong
34
13
0
14 Dec 2023
BioInstruct: Instruction Tuning of Large Language Models for Biomedical Natural Language Processing
Hieu Tran
Zhichao Yang
Zonghai Yao
Hong-ye Yu
ALM
LM&MA
40
23
0
30 Oct 2023
Interpreting Answers to Yes-No Questions in User-Generated Content
Shivam Mathur
Keun Hee Park
Dhivya Chinnappa
Saketh Kotamraju
Eduardo Blanco
20
0
0
24 Oct 2023
InstructExcel: A Benchmark for Natural Language Instruction in Excel
Justin Payan
Swaroop Mishra
Mukul Singh
Carina Negreanu
Christian Poelitz
Chitta Baral
Subhro Roy
Rasika Chakravarthy
Benjamin Van Durme
E. Nouri
LMTD
ELM
41
10
0
23 Oct 2023
In-context Pretraining: Language Modeling Beyond Document Boundaries
Weijia Shi
Sewon Min
Maria Lomeli
Chunting Zhou
Margaret Li
...
Victoria Lin
Noah A. Smith
Luke Zettlemoyer
Scott Yih
Mike Lewis
LRM
RALM
SyDa
34
48
0
16 Oct 2023
Welfare Diplomacy: Benchmarking Language Model Cooperation
Gabriel Mukobi
Hannah Erlebach
Niklas Lauffer
Lewis Hammond
Alan Chan
Jesse Clifton
LM&Ro
38
21
0
13 Oct 2023
Data-Centric Financial Large Language Models
Zhixuan Chu
Huaiyu Guo
Xinyuan Zhou
Yijia Wang
Fei Yu
...
Xin Lu
Daixin Wang
Longfei Li
Junqing Zhou
Sheng Li
AIFin
30
7
0
07 Oct 2023
Synthetic Data Generation in Low-Resource Settings via Fine-Tuning of Large Language Models
Jean Kaddour
Qi Liu
SyDa
35
2
0
02 Oct 2023
Model Leeching: An Extraction Attack Targeting LLMs
Lewis Birch
William Hackett
Stefan Trawicki
N. Suri
Peter Garraghan
32
13
0
19 Sep 2023
Fighting Fire with Fire: Can ChatGPT Detect AI-generated Text?
Amrita Bhattacharjee
Huan Liu
DeLMO
30
56
0
02 Aug 2023
Evaluating the Moral Beliefs Encoded in LLMs
Nino Scherrer
Claudia Shi
Amir Feder
David M. Blei
33
117
0
26 Jul 2023
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks
Yanis Labrak
Mickael Rouvier
Richard Dufour
LM&MA
29
25
0
22 Jul 2023
On Conditional and Compositional Language Model Differentiable Prompting
Jonathan Pilault
Can Liu
Joey Tianyi Zhou
Markus Dreyer
30
1
0
04 Jul 2023
Understanding Social Reasoning in Language Models with Language Models
Kanishk Gandhi
Jan-Philipp Fränken
Tobias Gerstenberg
Noah D. Goodman
LRM
39
115
0
21 Jun 2023
GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning
Haiteng Zhao
Shengchao Liu
Chang Ma
Hannan Xu
Jie Fu
Zhihong Deng
Lingpeng Kong
Qi Liu
31
61
0
28 May 2023
Large Language Models Can be Lazy Learners: Analyze Shortcuts in In-Context Learning
Ruixiang Tang
Dehan Kong
Longtao Huang
Hui Xue
37
50
0
26 May 2023
EDM3: Event Detection as Multi-task Text Generation
Ujjwala Anantheswaran
Himanshu Gupta
Mihir Parmar
Kuntal Kumar Pal
Chitta Baral
35
5
0
25 May 2023
Prompting Large Language Models for Counterfactual Generation: An Empirical Study
Yongqi Li
Mayi Xu
Xin Miao
Shen Zhou
T. Qian
ELM
LRM
32
21
0
24 May 2023
ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding
Uri Shaham
Maor Ivgi
Avia Efrat
Jonathan Berant
Omer Levy
VLM
46
127
0
23 May 2023
LIMA: Less Is More for Alignment
Chunting Zhou
Pengfei Liu
Puxin Xu
Srini Iyer
Jiao Sun
...
Susan Zhang
Gargi Ghosh
M. Lewis
Luke Zettlemoyer
Omer Levy
ALM
36
783
0
18 May 2023
Working Memory Capacity of ChatGPT: An Empirical Study
Dongyu Gong
Xingchen Wan
Dingmin Wang
LLMAG
KELM
AI4MH
32
13
0
30 Apr 2023
LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity
Anjana Arunkumar
Shubham Sharma
Rakhi Agrawal
Sriramakrishnan Chandrasekaran
Chris Bryan
34
0
0
12 Apr 2023
Large Language Model Instruction Following: A Survey of Progresses and Challenges
Renze Lou
Kai Zhang
Wenpeng Yin
ALM
LRM
35
20
0
18 Mar 2023
Towards the Scalable Evaluation of Cooperativeness in Language Models
Alan Chan
Maxime Riché
Jesse Clifton
LLMAG
33
6
0
16 Mar 2023
SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model
Gengwei Zhang
Liyuan Wang
Guoliang Kang
Ling-Hao Chen
Yunchao Wei
CLL
34
106
0
09 Mar 2023
Finding Support Examples for In-Context Learning
Xiaonan Li
Xipeng Qiu
27
89
0
27 Feb 2023
InstructABSA: Instruction Learning for Aspect Based Sentiment Analysis
Kevin Scaria
Himanshu Gupta
Siddharth Goyal
Saurabh Arjun Sawant
Swaroop Mishra
Chitta Baral
29
25
0
16 Feb 2023
ScatterShot: Interactive In-context Example Curation for Text Transformation
Tongshuang Wu
Hua Shen
Daniel S. Weld
Jeffrey Heer
Marco Tulio Ribeiro
19
23
0
14 Feb 2023
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
Changan Niu
Chuanyi Li
Vincent Ng
Bin Luo
ELM
ALM
34
9
0
08 Feb 2023
A Comprehensive Survey of Continual Learning: Theory, Method and Application
Liyuan Wang
Xingxing Zhang
Hang Su
Jun Zhu
KELM
CLL
51
611
0
31 Jan 2023
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Or Honovich
Thomas Scialom
Omer Levy
Timo Schick
ALM
48
363
0
19 Dec 2022
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
22
367
0
19 Dec 2022
LaSQuE: Improved Zero-Shot Classification from Explanations Through Quantifier Modeling and Curriculum Learning
Sayan Ghosh
Rakesh R Menon
Shashank Srivastava
30
2
0
18 Dec 2022