2310.08433
A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing
12 October 2023
Carlos Gómez-Rodríguez
Paul Williams
Papers citing
"A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing"
12 / 12 papers shown
Title
WritingBench: A Comprehensive Benchmark for Generative Writing
Yuning Wu
Jiahao Mei
M. Yan
Chenliang Li
Shaopeng Lai
...
Zijia Wang
J. Zhang
Mengyue Wu
Qin Jin
Fei Huang
61
1
0
07 Mar 2025
Preference Optimization with Multi-Sample Comparisons
Chaoqi Wang
Zhuokai Zhao
Chen Zhu
Karthik Abinav Sankararaman
Michal Valko
...
Zhaorun Chen
Madian Khabsa
Yuxin Chen
Hao Ma
Sinong Wang
43
10
0
16 Oct 2024
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng YU
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
26
4
0
07 Oct 2024
Benchmarking Language Model Creativity: A Case Study on Code Generation
Yining Lu
Dixuan Wang
Tianjian Li
Dongwei Jiang
Meng Jiang
Daniel Khashabi
LRM
49
10
0
12 Jul 2024
Pron vs Prompt: Can Large Language Models already Challenge a World-Class Fiction Author at Creative Text Writing?
Guillermo Marco
Julio Gonzalo
Ramón del Castillo
M. Girona
19
10
0
01 Jul 2024
The Unlikely Duel: Evaluating Creative Writing in LLMs through a Unique Scenario
Carlos Gómez-Rodríguez
Paul Williams
17
1
0
22 Jun 2024
Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks
Zackary Dunivin
12
16
0
26 Jan 2024
Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals
Piotr Wojciech Mirowski
Kory W. Mathewson
Jaylen Pittman
Richard Evans
HAI
53
247
0
29 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation
Marzena Karpinska
Nader Akoury
Mohit Iyyer
198
106
0
14 Sep 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020