ResearchTrend.AI
ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks


27 March 2023
Fabrizio Gilardi
Meysam Alizadeh
M. Kubli
    AI4MH

Papers citing "ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks"

50 / 489 papers shown
STAYKATE: Hybrid In-Context Example Selection Combining Representativeness Sampling and Retrieval-based Approach -- A Case Study on Science Domains
Chencheng Zhu
Kazutaka Shimada
Tomoki Taniguchi
Tomoko Ohkuma
33
0
0
31 Dec 2024
Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents
Weiwei Sun
Lingyong Yan
Xinyu Ma
Shuaiqiang Wang
Pengjie Ren
Zhumin Chen
Dawei Yin
Z. Ren
RALM
ALM
ELM
LRM
LM&MA
74
284
0
31 Dec 2024
Just What You Desire: Constrained Timeline Summarization with Self-Reflection for Enhanced Relevance
Muhammad Reza Qorib
Qisheng Hu
Hwee Tou Ng
19
0
0
23 Dec 2024
Fearful Falcons and Angry Llamas: Emotion Category Annotations of Arguments by Humans and LLMs
Lynn Greschner
Roman Klinger
83
2
0
20 Dec 2024
Measuring, Modeling, and Helping People Account for Privacy Risks in Online Self-Disclosures with AI
Isadora Krsek
Anubha Kabra
Yao Dou
Tarek Naous
Laura A. Dabbish
Alan Ritter
Wei-ping Xu
Sauvik Das
68
1
0
19 Dec 2024
Empowering LLMs to Understand and Generate Complex Vector Graphics
Ximing Xing
Juncheng Hu
Guotao Liang
Jing Zhang
Dong Xu
Qian Yu
92
7
0
15 Dec 2024
LLMs-in-the-Loop Part 2: Expert Small AI Models for Anonymization and De-identification of PHI Across Multiple Languages
Murat Gunay
Bunyamin Keles
Raife Hizlan
67
0
0
14 Dec 2024
A Scoping Review of ChatGPT Research in Accounting and Finance
Mengming Michael Dong
Theophanis C. Stratopoulos
Victor Xiaoqi Wang
69
15
0
07 Dec 2024
TextClass Benchmark: A Continuous Elo Rating of LLMs in Social Sciences
Bastián González-Bustamante
VLM
64
0
0
30 Nov 2024
Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems
Rafael Teixeira de Lima
Shubham Gupta
Cesar Berrospi
Lokesh Mishra
Michele Dolfi
Peter W. J. Staar
Panagiotis Vagenas
63
1
0
29 Nov 2024
Advancing Large Language Models for Spatiotemporal and Semantic Association Mining of Similar Environmental Events
Yuanyuan Tian
Wenwen Li
Lei Hu
X. Chen
Michael Brook
Michael Brubaker
Fan Zhang
A. Liljedahl
KELM
79
1
0
19 Nov 2024
Large corpora and large language models: a replicable method for automating grammatical annotation
Cameron Morin
Matti Marttinen Larsson
38
0
0
18 Nov 2024
The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection
Tomas Horych
Christoph Mandl
Terry Ruas
André Greiner-Petter
Bela Gipp
Akiko Aizawa
Timo Spinde
96
4
0
17 Nov 2024
A Large-Scale Study of Relevance Assessments with Large Language Models: An Initial Look
Shivani Upadhyay
Ronak Pradeep
Nandan Thakur
Daniel Fernando Campos
Nick Craswell
I. Soboroff
Hoa Trang Dang
Jimmy J. Lin
29
16
0
13 Nov 2024
One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
Sonia K. Murthy
Tomer Ullman
Jennifer Hu
ALM
41
10
0
07 Nov 2024
Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale
Flavio Di Palo
Prateek Singhi
Bilal Fadlallah
21
3
0
07 Nov 2024
Harmful YouTube Video Detection: A Taxonomy of Online Harm and MLLMs as Alternative Annotators
Claire Wonjeong Jo
Miki Wesołowska
Magdalena Wojcieszak
18
4
0
06 Nov 2024
Evaluating Moral Beliefs across LLMs through a Pluralistic Framework
Xuelin Liu
Yanfei Zhu
Shucheng Zhu
Pengyuan Liu
Ying Liu
Dong Yu
23
1
0
06 Nov 2024
A Multi-Task Role-Playing Agent Capable of Imitating Character Linguistic Styles
Siyuan Chen
Q. Si
Chenxu Yang
Yunzhi Liang
Zheng-Shen Lin
Huan Liu
Weiping Wang
40
1
0
04 Nov 2024
Evaluating Creative Short Story Generation in Humans and Large Language Models
Mete Ismayilzada
Claire Stevenson
Lonneke van der Plas
LM&MA
LRM
30
3
0
04 Nov 2024
A Deep Dive Into Large Language Model Code Generation Mistakes: What and Why?
QiHong Chen
Jiawei Li
Jiecheng Deng
Jiachen Yu
Justin Tian Jin Chen
Iftekhar Ahmed
48
0
0
03 Nov 2024
Auditing Google's Search Algorithm: Measuring News Diversity Across Brazil, the UK, and the US
Raphael Hernandes
Giulio Corsi
MLAU
31
0
0
31 Oct 2024
Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning
John Wu
David Wu
Jimeng Sun
27
1
0
31 Oct 2024
LongReward: Improving Long-context Large Language Models with AI Feedback
J. Zhang
Zhongni Hou
Xin Lv
S. Cao
Zhenyu Hou
Yilin Niu
Lei Hou
Yuxiao Dong
Ling Feng
Juanzi Li
OffRL
LRM
33
7
0
28 Oct 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
36
2
0
28 Oct 2024
PRISM: A Methodology for Auditing Biases in Large Language Models
Leif Azzopardi
Yashar Moshfeghi
16
0
0
24 Oct 2024
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
Omer Nahum
Nitay Calderon
Orgad Keller
Idan Szpektor
Roi Reichart
23
1
0
24 Oct 2024
Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data
Anup Shirgaonkar
Nikhil Pandey
Nazmiye Ceren Abay
Tolga Aktas
Vijay Aski
ALM
SyDa
24
0
0
24 Oct 2024
Human-LLM Hybrid Text Answer Aggregation for Crowd Annotations
Jiyi Li
27
1
0
22 Oct 2024
PAPILLON: Privacy Preservation from Internet-based and Local Language Model Ensembles
Li Siyan
Vethavikashini Chithrra Raghuram
Omar Khattab
Julia Hirschberg
Zhou Yu
21
7
0
22 Oct 2024
Reducing Hallucinations in Vision-Language Models via Latent Space Steering
Sheng Liu
Haotian Ye
Lei Xing
James Zou
VLM
LLMSV
45
5
0
21 Oct 2024
XForecast: Evaluating Natural Language Explanations for Time Series Forecasting
Taha Aksu
Chenghao Liu
Amrita Saha
Sarah Tan
Caiming Xiong
Doyen Sahoo
AI4TS
16
1
0
18 Oct 2024
De-mark: Watermark Removal in Large Language Models
Ruibo Chen
Yihan Wu
Junfeng Guo
Heng Huang
WaLM
VLM
27
0
0
17 Oct 2024
Towards Hybrid Intelligence in Journalism: Findings and Lessons Learnt from a Collaborative Analysis of Greek Political Rhetoric by ChatGPT and Humans
Thanasis Troboukis
Kelly Kiki
Antonis Galanopoulos
Pavlos Sermpezis
Stelios Karamanidis
Ilias Dimitriadis
Athena Vakali
16
0
0
17 Oct 2024
Limits to scalable evaluation at the frontier: LLM as Judge won't beat twice the data
Florian E. Dorner
Vivian Y. Nastl
Moritz Hardt
ELM
ALM
35
5
0
17 Oct 2024
MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback
Zonghai Yao
Aditya Parashar
Huixue Zhou
Won Seok Jang
Feiyun Ouyang
Zhichao Yang
Hong-ye Yu
ELM
42
2
0
17 Oct 2024
Uncovering the Internet's Hidden Values: An Empirical Study of Desirable Behavior Using Highly-Upvoted Content on Reddit
Agam Goyal
Charlotte Lambert
Eshwar Chandrasekharan
28
2
0
16 Oct 2024
Learning to Predict Usage Options of Product Reviews with LLM-Generated Labels
Leo Kohlenberg
Leonard Horns
Frederic Sadrieh
Nils Kiele
Matthis Clausen
Konstantin Ketterer
Avetis Navasardyan
Tamara Czinczoll
Gerard de Melo
Ralf Herbrich
21
0
0
16 Oct 2024
REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models
Ambuje Gupta
Mrinal Rawat
Andreas Stolcke
Roberto Pieraccini
RALM
14
1
0
16 Oct 2024
Personas with Attitudes: Controlling LLMs for Diverse Data Annotation
Leon Fröhling
Gianluca Demartini
Dennis Assenmacher
24
2
0
15 Oct 2024
Human-LLM Collaborative Construction of a Cantonese Emotion Lexicon
Yusong Zhang
Dong Dong
Chi-tim Hung
Leonard Heyerdahl
Tamara Giles-Vernick
Eng-kiong Yeoh
16
0
0
15 Oct 2024
EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs
Yijie Li
Yuan Sun
ELM
26
0
0
13 Oct 2024
Single Ground Truth Is Not Enough: Adding Flexibility to Aspect-Based Sentiment Analysis Evaluation
S. Yang
Hojun Cho
Jiyoung Lee
Sohee Yoon
E. Choi
Jaegul Choo
Won Ik Cho
19
0
0
13 Oct 2024
JurEE not Judges: safeguarding llm interactions with small, specialised Encoder Ensembles
Dom Nasrabadi
24
1
0
11 Oct 2024
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng YU
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
26
4
0
07 Oct 2024
Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness Evaluation
Tunazzina Islam
Dan Goldwasser
36
1
0
07 Oct 2024
CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints
Anirudh Atmakuru
Jatin Nainani
Rohith Siddhartha Reddy Bheemreddy
Anirudh Lakkaraju
Zonghai Yao
Hamed Zamani
Haw-Shiuan Chang
60
2
0
05 Oct 2024
Misinformation with Legal Consequences (MisLC): A New Task Towards Harnessing Societal Harm of Misinformation
Chu Fei Luo
Radin Shayanfar
R. Bhambhoria
Samuel Dahan
Xiaodan Zhu
AILaw
21
0
0
04 Oct 2024
Are Expert-Level Language Models Expert-Level Annotators?
Yu-Min Tseng
Wei-Lin Chen
Chung-Chi Chen
Hsin-Hsi Chen
ALM
34
0
0
04 Oct 2024
On Unsupervised Prompt Learning for Classification with Black-box Language Models
Zhen-Yu Zhang
Jiandong Zhang
Huaxiu Yao
Gang Niu
Masashi Sugiyama
21
2
0
04 Oct 2024