POTATO: The Portable Text Annotation Tool

16 December 2022
Jiaxin Pei
Aparna Ananthasubramaniam
Xingyao Wang
Naitian Zhou
Jackson Sargent
Apostolos Dedeloudis
David Jurgens
    VLM
arXiv: 2212.08620

Papers citing "POTATO: The Portable Text Annotation Tool"

44 / 44 papers shown
Title
From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising
Jingwen Cai
Sara Leckner
Johanna Björklund
36
0
0
30 Apr 2025
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Bowen Jiang
Zhuoqun Hao
Y. Cho
B. Li
Yuan Yuan
Sihao Chen
Lyle Ungar
Camillo J. Taylor
Dan Roth
37
0
0
19 Apr 2025
Modifying Large Language Model Post-Training for Diverse Creative Writing
John Joon Young Chung
Vishakh Padmakumar
Melissa Roemmele
Yuqian Sun
Max Kreminski
MoMe
46
0
0
21 Mar 2025
Have LLMs Made Active Learning Obsolete? Surveying the NLP Community
Julia Romberg
Christopher Schröder
Julius Gonsior
Katrin Tomanek
Fredrik Olsson
62
0
0
12 Mar 2025
CULEMO: Cultural Lenses on Emotion -- Benchmarking LLMs for Cross-Cultural Emotion Understanding
Tadesse Destaw Belay
Ahmed Haj Ahmed
Alvin Grissom II
Iqra Ameer
Grigori Sidorov
Olga Kolesnikova
Seid Muhie Yimam
41
0
0
12 Mar 2025
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
Hojae Han
Seung-won Hwang
Rajhans Samdani
Yuxiong He
ALM
65
2
0
27 Feb 2025
Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral
Shivani Kumar
David Jurgens
LRM
41
0
0
21 Feb 2025
BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages
Shamsuddeen Hassan Muhammad
N. Ousidhoum
Idris Abdulmumin
Jan Philip Wahle
Terry Ruas
...
Florian Valentin Wunderlich
Hanif Muhammad Zhafran
Tianhui Zhang
Yi Zhou
Saif M. Mohammad
33
3
0
17 Feb 2025
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks
Jing Yang
Max Glockner
Anderson de Rezende Rocha
Iryna Gurevych
LRM
62
1
0
07 Feb 2025
Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols
John Joon Young Chung
Melissa Roemmele
Max Kreminski
VGen
67
0
0
23 Jan 2025
A Reality Check on Context Utilisation for Retrieval-Augmented Generation
Lovisa Hagström
Sara Vera Marjanović
Haeun Yu
Arnav Arora
Christina Lioma
Maria Maistro
Pepa Atanasova
Isabelle Augenstein
70
0
0
22 Dec 2024
Mitigating Trauma in Qualitative Research Infrastructure: Roles for Machine Assistance and Trauma-Informed Design
Emily Tseng
Thomas Ristenpart
Nicola Dell
72
1
0
22 Dec 2024
Fearful Falcons and Angry Llamas: Emotion Category Annotations of Arguments by Humans and LLMs
Lynn Greschner
Roman Klinger
83
2
0
20 Dec 2024
MetaphorShare: A Dynamic Collaborative Repository of Open Metaphor Datasets
Joanne Boisson
Arif Mehmood
Jose Camacho-Collados
66
0
0
27 Nov 2024
ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model
Lifan Jiang
Zhihui Wang
Siqi Yin
Guangxiao Ma
Peng Zhang
Boxi Wu
DiffM
51
0
0
28 Aug 2024
The Language of Trauma: Modeling Traumatic Event Descriptions Across Domains with Explainable AI
Miriam Schirmer
Tobias Leemann
Gjergji Kasneci
Jürgen Pfeffer
David Jurgens
80
0
0
12 Aug 2024
BotEval: Facilitating Interactive Human Evaluation
Hyundong Justin Cho
Thamme Gowda
Yuyang Huang
Zixun Lu
Tianli Tong
Jonathan May
ALM
37
1
0
25 Jul 2024
Why does in-context learning fail sometimes? Evaluating in-context learning on open and closed questions
Xiang Li
Haoran Tang
Siyu Chen
Ziwei Wang
Ryan Chen
Marcin Abram
LRM
29
1
0
02 Jul 2024
AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries
Irina Saparina
Mirella Lapata
30
10
0
27 Jun 2024
MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows
Xingjian Zhang
Yutong Xie
Jin Huang
Jinge Ma
Zhaoying Pan
...
Ziyang Xiong
Tolga Ergen
Dongsub Shim
Honglak Lee
Qiaozhu Mei
41
10
0
10 Jun 2024
Beyond Accuracy: Investigating Error Types in GPT-4 Responses to USMLE Questions
Soumyadeep Roy
A. Khatua
Fatemeh Ghoochani
Uwe Hadler
Wolfgang Nejdl
Niloy Ganguly
ELM
LM&MA
33
8
0
20 Apr 2024
Cross-cultural Inspiration Detection and Analysis in Real and LLM-generated Social Media Data
Oana Ignat
Gayathri Ganesh Lakshmy
Rada Mihalcea
DeLMO
19
1
0
19 Apr 2024
SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes
Timothee Mickus
Elaine Zosa
Raúl Vázquez
Teemu Vahtola
Jörg Tiedemann
Vincent Segonne
Alessandro Raganato
Marianna Apidianaki
HILM
LRM
21
20
0
12 Mar 2024
Understanding Fine-grained Distortions in Reports of Scientific Findings
Amelie Wuhrl
Dustin Wright
Roman Klinger
Isabelle Augenstein
25
3
0
19 Feb 2024
EEVEE: An Easy Annotation Tool for Natural Language Processing
Axel Sorensen
Siyao Peng
Barbara Plank
Rob van der Goot
18
1
0
05 Feb 2024
The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change
Dominik Schlechtweg
S. Virk
Pauline Sander
Emma Sköldberg
Lukas Theuer Linke
Tuo Zhang
Nina Tahmasebi
Jonas Kuhn
Sabine Schulte im Walde
15
10
0
21 Nov 2023
Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks
Andrea Sottana
Bin Liang
Kai Zou
Zheng Yuan
ALM
ELM
LM&MA
25
54
0
20 Oct 2023
Unsupervised Candidate Answer Extraction through Differentiable Masker-Reconstructor Model
Zhuoer Wang
Yicheng Wang
Ziwei Zhu
James Caverlee
21
0
0
19 Oct 2023
An Emulator for Fine-Tuning Large Language Models using Small Language Models
Eric Mitchell
Rafael Rafailov
Archit Sharma
Chelsea Finn
Christopher D. Manning
ALM
27
51
0
19 Oct 2023
Human Feedback is not Gold Standard
Tom Hosking
Phil Blunsom
Max Bartolo
ALM
14
48
0
28 Sep 2023
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang
Zihan Wang
Jiateng Liu
Yangyi Chen
Lifan Yuan
Hao Peng
Heng Ji
LRM
125
138
0
19 Sep 2023
On the Challenges of Building Datasets for Hate Speech Detection
Vitthal Bhandari
6
1
0
06 Sep 2023
Thresh: A Unified, Customizable and Deployable Platform for Fine-Grained Text Evaluation
David Heineman
Yao Dou
Wei-ping Xu
22
7
0
14 Aug 2023
Your spouse needs professional help: Determining the Contextual Appropriateness of Messages through Modeling Social Relationships
David Jurgens
Agrima Seth
Jack E. Sargent
Athena Aghighi
Michael Geraci
9
7
0
06 Jul 2023
When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset
Jiaxin Pei
David Jurgens
12
31
0
12 Jun 2023
Chinese Open Instruction Generalist: A Preliminary Release
Ge Zhang
Yemin Shi
Ruibo Liu
Ruibin Yuan
Yizhi Li
...
Zhaoqun Li
Zekun Wang
Chenghua Lin
Wen-Fen Huang
Jie Fu
ALM
17
28
0
17 Apr 2023
DMOps: Data Management Operation and Recipes
E. Choi
Chanjun Park
17
7
0
02 Jan 2023
Modeling Information Change in Science Communication with Semantically Matched Paraphrases
Dustin Wright
Jiaxin Pei
David Jurgens
Isabelle Augenstein
21
14
0
24 Oct 2022
SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis
Jiaxin Pei
Vítor Silva
Maarten W. Bos
Yozon Liu
Leonardo Neves
David Jurgens
Francesco Barbieri
42
28
0
03 Oct 2022
Measuring Sentence-Level and Aspect-Level (Un)certainty in Science Communications
Jiaxin Pei
David Jurgens
23
28
0
30 Sep 2021
An animated picture says at least a thousand words: Selecting Gif-based Replies in Multimodal Dialog
Xingyao Wang
David Jurgens
17
5
0
24 Sep 2021
Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing
Boaz Shmueli
Jan Fell
Soumya Ray
Lun-Wei Ku
100
86
0
20 Apr 2021
Conversations Gone Alright: Quantifying and Predicting Prosocial Outcomes in Online Conversations
Jiajun Bao
J. Wu
Yiming Zhang
Eshwar Chandrasekharan
David Jurgens
38
45
0
16 Feb 2021
A Survey on Bias and Fairness in Machine Learning
Ninareh Mehrabi
Fred Morstatter
N. Saxena
Kristina Lerman
Aram Galstyan
SyDa
FaML
294
4,187
0
23 Aug 2019