SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation | International Conference on Learning Representations (ICLR), 2023 |
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity | International Joint Conference on Natural Language Processing (IJCNLP), 2023 |
Discovering Latent Knowledge in Language Models Without Supervision | International Conference on Learning Representations (ICLR), 2022 |
RARR: Researching and Revising What Language Models Say, Using Language Models | Annual Meeting of the Association for Computational Linguistics (ACL), 2022 |
Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation | Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022 |
RED-ACE: Robust Error Detection for ASR using Confidence Embeddings | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 |
Locating and Editing Factual Associations in GPT | Neural Information Processing Systems (NeurIPS), 2022 |
Survey of Hallucination in Natural Language Generation | ACM Computing Surveys (ACM CSUR), 2022 |
Learning Compact Metrics for MT | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021 |
TruthfulQA: Measuring How Models Mimic Human Falsehoods | Annual Meeting of the Association for Computational Linguistics (ACL), 2021 |
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation | Annual Meeting of the Association for Computational Linguistics (ACL), 2021 |
Q²: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021 |
QuestEval: Summarization Asks for Fact-based Evaluation | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021 |
Probing Classifiers: Promises, Shortcomings, and Advances | Computational Linguistics (CL), 2021 |
KoBE: Knowledge-Based Machine Translation Evaluation | Findings of the Association for Computational Linguistics: EMNLP (Findings), 2020 |
COMET: A Neural Framework for MT Evaluation | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020 |
BLEURT: Learning Robust Metrics for Text Generation | Annual Meeting of the Association for Computational Linguistics (ACL), 2020 |
Evaluating the Factual Consistency of Abstractive Text Summarization | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019 |
On Identifiability in Transformers | International Conference on Learning Representations (ICLR), 2019 |
ERNIE: Enhanced Language Representation with Informative Entities | Annual Meeting of the Association for Computational Linguistics (ACL), 2019 |
Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018 |
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018 |
Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods | North American Chapter of the Association for Computational Linguistics (NAACL), 2018 |