
Title |
|---|
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
How Well Do Large Language Models Truly Ground? North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Fine-tuning Language Models for Factuality. International Conference on Learning Representations (ICLR), 2023 |
LLatrieval: LLM-Verified Retrieval for Verifiable Generation. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
FAITHSCORE: Evaluating Hallucinations in Large Vision-Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses. International Conference on Learning Representations (ICLR), 2023 |
Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation. International Conference on Learning Representations (ICLR), 2023 |
Language Models Hallucinate, but May Excel at Fact Verification. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
Large Language Models Help Humans Verify Truthfulness -- Except When They Are Convincingly Wrong. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. International Conference on Learning Representations (ICLR), 2023 |
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models. International Conference on Learning Representations (ICLR), 2023 |
Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Teaching Language Models to Hallucinate Less with Synthetic Tasks. International Conference on Learning Representations (ICLR), 2023 |
Knowledge Crosswords: Geometric Knowledge Reasoning with Large Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
BooookScore: A systematic exploration of book-length summarization in the era of LLMs. International Conference on Learning Representations (ICLR), 2023 |
FELM: Benchmarking Factuality Evaluation of Large Language Models. Neural Information Processing Systems (NeurIPS), 2023 |
STRONG -- Structure Controllable Legal Opinion Summary Generation. International Joint Conference on Natural Language Processing (IJCNLP), 2023 |
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models. International Conference on Learning Representations (ICLR), 2023 |
Ragas: Automated Evaluation of Retrieval Augmented Generation. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023 |
Calibrating LLM-Based Evaluator. International Conference on Language Resources and Evaluation (LREC), 2023 |
LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive Summarisation. International Conference on Language Resources and Evaluation (LREC), 2023 |
Chain-of-Verification Reduces Hallucination in Large Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
ExpertQA: Expert-Curated Questions and Attributed Answers. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 |
Zero-shot Audio Topic Reranking using Large Language Models. Spoken Language Technology Workshop (SLT), 2023 |
Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges. ACM Conference on Health, Inference, and Learning (CHIL), 2023 |
Zero-Resource Hallucination Prevention for Large Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook. International Journal of Computer Vision (IJCV), 2023 |