ResearchTrend.AI
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
    HILM
    ALM

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 454 papers shown
Measuring and Reducing LLM Hallucination without Gold-Standard Answers
Jiaheng Wei
Yuanshun Yao
Jean-François Ton
Hongyi Guo
Andrew Estornell
Yang Liu
HILM
50
18
0
16 Feb 2024
Language Models with Conformal Factuality Guarantees
Christopher Mohri
Tatsunori Hashimoto
HILM
39
33
0
15 Feb 2024
Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States
Hanyu Duan
Yi Yang
K. Tam
HILM
19
27
0
15 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Helen Meng
HILM
28
43
0
14 Feb 2024
Into the Unknown: Self-Learning Large Language Models
Teddy Ferdinan
Jan Kocoń
P. Kazienko
20
2
0
14 Feb 2024
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment
Jianing Wang
Junda Wu
Yupeng Hou
Yao Liu
Ming Gao
Julian McAuley
20
32
0
13 Feb 2024
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
Tobias Schimanski
Jingwei Ni
Mathias Kraus
Elliott Ash
Markus Leippold
21
4
0
13 Feb 2024
Is it safe to cross? Interpretable Risk Assessment with GPT-4V for Safety-Aware Street Crossing
Hochul Hwang
Sunjae Kwon
Yekyung Kim
Donghyun Kim
32
11
0
09 Feb 2024
Calibrating Long-form Generations from Large Language Models
Yukun Huang
Yixin Liu
Raghuveer Thirukovalluru
Arman Cohan
Bhuwan Dhingra
16
6
0
09 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomáš Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM
LM&MA
ELM
120
353
0
09 Feb 2024
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations
Cheng-Han Chiang
Hung-yi Lee
HILM
62
8
0
08 Feb 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
Chengyu Huang
Zeqiu Wu
Yushi Hu
Wenya Wang
HILM
LRM
75
25
0
06 Feb 2024
Factuality of Large Language Models in the Year 2024
Yuxia Wang
Minghan Wang
Muhammad Arslan Manzoor
Fei Liu
Georgi Georgiev
Rocktim Jyoti Das
Preslav Nakov
LRM
HILM
30
7
0
04 Feb 2024
How well do LLMs cite relevant medical references? An evaluation framework and analyses
Kevin Wu
Eric Wu
Ally Cassasola
Angela Zhang
Kevin Wei
Teresa Nguyen
Sith Riantawan
Patricia Shi Riantawan
Daniel E. Ho
James Y. Zou
LM&MA
ELM
AI4MH
10
23
0
03 Feb 2024
Rethinking the Role of Proxy Rewards in Language Model Alignment
Sungdong Kim
Minjoon Seo
SyDa
ALM
23
0
0
02 Feb 2024
A Survey on Hallucination in Large Vision-Language Models
Hanchao Liu
Wenyuan Xue
Yifei Chen
Dapeng Chen
Xiutian Zhao
Ke Wang
Liping Hou
Rong-Zhi Li
Wei Peng
LRM
MLLM
14
112
0
01 Feb 2024
Corrective Retrieval Augmented Generation
Shi-Qi Yan
Jia-Chen Gu
Yun Zhu
Zhen-Hua Ling
RALM
100
71
0
29 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
30
12
0
26 Jan 2024
K-QA: A Real-World Medical Q&A Benchmark
Itay Manes
Naama Ronn
David Cohen
Ran Ilan Ber
Zehavi Horowitz-Kugler
Gabriel Stanovsky
LM&MA
HILM
AI4MH
20
11
0
25 Jan 2024
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
Haritz Puerto
Martin Tutek
Somak Aditya
Xiaodan Zhu
Iryna Gurevych
ReCod
ReLM
LRM
43
9
0
18 Jan 2024
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
M. A. D. L. Balaguer
Vinamra Benara
Renato Luiz de Freitas Cunha
Roberto de M. Estevao Filho
Todd Hendry
...
Morris Sharp
B. Silva
Swati Sharma
Vijay Aski
Ranveer Chandra
FaML
20
79
0
16 Jan 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA
ELM
31
9
0
13 Jan 2024
Fine-grained Hallucination Detection and Editing for Language Models
Abhika Mishra
Akari Asai
Vidhisha Balachandran
Yizhong Wang
Graham Neubig
Yulia Tsvetkov
Hannaneh Hajishirzi
HILM
24
78
0
12 Jan 2024
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction
Siyu Yuan
Kaitao Song
Jiangjie Chen
Xu Tan
Yongliang Shen
Ren Kan
Dongsheng Li
Deqing Yang
LLMAG
15
53
0
11 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
55
56
0
11 Jan 2024
LightHouse: A Survey of AGI Hallucination
Feng Wang
LRM
HILM
VLM
24
3
0
08 Jan 2024
Large Language Models for Social Networks: Applications, Challenges, and Solutions
Jingying Zeng
Richard Huang
Waleed Malik
Langxuan Yin
Bojan Babic
Danny Shacham
Xiao Yan
Jaewon Yang
Qi He
22
3
0
04 Jan 2024
Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Matthew Dahl
Varun Magesh
Mirac Suzgun
Daniel E. Ho
HILM
AILaw
25
72
0
02 Jan 2024
Do Androids Know They're Only Dreaming of Electric Sheep?
Sky CH-Wang
Benjamin Van Durme
Jason Eisner
Chris Kedzie
HILM
22
27
0
28 Dec 2023
Alleviating Hallucinations of Large Language Models through Induced Hallucinations
Yue Zhang
Leyang Cui
Wei Bi
Shuming Shi
HILM
34
49
0
25 Dec 2023
DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines
Arnav Singhvi
Manish Shetty
Shangyin Tan
Christopher Potts
Koushik Sen
Matei A. Zaharia
Omar Khattab
15
17
0
20 Dec 2023
Towards Verifiable Text Generation with Evolving Memory and Self-Reflection
Hao-Lun Sun
Hengyi Cai
Bo Wang
Yingyan Hou
Xiaochi Wei
Shuaiqiang Wang
Yan Zhang
Dawei Yin
37
8
0
14 Dec 2023
Evaluating Large Language Models for Health-related Queries with Presuppositions
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILM
ELM
25
2
0
14 Dec 2023
Alignment for Honesty
Yuqing Yang
Ethan Chern
Xipeng Qiu
Graham Neubig
Pengfei Liu
31
28
0
12 Dec 2023
Dense X Retrieval: What Retrieval Granularity Should We Use?
Tong Chen
Hongwei Wang
Sihao Chen
Wenhao Yu
Kaixin Ma
Xinran Zhao
Hongming Zhang
Dong Yu
15
29
0
11 Dec 2023
User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan
Meng-Long Jiang
16
8
0
11 Dec 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Shafiq R. Joty
ELM
CLL
AI4MH
LRM
ALM
77
27
0
28 Nov 2023
RELIC: Investigating Large Language Model Responses using Self-Consistency
Furui Cheng
Vilém Zouhar
Simran Arora
Mrinmaya Sachan
Hendrik Strobelt
Mennatallah El-Assady
HILM
19
18
0
28 Nov 2023
A Survey of the Evolution of Language Model-Based Dialogue Systems
Hongru Wang
Lingzhi Wang
Yiming Du
Liang Chen
Jing Zhou
Yufei Wang
Kam-Fai Wong
LRM
49
20
0
28 Nov 2023
Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination
Haoqiang Kang
Xiao-Yang Liu
RALM
25
26
0
27 Nov 2023
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
Xun Liang
Shichao Song
Simin Niu
Zhiyu Li
Feiyu Xiong
...
Zhaohui Wy
Dawei He
Peng Cheng
Zhonghao Wang
Haiying Deng
HILM
27
18
0
26 Nov 2023
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Tianhang Zhang
Lin Qiu
Qipeng Guo
Cheng Deng
Yue Zhang
Zheng-Wei Zhang
Cheng Zhou
Xinbing Wang
Luoyi Fu
HILM
75
47
0
22 Nov 2023
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?
Bangzheng Li
Ben Zhou
Fei Wang
Xingyu Fu
Dan Roth
Muhao Chen
HILM
LRM
16
15
0
16 Nov 2023
DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation
Yiqing Xie
Sheng Zhang
Hao Cheng
Pengfei Liu
Zelalem Gero
Cliff Wong
Tristan Naumann
Hoifung Poon
Carolyn Rose
MedIm
11
4
0
16 Nov 2023
Effective Large Language Model Adaptation for Improved Grounding and Citation Generation
Xi Ye
Ruoxi Sun
Sercan Ö. Arik
Tomas Pfister
HILM
26
25
0
16 Nov 2023
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
21
6
0
16 Nov 2023
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
Jon Saad-Falcon
Omar Khattab
Christopher Potts
Matei A. Zaharia
RALM
8
103
0
16 Nov 2023
Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification
Haoqiang Kang
Juntong Ni
Huaxiu Yao
HILM
LRM
19
33
0
15 Nov 2023
How Well Do Large Language Models Truly Ground?
Hyunji Lee
Se June Joo
Chaeeun Kim
Joel Jang
Doyoung Kim
Kyoung-Woon On
Minjoon Seo
HILM
21
6
0
15 Nov 2023
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers
Yuxia Wang
Revanth Gangi Reddy
Zain Muhammad Mujahid
Arnav Arora
Aleksandr Rubashevskii
...
Nadav Borenstein
Aditya Pillai
Isabelle Augenstein
Iryna Gurevych
Preslav Nakov
HILM
11
33
0
15 Nov 2023