ResearchTrend.AI
BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation

14 October 2022
Tianxiang Sun
Junliang He
Xipeng Qiu
Xuanjing Huang

Papers citing "BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation"

32 / 32 papers shown
Reasoning Towards Fairness: Mitigating Bias in Language Models through Reasoning-Guided Fine-Tuning
Sanchit Kabra
Akshita Jha
Chandan K. Reddy
LRM
21
0
0
08 Apr 2025
DAFE: LLM-Based Evaluation Through Dynamic Arbitration for Free-Form Question-Answering
Sher Badshah
Hassan Sajjad
60
1
0
11 Mar 2025
BatchGEMBA: Token-Efficient Machine Translation Evaluation with Batched Prompting and Prompt Compression
Daniil Larionov
Steffen Eger
VLM
MQ
74
0
0
04 Mar 2025
Reference-Guided Verdict: LLMs-as-Judges in Automatic Evaluation of Free-Form Text
Sher Badshah
Hassan Sajjad
ELM
36
9
0
17 Aug 2024
A Comparative Study of Quality Evaluation Methods for Text Summarization
Huyen Nguyen
Haihua Chen
Lavanya Pobbathi
Junhua Ding
ELM
24
5
0
30 Jun 2024
Measuring Retrieval Complexity in Question Answering Systems
Matteo Gabburo
Nicolaas Paul Jedema
Siddhant Garg
Leonardo F. R. Ribeiro
Alessandro Moschitti
21
0
0
05 Jun 2024
Expert-Guided Extinction of Toxic Tokens for Debiased Generation
Xueyao Sun
Kaize Shi
Haoran Tang
Guandong Xu
Qing Li
MU
35
1
0
29 May 2024
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
Yuemei Xu
Ling Hu
Jiayi Zhao
Zihan Qiu
Yuqi Ye
Hanwen Gu
LRM
19
36
0
01 Apr 2024
Fairness in Large Language Models: A Taxonomic Survey
Zhibo Chu
Zichong Wang
Wenbin Zhang
AILaw
33
31
0
31 Mar 2024
Measuring Political Bias in Large Language Models: What Is Said and How It Is Said
Yejin Bang
Delong Chen
Nayeon Lee
Pascale Fung
21
25
0
27 Mar 2024
Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction
Ziyang Xu
Keqin Peng
Liang Ding
Dacheng Tao
Xiliang Lu
32
9
0
15 Mar 2024
Fine-Tuned Machine Translation Metrics Struggle in Unseen Domains
Vilém Zouhar
Shuoyang Ding
Anna Currey
Tatyana Badeka
Jenyuan Wang
Brian Thompson
25
14
0
28 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM
LM&MA
53
28
0
02 Feb 2024
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation
Haoyi Qiu
Kung-Hsiang Huang
Jingnong Qu
Nanyun Peng
HILM
14
6
0
16 Nov 2023
ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models
Jierui Li
Vipul Raheja
Dhruv Kumar
SyDa
13
3
0
15 Nov 2023
Defining a New NLP Playground
Sha Li
Chi Han
Pengfei Yu
Carl N. Edwards
Manling Li
...
Yi Ren Fung
Charles Yu
Joel R. Tetreault
Eduard H. Hovy
Heng Ji
31
5
0
31 Oct 2023
Reference Free Domain Adaptation for Translation of Noisy Questions with Question Specific Rewards
Baban Gain
Ramakrishna Appicharla
Soumya Chennabasavaraj
Nikesh Garera
Asif Ekbal
M. Chelliah
14
0
0
23 Oct 2023
That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?
Jaechan Lee
Alisa Liu
Orevaoghene Ahia
Hila Gonen
Noah A. Smith
13
3
0
23 Oct 2023
Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
Qiushi Sun
Zhangyue Yin
Xiang Li
Zhiyong Wu
Xipeng Qiu
Lingpeng Kong
LRM
LLMAG
15
43
0
30 Sep 2023
Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models
M. Kamruzzaman
M. M. I. Shovon
Gene Louis Kim
38
12
0
16 Sep 2023
An Appraisal-Based Chain-Of-Emotion Architecture for Affective Language Model Game Agents
Maximilian Croissant
Madeleine Frister
Guy Schofield
Cade McCall
LLMAG
21
14
0
10 Sep 2023
BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training
Yiming Yan
Tao Wang
Chengqi Zhao
Shujian Huang
Jiajun Chen
Mingxuan Wang
14
22
0
06 Jul 2023
Towards Explainable Evaluation Metrics for Machine Translation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei-Ye Zhao
Yang Gao
Steffen Eger
ELM
12
11
0
22 Jun 2023
Overview of Robust and Multilingual Automatic Evaluation Metrics for Open-Domain Dialogue Systems at DSTC 11 Track 4
Mario Rodríguez-Cantelar
Chen Zhang
Chengguang Tang
Ke Shi
Sarik Ghazarian
João Sedoc
L. F. D’Haro
Alexander I. Rudnicky
22
8
0
22 Jun 2023
Dior-CVAE: Pre-trained Language Models and Diffusion Priors for Variational Dialog Generation
Tianyu Yang
Thy Thy Tran
Iryna Gurevych
DiffM
13
1
0
24 May 2023
Gender Biases in Automatic Evaluation Metrics for Image Captioning
Haoyi Qiu
Zi-Yi Dou
Tianlu Wang
Asli Celikyilmaz
Nanyun Peng
EGVM
13
8
0
24 May 2023
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist
Iftitahu Ni'mah
Meng Fang
Vlado Menkovski
Mykola Pechenizkiy
12
8
0
15 May 2023
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation
Nico Daheim
Nouha Dziri
Mrinmaya Sachan
Iryna Gurevych
E. Ponti
MoMe
21
30
0
30 Mar 2023
DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents
Varun Nair
Elliot Schumacher
Geoffrey Tso
Anitha Kannan
VLM
17
60
0
30 Mar 2023
MENLI: Robust Evaluation Metrics from Natural Language Inference
Yanran Chen
Steffen Eger
16
15
0
15 Aug 2022
Rethinking embedding coupling in pre-trained language models
Hyung Won Chung
Thibault Févry
Henry Tsai
Melvin Johnson
Sebastian Ruder
90
142
0
24 Oct 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
229
1,281
0
18 Mar 2020