Detecting Prefix Bias in LLM-based Reward Models. Conference on Fairness, Accountability and Transparency (FAccT), 2025
FairTranslate: An English-French Dataset for Gender Bias Evaluation in Machine Translation by Overcoming Gender Binarity. Conference on Fairness, Accountability and Transparency (FAccT), 2025
Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge. Machine-mediated learning (ML), 2025