
Reliability Testing for Natural Language Processing Systems

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

6 May 2021

Papers citing "Reliability Testing for Natural Language Processing Systems"

21 / 21 papers shown

Title
AutoTestForge: A Multidimensional Automated Testing Framework for Natural Language Processing Models Hengrui Xing Cong Tian Liang Zhao Tianhao Shen WenSheng Wang N. Zhang Chao Huang Zhenhua Duan 172 0 0 07 Mar 2025
Reliability of Topic Modeling. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 Kayla Schroeder Zach Wood-Doughty 120 0 0 30 Oct 2024
Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models Eddie L. Ungless Nikolas Vitsakis Zeerak Talat James Garforth Bjorn Ross Arno Onken Atoosa Kasirzadeh Alexandra Birch 226 3 0 17 Oct 2024
Risks and NLP Design: A Case Study on Procedural Document QA. Annual Meeting of the Association for Computational Linguistics (ACL), 2024 Nikita Haduong Alice Gao Noah A. Smith 182 5 0 16 Aug 2024
Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies Aswin Rrv Nemika Tyagi Md Nayem Uddin Neeraj Varshney Chitta Baral 145 0 0 06 Jun 2024
Red-Teaming for Generative AI: Silver Bullet or Security Theater? AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2024 Michael Feffer Anusha Sinha Wesley Hanwen Deng Zachary Chase Lipton Hoda Heidari AAML 345 102 0 29 Jan 2024
"One-Size-Fits-All"? Examining Expectations around What Constitute "Fair" or "Good" NLG System Behaviors. North American Chapter of the Association for Computational Linguistics (NAACL), 2023 Li Lucy Su Lin Blodgett Milad Shokouhi Hanna M. Wallach Alexandra Olteanu 249 12 0 23 Oct 2023
LEAP: Efficient and Automated Test Method for NLP Software. International Conference on Automated Software Engineering (ASE), 2023 Ming-Ming Xiao Yan Xiao Hai Dong Shunhui Ji Pengcheng Zhang AAML 146 13 0 22 Aug 2023
A Group-Specific Approach to NLP for Hate Speech Detection Karina Halevy 126 1 0 21 Apr 2023
BotSIM: An End-to-End Bot Simulation Toolkit for Commercial Task-Oriented Dialog Systems Guangsen Wang Shafiq Joty Junnan Li Steven C. H. Hoi 103 1 0 29 Nov 2022
BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Guangsen Wang Samson Tan Shafiq Joty Ganglu Wu Jimmy Au Steven C. H. Hoi 207 3 0 22 Nov 2022
Prompting GPT-3 To Be Reliable. International Conference on Learning Representations (ICLR), 2022 Chenglei Si Zhe Gan Zhengyuan Yang Shuohang Wang Jianfeng Wang Jordan L. Boyd-Graber Lijuan Wang KELM LRM 287 331 0 17 Oct 2022
The Birth of Bias: A case study on the evolution of gender bias in an English language model Oskar van der Wal Jaap Jumelet K. Schulz Willem H. Zuidema 266 18 0 21 Jul 2022
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models. Neural Information Processing Systems (NeurIPS), 2022 Maribeth Rauh John F. J. Mellor J. Uesato Po-Sen Huang Johannes Welbl ... Amelia Glaese G. Irving Iason Gabriel William S. Isaac Lisa Anne Hendricks 213 59 0 16 Jun 2022
The Risks of Machine Learning Systems Samson Tan Araz Taeihagh K. Baxter 94 7 0 21 Apr 2022
Measure and Improve Robustness in NLP Models: A Survey Xuezhi Wang Haohan Wang Diyi Yang 428 155 0 15 Dec 2021
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation Kaustubh D. Dhole Varun Gangal Sebastian Gehrmann Aadesh Gupta Zhenhao Li ... Tianbao Xie Usama Yaseen Michael A. Yee Jing Zhang Yue Zhang 382 95 0 06 Dec 2021
TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021 Keng Ji Chow Samson Tan MingSung Kan LRM 146 5 0 21 Nov 2021
Language Invariant Properties in Natural Language Processing Federico Bianchi Debora Nozza Dirk Hovy 172 4 0 27 Sep 2021
Automatic Construction of Evaluation Suites for Natural Language Generation Datasets Simon Mille Kaustubh D. Dhole Saad Mahamood Laura Perez-Beltrachini Varun Gangal Mihir Kale Emiel van Miltenburg Sebastian Gehrmann ELM 155 25 0 16 Jun 2021
"Why Should I Trust You?": Explaining the Predictions of Any Classifier Marco Tulio Ribeiro Sameer Singh Carlos Guestrin FAtt FaML 1.8K 19,183 0 16 Feb 2016