Reliability Testing for Natural Language Processing Systems
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
6 May 2021
Samson Tan
Shafiq Joty
K. Baxter
Araz Taeihagh
G. Bennett
Min-Yen Kan

Papers citing "Reliability Testing for Natural Language Processing Systems"

21 / 21 papers shown
AutoTestForge: A Multidimensional Automated Testing Framework for Natural Language Processing Models
Hengrui Xing
Cong Tian
Liang Zhao
Tianhao Shen
WenSheng Wang
N. Zhang
Chao Huang
Zhenhua Duan
172
0
0
07 Mar 2025
Reliability of Topic Modeling
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Kayla Schroeder
Zach Wood-Doughty
120
0
0
30 Oct 2024
Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models
Eddie L. Ungless
Nikolas Vitsakis
Zeerak Talat
James Garforth
Bjorn Ross
Arno Onken
Atoosa Kasirzadeh
Alexandra Birch
226
3
0
17 Oct 2024
Risks and NLP Design: A Case Study on Procedural Document QA
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Nikita Haduong
Alice Gao
Noah A. Smith
182
5
0
16 Aug 2024
Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies
Aswin Rrv
Nemika Tyagi
Md Nayem Uddin
Neeraj Varshney
Chitta Baral
145
0
0
06 Jun 2024
Red-Teaming for Generative AI: Silver Bullet or Security Theater?
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2024
Michael Feffer
Anusha Sinha
Wesley Hanwen Deng
Zachary Chase Lipton
Hoda Heidari
AAML
345
102
0
29 Jan 2024
"One-Size-Fits-All"? Examining Expectations around What Constitute "Fair" or "Good" NLG System Behaviors
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Li Lucy
Su Lin Blodgett
Milad Shokouhi
Hanna M. Wallach
Alexandra Olteanu
249
12
0
23 Oct 2023
LEAP: Efficient and Automated Test Method for NLP Software
International Conference on Automated Software Engineering (ASE), 2023
Ming-Ming Xiao
Yan Xiao
Hai Dong
Shunhui Ji
Pengcheng Zhang
AAML
146
13
0
22 Aug 2023
A Group-Specific Approach to NLP for Hate Speech Detection
Karina Halevy
126
1
0
21 Apr 2023
BotSIM: An End-to-End Bot Simulation Toolkit for Commercial Task-Oriented Dialog Systems
Guangsen Wang
Shafiq Joty
Junnan Li
Steven C. H. Hoi
103
1
0
29 Nov 2022
BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Guangsen Wang
Samson Tan
Shafiq Joty
Ganglu Wu
Jimmy Au
Steven C. H. Hoi
207
3
0
22 Nov 2022
Prompting GPT-3 To Be Reliable
International Conference on Learning Representations (ICLR), 2022
Chenglei Si
Zhe Gan
Zhengyuan Yang
Shuohang Wang
Jianfeng Wang
Jordan L. Boyd-Graber
Lijuan Wang
KELM, LRM
287
331
0
17 Oct 2022
The Birth of Bias: A case study on the evolution of gender bias in an English language model
Oskar van der Wal
Jaap Jumelet
K. Schulz
Willem H. Zuidema
266
18
0
21 Jul 2022
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
Neural Information Processing Systems (NeurIPS), 2022
Maribeth Rauh
John F. J. Mellor
J. Uesato
Po-Sen Huang
Johannes Welbl
...
Amelia Glaese
G. Irving
Iason Gabriel
William S. Isaac
Lisa Anne Hendricks
213
59
0
16 Jun 2022
The Risks of Machine Learning Systems
Samson Tan
Araz Taeihagh
K. Baxter
94
7
0
21 Apr 2022
Measure and Improve Robustness in NLP Models: A Survey
Xuezhi Wang
Haohan Wang
Diyi Yang
428
155
0
15 Dec 2021
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Kaustubh D. Dhole
Varun Gangal
Sebastian Gehrmann
Aadesh Gupta
Zhenhao Li
...
Tianbao Xie
Usama Yaseen
Michael A. Yee
Jing Zhang
Yue Zhang
382
95
0
06 Dec 2021
TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Keng Ji Chow
Samson Tan
Min-Yen Kan
LRM
146
5
0
21 Nov 2021
Language Invariant Properties in Natural Language Processing
Federico Bianchi
Debora Nozza
Dirk Hovy
172
4
0
27 Sep 2021
Automatic Construction of Evaluation Suites for Natural Language Generation Datasets
Simon Mille
Kaustubh D. Dhole
Saad Mahamood
Laura Perez-Beltrachini
Varun Gangal
Mihir Kale
Emiel van Miltenburg
Sebastian Gehrmann
ELM
155
25
0
16 Jun 2021
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Marco Tulio Ribeiro
Sameer Singh
Carlos Guestrin
FAtt, FaML
1.8K
19,183
0
16 Feb 2016