Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

7 July 2021
Emily Dinan
Gavin Abercrombie
A. S. Bergman
Shannon L. Spruit
Dirk Hovy
Y-Lan Boureau
Verena Rieser

Papers citing "Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling"

50 / 74 papers shown
Ethical Concern Identification in NLP: A Corpus of ACL Anthology Ethics Statements
Antonia Karamolegkou
Sandrine Schiller Hansen
Ariadni Christopoulou
Filippos Stamatiou
Anne Lauscher
Anders Søgaard
21
0
0
12 Nov 2024
Responsible Retrieval Augmented Generation for Climate Decision Making from Documents
Matyas Juhasz
Kalyan Dutia
Henry Franks
Conor Delahunty
Patrick Fawbert Mills
Harrison Pim
29
1
0
31 Oct 2024
Active Learning for Robust and Representative LLM Generation in Safety-Critical Scenarios
Sabit Hassan
Anthony Sicilia
Malihe Alikhani
24
2
0
14 Oct 2024
RePD: Defending Jailbreak Attack through a Retrieval-based Prompt Decomposition Process
Peiran Wang
Xiaogeng Liu
Chaowei Xiao
AAML
24
3
0
11 Oct 2024
Safe Generative Chats in a WhatsApp Intelligent Tutoring System
Zachary Levonian
Owen Henkel
KELM
24
0
0
06 Jul 2024
AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations
Adam Dahlgren Lindstrom
Leila Methnani
Lea Krause
Petter Ericson
Íñigo Martínez de Rituerto de Troya
Dimitri Coelho Mollo
Roel Dobbe
ALM
31
2
0
26 Jun 2024
Shortcomings of LLMs for Low-Resource Translation: Retrieval and Understanding are Both the Problem
Sara Court
Micha Elsner
32
6
0
21 Jun 2024
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
Yifan Zeng
Yiran Wu
Xiao Zhang
Huazheng Wang
Qingyun Wu
LLMAG
AAML
35
59
0
02 Mar 2024
Prompt Stealing Attacks Against Large Language Models
Zeyang Sha
Yang Zhang
SILM
AAML
27
28
0
20 Feb 2024
Mapping the Ethics of Generative AI: A Comprehensive Scoping Review
Thilo Hagendorff
21
35
0
13 Feb 2024
Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data
Leonardo Castro-Gonzalez
Yi-Ling Chung
Hannah Rose Kirk
John Francis
Angus R. Williams
Pica Johansson
Jonathan Bright
37
1
0
22 Jan 2024
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
Hakan Inan
Kartikeya Upasani
Jianfeng Chi
Rashi Rungta
Krithika Iyer
...
Michael Tontchev
Qing Hu
Brian Fuller
Davide Testuggine
Madian Khabsa
AI4MH
11
371
0
07 Dec 2023
Comprehensive Assessment of Toxicity in ChatGPT
Boyang Zhang
Xinyue Shen
Waiman Si
Zeyang Sha
Z. Chen
Ahmed Salem
Yun Shen
Michael Backes
Yang Zhang
SILM
8
3
0
03 Nov 2023
Enhancing Pipeline-Based Conversational Agents with Large Language Models
Mina Foosherian
Hendrik Purwins
Purna Rathnayake
Touhidul Alam
Rui Teimao
K. Thoben
LLMAG
19
2
0
07 Sep 2023
Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Jingyan Zhou
Minda Hu
Junan Li
Xiaoying Zhang
Xixin Wu
Irwin King
Helen M. Meng
LRM
29
24
0
29 Aug 2023
Challenges of GPT-3-based Conversational Agents for Healthcare
Fabian Lechner
Allison Lahnala
Charles F Welch
Lucie Flek
LM&MA
15
2
0
28 Aug 2023
Neural Conversation Models and How to Rein Them in: A Survey of Failures and Fixes
Fabian Galetzka
Anne Beyer
David Schlangen
AI4CE
11
1
0
11 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
88
10,947
0
18 Jul 2023
CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care
Tong Xiang
Liangzhi Li
Wangyue Li
Min‐Jun Bai
Lu Wei
Bowen Wang
Noa Garcia
23
5
0
04 Jul 2023
CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models
Yufei Huang
Deyi Xiong
ALM
29
17
0
28 Jun 2023
DICES Dataset: Diversity in Conversational AI Evaluation for Safety
Lora Aroyo
Alex S. Taylor
Mark Díaz
Christopher Homan
Alicia Parrish
Greg Serapio-García
Vinodkumar Prabhakaran
Ding Wang
19
33
0
20 Jun 2023
Evaluating the Social Impact of Generative AI Systems in Systems and Society
Irene Solaiman
Zeerak Talat
William Agnew
Lama Ahmad
Dylan K. Baker
...
Marie-Therese Png
Shubham Singh
A. Strait
Lukas Struppek
Arjun Subramonian
ELM
EGVM
31
103
0
09 Jun 2023
Improving Open Language Models by Learning from Organic Interactions
Jing Xu
Da Ju
Joshua Lane
M. Komeili
Eric Michael Smith
...
Rashel Moritz
Sainbayar Sukhbaatar
Y-Lan Boureau
Jason Weston
Kurt Shuster
17
8
0
07 Jun 2023
On "Scientific Debt" in NLP: A Case for More Rigour in Language Model Pre-Training Research
Made Nindyatama Nityasya
Haryo Akbarianto Wibowo
Alham Fikri Aji
Genta Indra Winata
Radityo Eko Prasojo
Phil Blunsom
A. Kuncoro
8
8
0
05 Jun 2023
Reimagining Retrieval Augmented Language Models for Answering Queries
W. Tan
Yuliang Li
Pedro Rodriguez
Rich James
Xi Victoria Lin
A. Halevy
Scott Yih
KELM
LRM
21
9
0
01 Jun 2023
ANTONIO: Towards a Systematic Method of Generating NLP Benchmarks for Verification
Marco Casadio
Luca Arnaboldi
M. Daggitt
Omri Isac
Tanvi Dinkar
Daniel Kienitz
Verena Rieser
Ekaterina Komendantskaya
17
4
0
06 May 2023
Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment
Lu Yu
Malvina Nikandrou
Jiali Jin
Verena Rieser
37
5
0
28 Apr 2023
Towards Explainable and Safe Conversational Agents for Mental Health: A Survey
Surjodeep Sarkar
Manas Gaur
L. Chen
Muskan Garg
Biplav Srivastava
B. Dongaonkar
AI4MH
19
1
0
25 Apr 2023
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements
Jiawen Deng
Jiale Cheng
Hao-Lun Sun
Zhexin Zhang
Minlie Huang
LM&MA
ELM
21
15
0
18 Feb 2023
The Capacity for Moral Self-Correction in Large Language Models
Deep Ganguli
Amanda Askell
Nicholas Schiefer
Thomas I. Liao
Kamilė Lukošiūtė
...
Tom B. Brown
C. Olah
Jack Clark
Sam Bowman
Jared Kaplan
LRM
ReLM
26
157
0
15 Feb 2023
Using In-Context Learning to Improve Dialogue Safety
Nicholas Meade
Spandana Gella
Devamanyu Hazarika
Prakhar Gupta
Di Jin
Siva Reddy
Yang Liu
Dilek Z. Hakkani-Tür
25
37
0
02 Feb 2023
Fillers in Spoken Language Understanding: Computational and Psycholinguistic Perspectives
Tanvi Dinkar
Chloé Clavel
I. Vasilescu
16
10
0
25 Jan 2023
Computer says "No": The Case Against Empathetic Conversational AI
Alba Curry
A. C. Curry
32
8
0
21 Dec 2022
MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Moral Discussions
Hao-Lun Sun
Zhexin Zhang
Fei Mi
Yasheng Wang
W. Liu
Jianwei Cui
Bin Wang
Qun Liu
Minlie Huang
17
19
0
21 Dec 2022
DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines
Prakhar Gupta
Yang Liu
Di Jin
Behnam Hedayatnia
Spandana Gella
Sijia Liu
P. Lange
Julia Hirschberg
Dilek Z. Hakkani-Tür
18
5
0
20 Dec 2022
MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue
Nikita Moghe
E. Razumovskaia
Liane Guillou
Ivan Vulić
Anna Korhonen
Alexandra Birch
19
13
0
20 Dec 2022
AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies
Weiyan Shi
Emily Dinan
Adithya Renduchintala
Daniel Fried
Athul Paul Jacob
Zhou Yu
M. Lewis
AAML
15
2
0
22 Nov 2022
Cultural Incongruencies in Artificial Intelligence
Vinodkumar Prabhakaran
Rida Qadri
Ben Hutchinson
19
22
0
19 Nov 2022
The CRINGE Loss: Learning what language not to model
Leonard Adolphs
Tianyu Gao
Jing Xu
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
MU
15
34
0
10 Nov 2022
PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation
Siqi Bao
H. He
Jun Xu
Hua Lu
Fan Wang
Hua-Hong Wu
Han Zhou
Wenquan Wu
Zheng-Yu Niu
Haifeng Wang
19
4
0
02 Nov 2022
Risk-graded Safety for Handling Medical Queries in Conversational AI
Gavin Abercrombie
Verena Rieser
AI4MH
22
11
0
02 Oct 2022
Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori
Tatsunori B. Hashimoto
64
42
0
08 Sep 2022
Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
Waiman Si
Michael Backes
Jeremy Blackburn
Emiliano De Cristofaro
Gianluca Stringhini
Savvas Zannettou
Yang Zhang
16
57
0
07 Sep 2022
Towards Boosting the Open-Domain Chatbot with Human Feedback
Hua Lu
Siqi Bao
H. He
Fan Wang
Hua-Hong Wu
Haifeng Wang
ALM
13
17
0
30 Aug 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
441
0
23 Aug 2022
Learning from data in the mixed adversarial non-adversarial case: Finding the helpers and ignoring the trolls
Da Ju
Jing Xu
Y-Lan Boureau
Jason Weston
AAML
17
17
0
05 Aug 2022
BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage
Kurt Shuster
Jing Xu
M. Komeili
Da Ju
Eric Michael Smith
...
Naman Goyal
Arthur Szlam
Y-Lan Boureau
Melanie Kambadur
Jason Weston
LM&Ro
KELM
22
233
0
05 Aug 2022
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
Saleh Soltan
Shankar Ananthakrishnan
Jack G. M. FitzGerald
Rahul Gupta
Wael Hamza
...
Mukund Sridhar
Fabian Triefenbach
Apurv Verma
Gökhan Tür
Premkumar Natarajan
34
82
0
02 Aug 2022
Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent
Ethan A. Chi
Ashwin Paranjape
A. See
Caleb Chiam
Trenton Chang
...
Dilara Soylu
Jillian Tang
A. Narayan
Giovanni Campagna
Christopher D. Manning
21
7
0
25 Jul 2022
Why Robust Natural Language Understanding is a Challenge
Marco Casadio
Ekaterina Komendantskaya
Verena Rieser
M. Daggitt
Daniel Kienitz
Luca Arnaboldi
Wen Kokke
OOD
AAML
18
0
0
21 Jun 2022