ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.07589
  4. Cited By
Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented
  Models

Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models

11 October 2023
Luiza Amador Pozzobon
B. Ermiş
Patrick Lewis
Sara Hooker
ArXivPDFHTML

Papers citing "Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models"

24 / 24 papers shown
Title
Risk-Aware Distributional Intervention Policies for Language Models
Bao Nguyen
Binh Nguyen
Duy Nguyen
V. Nguyen
28
1
0
28 Jan 2025
RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text
  Generation Framework
RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text Generation Framework
Yifan Wang
Vera Demberg
24
0
0
24 Oct 2024
Nexus: Specialization meets Adaptability for Efficiently Training
  Mixture of Experts
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Nikolas Gritsch
Qizhen Zhang
Acyr F. Locatelli
Sara Hooker
A. Ustun
MoE
48
1
0
28 Aug 2024
The Future of Open Human Feedback
The Future of Open Human Feedback
Shachar Don-Yehiya
Ben Burtenshaw
Ramon Fernandez Astudillo
Cailean Osborne
Mimansa Jaiswal
...
Omri Abend
Jennifer Ding
Sara Hooker
Hannah Rose Kirk
Leshem Choshen
VLM
ALM
62
4
0
15 Aug 2024
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community
Shachar Don-Yehiya
Leshem Choshen
Omri Abend
19
2
0
15 Aug 2024
On the Limitations of Compute Thresholds as a Governance Strategy
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
42
14
0
08 Jul 2024
Preference Tuning For Toxicity Mitigation Generalizes Across Languages
Preference Tuning For Toxicity Mitigation Generalizes Across Languages
Xiaochen Li
Zheng-Xin Yong
Stephen H. Bach
CLL
23
13
0
23 Jun 2024
LIDAO: Towards Limited Interventions for Debiasing (Large) Language
  Models
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Tianci Liu
Haoyu Wang
Shiyang Wang
Yu Cheng
Jing Gao
ALM
30
0
0
01 Jun 2024
Enhancing Contextual Understanding in Large Language Models through
  Contrastive Decoding
Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding
Zheng Zhao
Emilio Monti
Jens Lehmann
H. Assem
42
21
0
04 May 2024
DESTEIN: Navigating Detoxification of Language Models via Universal
  Steering Pairs and Head-wise Activation Fusion
DESTEIN: Navigating Detoxification of Language Models via Universal Steering Pairs and Head-wise Activation Fusion
Yu Li
Zhihua Wei
Han Jiang
Chuanyang Gong
LLMSV
21
2
0
16 Apr 2024
Towards Robustness of Text-to-Visualization Translation against Lexical
  and Phrasal Variability
Towards Robustness of Text-to-Visualization Translation against Lexical and Phrasal Variability
Jinwei Lu
Yuanfeng Song
Haodi Zhang
Chen Zhang
Raymond Chi-Wing Wong
OOD
26
0
0
10 Apr 2024
From One to Many: Expanding the Scope of Toxicity Mitigation in Language
  Models
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models
Luiza Amador Pozzobon
Patrick Lewis
Sara Hooker
B. Ermiş
36
7
0
06 Mar 2024
Fine-Grained Detoxification via Instance-Level Prefixes for Large
  Language Models
Fine-Grained Detoxification via Instance-Level Prefixes for Large Language Models
Xin Yi
Linlin Wang
Xiaoling Wang
Liang He
MoMe
35
1
0
23 Feb 2024
Controlled Text Generation for Large Language Model with Dynamic
  Attribute Graphs
Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs
Xun Liang
Hanyu Wang
Shichao Song
Mengting Hu
Xunzhi Wang
Zhiyu Li
Feiyu Xiong
Bo Tang
12
9
0
17 Feb 2024
Representation Surgery: Theory and Practice of Affine Steering
Representation Surgery: Theory and Practice of Affine Steering
Shashwat Singh
Shauli Ravfogel
Jonathan Herzig
Roee Aharoni
Ryan Cotterell
Ponnurangam Kumaraguru
LLMSV
27
12
0
15 Feb 2024
Aya Model: An Instruction Finetuned Open-Access Multilingual Language
  Model
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
A. Ustun
Viraat Aryabumi
Zheng-Xin Yong
Wei-Yin Ko
Daniel D'souza
...
Shayne Longpre
Niklas Muennighoff
Marzieh Fadaee
Julia Kreutzer
Sara Hooker
ALM
ELM
SyDa
LRM
27
192
0
12 Feb 2024
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in
  Large Language Models
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models
Paul Röttger
Hannah Rose Kirk
Bertie Vidgen
Giuseppe Attanasio
Federico Bianchi
Dirk Hovy
ALM
ELM
AILaw
18
119
0
02 Aug 2023
Semiparametric Language Models Are Scalable Continual Learners
Semiparametric Language Models Are Scalable Continual Learners
Guangyue Peng
Tao Ge
Si-Qing Chen
Furu Wei
Houfeng Wang
KELM
37
10
0
02 Mar 2023
Training Language Models with Memory Augmentation
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
226
126
0
25 May 2022
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
Uri Alon
Frank F. Xu
Junxian He
Sudipta Sengupta
Dan Roth
Graham Neubig
RALM
70
62
0
28 Jan 2022
Towards Continual Knowledge Learning of Language Models
Towards Continual Knowledge Learning of Language Models
Joel Jang
Seonghyeon Ye
Sohee Yang
Joongbo Shin
Janghoon Han
Gyeonghun Kim
Stanley Jungkyu Choi
Minjoon Seo
CLL
KELM
222
150
0
07 Oct 2021
Text Detoxification using Large Pre-trained Neural Models
Text Detoxification using Large Pre-trained Neural Models
David Dale
Anton Voronov
Daryna Dementieva
V. Logacheva
Olga Kozlova
Nikita Semenov
Alexander Panchenko
39
71
0
18 Sep 2021
Challenges in Detoxifying Language Models
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
242
191
0
15 Sep 2021
The Woman Worked as a Babysitter: On Biases in Language Generation
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
204
607
0
03 Sep 2019
1