ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

arXiv:2202.11176 (Cited By)
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers

22 February 2022
Alyssa Lees
Vinh Q. Tran
Yi Tay
Jeffrey Scott Sorensen
Jai Gupta
Donald Metzler
Lucy Vasserman

Papers citing "A New Generation of Perspective API: Efficient Multilingual Character-level Transformers"

50 / 102 papers shown
MisgenderMender: A Community-Informed Approach to Interventions for Misgendering
Tamanna Hossain
Sunipa Dev
Sameer Singh
19
5
0
23 Apr 2024
Towards Human-centered Proactive Conversational Agents
Yang Deng
Lizi Liao
Zhonghua Zheng
Grace Hui Yang
Tat-Seng Chua
LLMAG
32
24
0
19 Apr 2024
AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts
Shaona Ghosh
Prasoon Varshney
Erick Galinkin
Christopher Parisien
ELM
38
35
0
09 Apr 2024
Fairness in Large Language Models: A Taxonomic Survey
Zhibo Chu
Zichong Wang
Wenbin Zhang
AILaw
43
32
0
31 Mar 2024
A Review of Multi-Modal Large Language and Vision Models
Kilian Carolan
Laura Fennelly
A. Smeaton
VLM
22
22
0
28 Mar 2024
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
Manuel Tonneau
Pedro Vitor Quinta de Castro
Karim Lasri
I. Farouq
Lakshminarayanan Subramanian
Victor Orozco-Olvera
Samuel Fraiberger
31
9
0
28 Mar 2024
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
Zhuowen Yuan
Zidi Xiong
Yi Zeng
Ning Yu
Ruoxi Jia
D. Song
Bo-wen Li
AAML
KELM
40
38
0
19 Mar 2024
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models
Luiza Amador Pozzobon
Patrick Lewis
Sara Hooker
B. Ermiş
36
7
0
06 Mar 2024
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
Zhexin Zhang
Yida Lu
Jingyuan Ma
Di Zhang
Rui Li
...
Hao-Lun Sun
Lei Sha
Zhifang Sui
Hongning Wang
Minlie Huang
18
26
0
26 Feb 2024
Feedback Loops With Language Models Drive In-Context Reward Hacking
Alexander Pan
Erik Jones
Meena Jagadeesan
Jacob Steinhardt
KELM
42
26
0
09 Feb 2024
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models
Lijun Li
Bowen Dong
Ruohui Wang
Xuhao Hu
Wangmeng Zuo
Dahua Lin
Yu Qiao
Jing Shao
ELM
30
84
0
07 Feb 2024
Gradient-Based Language Model Red Teaming
Nevan Wichers
Carson E. Denison
Ahmad Beirami
8
25
0
30 Jan 2024
A Group Fairness Lens for Large Language Models
Guanqun Bi
Lei Shen
Yuqiang Xie
Yanan Cao
Tiangang Zhu
Xiao-feng He
ALM
26
4
0
24 Dec 2023
The DSA Transparency Database: Auditing Self-reported Moderation Actions by Social Media
Amaury Trujillo
T. Fagni
S. Cresci
18
9
0
16 Dec 2023
ToViLaG: Your Visual-Language Generative Model is Also An Evildoer
Xinpeng Wang
Xiaoyuan Yi
Han Jiang
Shanlin Zhou
Zhihua Wei
Xing Xie
25
12
0
13 Dec 2023
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
Hakan Inan
Kartikeya Upasani
Jianfeng Chi
Rashi Rungta
Krithika Iyer
...
Michael Tontchev
Qing Hu
Brian Fuller
Davide Testuggine
Madian Khabsa
AI4MH
11
371
0
07 Dec 2023
Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models
Sungjoo Byun
Dongjun Jang
Hyemi Jo
Hyopil Shin
16
2
0
30 Nov 2023
Causal ATE Mitigates Unintended Bias in Controlled Text Generation
Rahul Madhavan
Kahini Wadhawan
21
0
0
19 Nov 2023
Subtle Misogyny Detection and Mitigation: An Expert-Annotated Dataset
Brooklyn Sheppard
Anna Richter
Allison Cohen
Elizabeth Allyn Smith
Tamara Kneese
Carolyne Pelletier
Ioana Baldini
Yue Dong
19
4
0
15 Nov 2023
Toxicity Detection is NOT all you Need: Measuring the Gaps to Supporting Volunteer Content Moderators
Yang Trista Cao
Lovely-Frances Domingo
Sarah Ann Gilbert
Michelle Mazurek
Katie Shilton
Hal Daumé
16
5
0
14 Nov 2023
A Taxonomy of Rater Disagreements: Surveying Challenges & Opportunities from the Perspective of Annotating Online Toxicity
Wenbo Zhang
Hangzhi Guo
Ian D Kivlichan
Vinodkumar Prabhakaran
Davis Yadav
Amulya Yadav
23
2
0
07 Nov 2023
Unveiling Safety Vulnerabilities of Large Language Models
George Kour
Marcel Zalmanovici
Naama Zwerdling
Esther Goldbraich
Ora Nova Fandina
Ateret Anaby-Tavor
Orna Raz
E. Farchi
AAML
16
14
0
07 Nov 2023
Measuring Adversarial Datasets
Yuanchen Bai
Raoyi Huang
Vijay Viswanathan
Tzu-Sheng Kuo
Tongshuang Wu
39
1
0
06 Nov 2023
People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection
Indira Sen
Dennis Assenmacher
Mattia Samory
Isabelle Augenstein
Wil M.P. van der Aalst
Claudia Wagner
17
19
0
02 Nov 2023
What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
Kavel Rao
Liwei Jiang
Valentina Pyatkin
Yuling Gu
Niket Tandon
Nouha Dziri
Faeze Brahman
Yejin Choi
19
15
0
24 Oct 2023
Towards Detecting Contextual Real-Time Toxicity for In-Game Chat
Zachary Yang
Nicolas Grenon-Godbout
Reihaneh Rabbany
19
3
0
20 Oct 2023
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Xi Chen
Xiao Wang
Lucas Beyer
Alexander Kolesnikov
Jialin Wu
...
Keran Rong
Tianli Yu
Daniel Keysers
Xiao-Qi Zhai
Radu Soricut
MLLM
VLM
30
93
0
13 Oct 2023
Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems
Yixin Wan
Jieyu Zhao
Aman Chadha
Nanyun Peng
Kai-Wei Chang
32
22
0
08 Oct 2023
Auto-survey Challenge
Thanh Gia Hieu Khuong
Benedictus Kent Rachmat
15
1
0
06 Oct 2023
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
Xiangyu Qi
Yi Zeng
Tinghao Xie
Pin-Yu Chen
Ruoxi Jia
Prateek Mittal
Peter Henderson
SILM
44
523
0
05 Oct 2023
CLEVA: Chinese Language Models EVAluation Platform
Yanyang Li
Jianqiao Zhao
Duo Zheng
Zi-Yuan Hu
Zhi Chen
...
Yongfeng Huang
Shijia Huang
Dahua Lin
Michael R. Lyu
Liwei Wang
ALM
ELM
33
9
0
09 Aug 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
Jiaming Ji
Mickel Liu
Juntao Dai
Xuehai Pan
Chi Zhang
Ce Bian
Chi Zhang
Ruiyang Sun
Yizhou Wang
Yaodong Yang
ALM
14
396
0
10 Jul 2023
Your spouse needs professional help: Determining the Contextual Appropriateness of Messages through Modeling Social Relationships
David Jurgens
Agrima Seth
Jack E. Sargent
Athena Aghighi
Michael Geraci
20
7
0
06 Jul 2023
Your Attack Is Too DUMB: Formalizing Attacker Scenarios for Adversarial Transferability
Marco Alecci
Mauro Conti
Francesco Marchiori
L. Martinelli
Luca Pajola
AAML
16
7
0
27 Jun 2023
Lost in Translation: Large Language Models in Non-English Content Analysis
Gabriel Nicholas
Aliya Bhatia
ELM
13
33
0
12 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan
Rishabh Garg
Kahini Wadhawan
S. Mehta
15
5
0
01 Jun 2023
PaLI-X: On Scaling up a Multilingual Vision and Language Model
Xi Chen
Josip Djolonga
Piotr Padlewski
Basil Mustafa
Soravit Changpinyo
...
Mojtaba Seyedhosseini
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
VLM
44
187
0
29 May 2023
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration
Hwaran Lee
Seokhee Hong
Joonsuk Park
Takyoung Kim
M. Cha
...
Eun-Ju Lee
Yong Lim
Alice H. Oh
San-hee Park
Jung-Woo Ha
34
16
0
28 May 2023
Healing Unsafe Dialogue Responses with Weak Supervision Signals
Zi Liang
Pinghui Wang
Ruofei Zhang
Shuo Zhang
Xiaofan Ye Yi Huang
Junlan Feng
21
1
0
25 May 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Shayne Longpre
Gregory Yauney
Emily Reif
Katherine Lee
Adam Roberts
...
Denny Zhou
Jason W. Wei
Kevin Robinson
David M. Mimno
Daphne Ippolito
21
147
0
22 May 2023
ToxBuster: In-game Chat Toxicity Buster with BERT
Zachary Yang
Yasmine Maricar
M. Davari
Nicolas Grenon-Godbout
Reihaneh Rabbany
14
3
0
21 May 2023
Analyzing Norm Violations in Live-Stream Chat
Jihyung Moon
Dong-Ho Lee
Hyundong Justin Cho
Woojeong Jin
Chan Young Park
MinWoo Kim
Jonathan May
Jay Pujara
Sungjoon Park
23
4
0
18 May 2023
Toxic comments reduce the activity of volunteer editors on Wikipedia
Ivan Smirnov
Camelia Oprea
Markus Strohmaier
KELM
8
3
0
26 Apr 2023
On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research
Luiza Amador Pozzobon
B. Ermiş
Patrick Lewis
Sara Hooker
27
45
0
24 Apr 2023
Language Model Behavior: A Comprehensive Survey
Tyler A. Chang
Benjamin Bergen
VLM
LRM
LM&MA
27
102
0
20 Mar 2023
Critical Perspectives: A Benchmark Revealing Pitfalls in PerspectiveAPI
Lorena Piedras
Lucas Rosenblatt
Julia Wilkins
26
9
0
05 Jan 2023
Hypothesis Engineering for Zero-Shot Hate Speech Detection
Janis Goldzycher
Gerold Schneider
14
7
0
03 Oct 2022
A Holistic Approach to Undesired Content Detection in the Real World
Todor Markov
Chong Zhang
Sandhini Agarwal
Tyna Eloundou
Teddy Lee
Steven Adler
Angela Jiang
L. Weng
17
210
0
05 Aug 2022
UL2: Unifying Language Learning Paradigms
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
Xavier Garcia
Jason W. Wei
...
Tal Schuster
H. Zheng
Denny Zhou
N. Houlsby
Donald Metzler
AI4CE
49
293
0
10 May 2022
Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages
Thomas Mandl
Sandip J Modha
Gautam Kishore Shahi
Prasenjit Majumder
Mohana Dave
Daksh Patel
Chintak Mandalia
Aditya Patel
91
172
0
12 Aug 2021