ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.13213
  4. Cited By
From Representational Harms to Quality-of-Service Harms: A Case Study on
  Llama 2 Safety Safeguards

From Representational Harms to Quality-of-Service Harms: A Case Study on Llama 2 Safety Safeguards

20 March 2024
Khaoula Chehbouni
Megha Roshan
Emmanuel Ma
Futian Andrew Wei
Afaf Taik
Jackie CK Cheung
G. Farnadi
ArXivPDFHTML

Papers citing "From Representational Harms to Quality-of-Service Harms: A Case Study on Llama 2 Safety Safeguards"

4 / 4 papers shown
Title
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
Khaoula Chehbouni
Jonathan Colaço-Carr
Yash More
Jackie CK Cheung
G. Farnadi
71
0
0
12 Nov 2024
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
Xinpeng Wang
Chengzhi Hu
Paul Röttger
Barbara Plank
46
5
0
04 Oct 2024
Annotation alignment: Comparing LLM and human annotations of
  conversational safety
Annotation alignment: Comparing LLM and human annotations of conversational safety
Rajiv Movva
Pang Wei Koh
Emma Pierson
ALM
27
3
0
10 Jun 2024
BBQ: A Hand-Built Bias Benchmark for Question Answering
BBQ: A Hand-Built Bias Benchmark for Question Answering
Alicia Parrish
Angelica Chen
Nikita Nangia
Vishakh Padmakumar
Jason Phang
Jana Thompson
Phu Mon Htut
Sam Bowman
210
364
0
15 Oct 2021
1