ResearchTrend.AI
Resistance Against Manipulative AI: key factors and possible actions


22 April 2024
Piotr Wilczyński
Wiktoria Mieleszczenko-Kowszewicz
P. Biecek

Papers citing "Resistance Against Manipulative AI: key factors and possible actions"

3 / 3 papers shown
Assessing AI vs Human-Authored Spear Phishing SMS Attacks: An Empirical Study
Jerson Francia
Derek Hansen
Ben Schooley
Matthew Taylor
Shydra Murray
Greg Snow
26
1
0
18 Jun 2024
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
443
0
23 Aug 2022
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
242
193
0
15 Sep 2021