ResearchTrend.AI
Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs

30 March 2024
Shu Yang
Jiayuan Su
Han Jiang
Mengdi Li
Keyuan Cheng
Muhammad Asif Ali
Lijie Hu
Di Wang

Papers citing "Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs"

11 / 11 papers shown
Understanding Reasoning in Chain-of-Thought from the Hopfieldian View
Lijie Hu
Liang Liu
Shu Yang
Xin Chen
Zhen Tan
Muhammad Asif Ali
Mengdi Li
Di Wang
LRM
41
1
0
04 Oct 2024
MQA-KEAL: Multi-hop Question Answering under Knowledge Editing for Arabic Language
Muhammad Asif Ali
Nawal Daftardar
Mutayyaba Waheed
Jianbin Qin
Di Wang
KELM
32
1
0
18 Sep 2024
Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top
Keyuan Cheng
Muhammad Asif Ali
Shu Yang
Gang Lin
Yuxuan Zhai
Haoyang Fei
Ke Xu
Lu Yu
Lijie Hu
Di Wang
KELM
32
7
0
24 May 2024
Editable Concept Bottleneck Models
Lijie Hu
Chenyang Ren
Zhengyu Hu
Cheng-Long Wang
Di Wang
Hui Xiong
Jingfeng Zhang
Di Wang
27
3
0
24 May 2024
API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access
Jiayuan Su
Jing Luo
Hongwei Wang
Lu Cheng
74
16
0
02 Mar 2024
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
147
145
0
16 Oct 2023
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
Jian Xie
Kai Zhang
Jiangjie Chen
Renze Lou
Yu-Chuan Su
RALM
198
153
0
22 May 2023
Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
Yasumasa Onoe
Michael J.Q. Zhang
Shankar Padmanabhan
Greg Durrett
Eunsol Choi
KELM
201
73
0
02 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
308
11,909
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
323
8,448
0
28 Jan 2022
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,587
0
18 Sep 2019