Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2305.11391
Cited By
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
19 May 2023
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
Changshun Wu
Saddek Bensalem
Ronghui Mu
Yi Qi
Xingyu Zhao
Kaiwen Cai
Yanghao Zhang
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation"
28 / 78 papers shown
Title
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang
Hongye Jin
Ruixiang Tang
Xiaotian Han
Qizhang Feng
Haoming Jiang
Bing Yin
Xia Hu
LM&MA
123
593
0
26 Apr 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
154
576
0
06 Apr 2023
Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT
Yi Qi
Xingyu Zhao
Siddartha Khastgir
Xiaowei Huang
16
14
0
03 Apr 2023
Can we trust the evaluation on ChatGPT?
Rachith Aiyappa
Jisun An
Haewoon Kwak
Yong-Yeol Ahn
ELM
ALM
LLMAG
AI4MH
LRM
104
76
0
22 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny P. L. Lo
AI4MH
LM&MA
19
123
0
21 Mar 2023
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Q. Jiang
Sean Welleck
Jin Peng Zhou
Wenda Li
Jiacheng Liu
M. Jamnik
Timothée Lacroix
Yuhuai Wu
Guillaume Lample
AIMat
58
154
0
21 Oct 2022
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
208
2,413
0
06 Oct 2022
Large Language Models are Few-Shot Clinical Information Extractors
Monica Agrawal
S. Hegselmann
Hunter Lang
Yoon Kim
David Sontag
BDL
LM&MA
152
327
0
25 May 2022
Autoformalization with Large Language Models
Yuhuai Wu
Albert Q. Jiang
Wenda Li
M. Rabe
Charles Staats
M. Jamnik
Christian Szegedy
AI4CE
108
107
0
25 May 2022
Phrase-level Textual Adversarial Attack with Label Preservation
Yibin Lei
Yu Cao
Dianqi Li
Tianyi Zhou
Meng Fang
Mykola Pechenizkiy
AAML
35
24
0
22 May 2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
OffRL
AI4TS
102
231
0
20 May 2022
Hierarchical Distribution-Aware Testing of Deep Learning
Wei Huang
Xingyu Zhao
Alec Banks
V. Cox
Xiaowei Huang
OOD
AAML
16
7
0
17 May 2022
Prioritizing Corners in OoD Detectors via Symbolic String Manipulation
Chih-Hong Cheng
Changshun Wu
Emmanouil Seferis
Saddek Bensalem
16
3
0
16 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Generalized Out-of-Distribution Detection: A Survey
Jingkang Yang
Kaiyang Zhou
Yixuan Li
Ziwei Liu
162
812
0
21 Oct 2021
Types of Out-of-Distribution Texts and How to Detect Them
Udit Arora
William Huang
He He
OODD
204
97
0
14 Sep 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection
Alexander Podolskiy
Dmitry Lipin
A. Bout
Ekaterina Artemova
Irina Piontkovskaya
OODD
84
70
0
11 Jan 2021
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
231
288
0
17 Mar 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
222
204
0
25 Sep 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
273
1,561
0
18 Sep 2019
Certified Robustness to Adversarial Word Substitutions
Robin Jia
Aditi Raghunathan
Kerem Göksel
Percy Liang
AAML
161
289
0
03 Sep 2019
Generating Natural Language Adversarial Examples
M. Alzantot
Yash Sharma
Ahmed Elgohary
Bo-Jhang Ho
Mani B. Srivastava
Kai-Wei Chang
AAML
230
909
0
21 Apr 2018
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
Mohit Iyyer
John Wieting
Kevin Gimpel
Luke Zettlemoyer
AAML
GAN
178
708
0
17 Apr 2018
Safety Verification of Deep Neural Networks
Xiaowei Huang
M. Kwiatkowska
Sen Wang
Min Wu
AAML
178
883
0
21 Oct 2016
Previous
1
2