2305.10235
Cited By
Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility
15 May 2023
Wen-song Ye
Mingfeng Ou
Tianyi Li
Yipeng Chen
Xuetao Ma
Yifan YangGong
Sai Wu
Jie Fu
Gang Chen
Haobo Wang
J. Zhao
Papers citing
"Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility"
29 / 29 papers shown
Title
A Survey on Privacy Risks and Protection in Large Language Models
Kang Chen
Xiuze Zhou
Yuanguo Lin
Shibo Feng
Li Shen
Pengcheng Wu
AILaw
PILM
53
0
0
04 May 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
Aryan Shrivastava
Paula Akemi Aoyagui
21
0
0
14 Apr 2025
Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing
Jihyun Janice Ahn
Wenpeng Yin
SILM
LRM
53
1
0
02 Apr 2025
Mapping the Trust Terrain: LLMs in Software Engineering -- Insights and Perspectives
Dipin Khati
Yijin Liu
David Nader-Palacio
Yixuan Zhang
Denys Poshyvanyk
46
0
0
18 Mar 2025
Waste Not, Want Not; Recycled Gumbel Noise Improves Consistency in Natural Language Generation
Damien de Mijolla
Hannan Saddiq
Kim Moore
54
0
0
02 Mar 2025
The Order Effect: Investigating Prompt Sensitivity to Input Order in LLMs
Bryan Guan
Tanya Roosta
Peyman Passban
Mehdi Rezagholizadeh
92
0
0
06 Feb 2025
Measuring Free-Form Decision-Making Inconsistency of Language Models in Military Crisis Simulations
Aryan Shrivastava
Jessica Hullman
Max Lamparth
34
6
0
17 Oct 2024
Generalists vs. Specialists: Evaluating Large Language Models for Urdu
Samee Arif
Abdul Hameed Azeemi
Agha Ali Raza
Awais Athar
ALM
LM&MA
ELM
33
4
0
05 Jul 2024
Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
Yuyan Chen
Qiang Fu
Yichen Yuan
Zhihao Wen
Ge Fan
Dayiheng Liu
Dongmei Zhang
Zhixu Li
Yanghua Xiao
HILM
40
67
0
04 Jul 2024
NLPerturbator: Studying the Robustness of Code LLMs to Natural Language Variations
Junkai Chen
Zhenhao Li
Xing Hu
Xin Xia
AAML
32
7
0
28 Jun 2024
Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing
Hadi Askari
Anshuman Chhabra
Muhao Chen
Prasant Mohapatra
26
4
0
06 Jun 2024
Safeguarding Large Language Models: A Survey
Yi Dong
Ronghui Mu
Yanghao Zhang
Siqi Sun
Tianle Zhang
...
Yi Qi
Jinwei Hu
Jie Meng
Saddek Bensalem
Xiaowei Huang
OffRL
KELM
AILaw
29
17
0
03 Jun 2024
Data Contamination Calibration for Black-box LLMs
Wen-song Ye
Jiaqi Hu
Liyao Li
Haobo Wang
Gang Chen
Junbo Zhao
26
6
0
20 May 2024
Evaluating Consistency and Reasoning Capabilities of Large Language Models
Yash Saxena
Sarthak Chopra
Arunendra Mani Tripathi
ELM
LRM
28
5
0
25 Apr 2024
LLMChain: Blockchain-based Reputation System for Sharing and Evaluating Large Language Models
Mouhamed Amine Bouchiha
Quentin Telnoff
Souhail Bakkali
R. Champagnat
Mourad Rabah
Mickael Coustaty
Y. Ghamri-Doudane
LRM
24
3
0
20 Apr 2024
Crossing Linguistic Horizons: Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models
Sang T. Truong
D. Q. Nguyen
Toan Nguyen
Dong D. Le
Nhi N. Truong
Tho Quan
Oluwasanmi Koyejo
24
2
0
05 Mar 2024
Exploring Advanced Methodologies in Security Evaluation for LLMs
Junming Huang
Jiawei Zhang
Qi Wang
Weihong Han
Yanchun Zhang
27
0
0
28 Feb 2024
AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach
Maryam Amirizaniani
Elias Martin
Tanya Roosta
Aman Chadha
Chirag Shah
15
2
0
14 Feb 2024
Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models
Carlos Alejandro Aguirre
Kuleen Sasse
Isabel Cachola
Mark Dredze
16
1
0
14 Nov 2023
Can LLM-Generated Misinformation Be Detected?
Canyu Chen
Kai Shu
DeLMO
21
144
0
25 Sep 2023
GPTEval: A Survey on Assessments of ChatGPT and GPT-4
Rui Mao
Guanyi Chen
Xulang Zhang
Frank Guerin
Erik Cambria
ELM
LM&MA
28
91
0
24 Aug 2023
TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT
Liangyu Zha
Junlin Zhou
Liyao Li
Rui Wang
Qingyi Huang
...
Xing-yan Deng
J. Xu
Haobo Wang
Gang Chen
J. Zhao
RALM
LMTD
26
42
0
17 Jul 2023
Robust Prompt Optimization for Large Language Models Against Distribution Shifts
Moxin Li
Wenjie Wang
Fuli Feng
Yixin Cao
Jizhi Zhang
Tat-Seng Chua
OffRL
30
7
0
23 May 2023
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
242
460
0
06 Jan 2021
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
246
340
0
01 Jan 2021
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktäschel
Thomas Lukasiewicz
Phil Blunsom
LRM
249
618
0
04 Dec 2018
Generating Natural Language Adversarial Examples
M. Alzantot
Yash Sharma
Ahmed Elgohary
Bo-Jhang Ho
Mani B. Srivastava
Kai-Wei Chang
AAML
230
909
0
21 Apr 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
Mohit Iyyer
John Wieting
Kevin Gimpel
Luke Zettlemoyer
AAML
GAN
178
708
0
17 Apr 2018