2406.04752
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models
7 June 2024
Ling Shi
Deyi Xiong
ELM
Papers citing
"CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models"
4 / 4 papers shown
Title
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
Zhichen Dong
Zhanhui Zhou
Chao Yang
Jing Shao
Yu Qiao
ELM
52
55
0
14 Feb 2024
Simulating Human Strategic Behavior: Comparing Single and Multi-agent LLMs
Karthik Sreedhar
Lydia B. Chilton
LLMAG
48
12
0
13 Feb 2024
Evaluating Superhuman Models with Consistency Checks
Lukas Fluri
Daniel Paleka
Florian Tramèr
ELM
31
41
0
16 Jun 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
233
2,232
0
22 Mar 2023