2307.09009
Cited By
How is ChatGPT's behavior changing over time?
18 July 2023
Lingjiao Chen
Matei A. Zaharia
James Y. Zou
ELM
KELM
AI4MH
Papers citing
"How is ChatGPT's behavior changing over time?"
40 / 40 papers shown
Title
Improving Model Alignment Through Collective Intelligence of Open-Source LLMS
Junlin Wang
Roy Xie
Shang Zhu
Jue Wang
Ben Athiwaratkun
Bhuwan Dhingra
S. Song
Ce Zhang
James Y. Zou
ALM
27
0
0
05 May 2025
Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models
Matthew Dahl
AILaw
ELM
45
0
0
05 May 2025
Memorization and Knowledge Injection in Gated LLMs
Xu Pan
Ely Hahami
Zechen Zhang
H. Sompolinsky
KELM
CLL
RALM
104
0
0
30 Apr 2025
LLM-Evaluation Tropes: Perspectives on the Validity of LLM-Evaluations
Laura Dietz
Oleg Zendel
P. Bailey
Charles L. A. Clarke
Ellese Cotterill
Jeff Dalton
Faegheh Hasibi
Mark Sanderson
Nick Craswell
ELM
43
0
0
27 Apr 2025
Improving LLM Personas via Rationalization with Psychological Scaffolds
Brihi Joshi
Xiang Ren
Swabha Swayamdipta
Rik Koncel-Kedziorski
Tim Paek
68
0
0
25 Apr 2025
Testing LLMs' Capabilities in Annotating Translations Based on an Error Typology Designed for LSP Translation: First Experiments with ChatGPT
Joachim Minder
Guillaume Wisniewski
Natalie Kübler
28
0
0
21 Apr 2025
Assessing how hyperparameters impact Large Language Models' sarcasm detection performance
Montgomery Gole
Andriy Miranskyy
AI4MH
21
0
0
08 Apr 2025
RobuNFR: Evaluating the Robustness of Large Language Models on Non-Functional Requirements Aware Code Generation
Feng Lin
Dong Jae Kim
Z. Li
Jinqiu Yang
Tse-Hsun Chen
AAML
38
0
0
28 Mar 2025
Generalization Bias in Large Language Model Summarization of Scientific Research
Uwe Peters
Benjamin Chin-Yee
ELM
34
0
0
28 Mar 2025
I'm Sorry Dave: How the old world of personnel security can inform the new world of AI insider risk
Paul Martin
Sarah Mercer
93
0
0
26 Mar 2025
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang
Caigao Jiang
Zhaoyi Li
Siqiao Xue
Jun-ping Zhou
Linqi Song
Defu Lian
Yin Wei
CLL
MU
56
0
0
16 Feb 2025
The Cake that is Intelligence and Who Gets to Bake it: An AI Analogy and its Implications for Participation
Martin Mundt
Anaelia Ovalle
Felix Friedrich
A Pranav
Subarnaduti Paul
Manuel Brack
Kristian Kersting
William Agnew
213
0
0
05 Feb 2025
PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
Jiahao Yu
Yangguang Shao
Hanwen Miao
Junzheng Shi
SILM
AAML
67
4
0
23 Sep 2024
Prompts Are Programs Too! Understanding How Developers Build Software Containing Prompts
Jenny T Liang
Melissa Lin
Nikitha Rao
Brad A. Myers
75
5
0
19 Sep 2024
Cascade Reward Sampling for Efficient Decoding-Time Alignment
Bolian Li
Yifan Wang
A. Grama
Ruqi Zhang
AI4TS
47
9
0
24 Jun 2024
Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
Varun Magesh
Faiz Surani
Matthew Dahl
Mirac Suzgun
Christopher D. Manning
Daniel E. Ho
HILM
ELM
AILaw
27
65
0
30 May 2024
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems
Abbas Ghaddar
David Alfonso-Hermelo
Philippe Langlais
Mehdi Rezagholizadeh
Boxing Chen
Prasanna Parthasarathi
34
0
0
24 May 2024
"ChatGPT Is Here to Help, Not to Replace Anybody" -- An Evaluation of Students' Opinions On Integrating ChatGPT In CS Courses
Bruno Pereira Cipriano
P. Alves
31
9
0
26 Apr 2024
Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs
Shu Yang
Jiayuan Su
Han Jiang
Mengdi Li
Keyuan Cheng
Muhammad Asif Ali
Lijie Hu
Di Wang
16
5
0
30 Mar 2024
Designing Informative Metrics for Few-Shot Example Selection
Rishabh Adiga
Lakshminarayanan Subramanian
Varun Chandrasekaran
27
1
0
06 Mar 2024
Stability Analysis of ChatGPT-based Sentiment Analysis in AI Quality Assurance
Tinghui Ouyang
AprilPyone Maungmaung
Koichi Konishi
Yoshiki Seo
Isao Echizen
AI4MH
18
5
0
15 Jan 2024
Evaluating Language Model Agency through Negotiations
Tim R. Davidson
V. Veselovsky
Martin Josifoski
Maxime Peyrard
Antoine Bosselut
Michal Kosinski
Robert West
LLMAG
29
22
0
09 Jan 2024
ChatGPT & Mechanical Engineering: Examining performance on the FE Mechanical Engineering and Undergraduate Exams
Matthew Frenkel
Hebah Emara
26
2
0
26 Sep 2023
Watch Your Language: Investigating Content Moderation with Large Language Models
Deepak Kumar
Y. AbuHashem
Zakir Durumeric
AI4MH
21
15
0
25 Sep 2023
Prompting or Fine-tuning? A Comparative Study of Large Language Models for Taxonomy Construction
Boqi Chen
Fandi Yi
Dániel Varró
11
16
0
04 Sep 2023
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Anna Rogers
A. Luccioni
40
19
0
14 Aug 2023
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Youliang Yuan
Wenxiang Jiao
Wenxuan Wang
Jen-tse Huang
Pinjia He
Shuming Shi
Zhaopeng Tu
SILM
56
231
0
12 Aug 2023
Assessing Student Errors in Experimentation Using Artificial Intelligence and Large Language Models: A Comparative Study with Human Raters
Arne Bewersdorff
Kathrin Seßler
Armin Baur
Enkelejda Kasneci
Claudia Nerdel
11
37
0
11 Aug 2023
Deception Abilities Emerged in Large Language Models
Thilo Hagendorff
LLMAG
28
74
0
31 Jul 2023
How Language Model Hallucinations Can Snowball
Muru Zhang
Ofir Press
William Merrill
Alisa Liu
Noah A. Smith
HILM
LRM
78
252
0
22 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
27
81
0
19 May 2023
Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning
Wenhao Li
Dan Qiao
Baoxiang Wang
Xiangfeng Wang
Bo Jin
H. Zha
18
5
0
18 May 2023
ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time
Shangqing Tu
Chunyang Li
Jifan Yu
Xiaozhi Wang
Lei Hou
Juanzi Li
LLMAG
AI4MH
75
11
0
27 Apr 2023
Can we trust the evaluation on ChatGPT?
Rachith Aiyappa
Jisun An
Haewoon Kwak
Yong-Yeol Ahn
ELM
ALM
LLMAG
AI4MH
LRM
106
87
0
22 Mar 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
230
2,989
0
22 Mar 2023
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi Ren Fung
Tuhin Chakraborty
Hao Guo
Owen Rambow
Smaranda Muresan
Heng Ji
8
39
0
16 Oct 2022
HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions
Lingjiao Chen
Zhihua Jin
Sabri Eyuboglu
Christopher Ré
Matei A. Zaharia
James Y. Zou
37
9
0
18 Sep 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
441
0
23 Aug 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,808
0
14 Dec 2020