ResearchTrend.AI
arXiv: 2309.11998
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

21 September 2023
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Tianle Li
Siyuan Zhuang
Zhanghao Wu
Yonghao Zhuang
Zhuohan Li
Zi Lin
Eric P. Xing
Joseph E. Gonzalez
Ion Stoica
Haotong Zhang

Papers citing "LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset"

50 / 140 papers shown
Title
LLMs Get Lost In Multi-Turn Conversation
Philippe Laban
Hiroaki Hayashi
Yingbo Zhou
Jennifer Neville
34
0
0
09 May 2025
REVEAL: Multi-turn Evaluation of Image-Input Harms for Vision LLM
Madhur Jindal
Saurabh Deshpande
AAML
43
0
0
07 May 2025
Value Portrait: Understanding Values of LLMs with Human-aligned Benchmark
Jongwook Han
Dongmin Choi
Woojung Song
Eun-Ju Lee
Yohan Jo
PILM
53
0
0
02 May 2025
Tempo: Application-aware LLM Serving with Mixed SLO Requirements
Wei Zhang
Zhiyu Wu
Yi Mu
Banruo Liu
Myungjin Lee
Fan Lai
51
0
0
24 Apr 2025
Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions
Saffron Huang
Esin Durmus
Miles McCain
Kunal Handa
Alex Tamkin
Jerry Hong
Michael Stern
Arushi Somani
Xiuruo Zhang
Deep Ganguli
VLM
40
1
0
21 Apr 2025
PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines
Reya Vir
Shreya Shankar
Harrison Chase
Will Fu-Hinthorn
Aditya G. Parameswaran
AI4TS
32
0
0
20 Apr 2025
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints
Ruicheng Ao
Gan Luo
D. Simchi-Levi
Xinshang Wang
26
2
0
15 Apr 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
Aryan Shrivastava
Paula Akemi Aoyagui
29
0
0
14 Apr 2025
Efficient LLM Serving on Hybrid Real-time and Best-effort Requests
Wan Borui
Zhao Juntao
Jiang Chenyu
Guo Chuanxiong
Wu Chuan
VLM
56
1
0
13 Apr 2025
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
Tuhin Chakrabarty
Philippe Laban
C. Wu
32
1
0
10 Apr 2025
Societal Impacts Research Requires Benchmarks for Creative Composition Tasks
Judy Hanwen Shen
Carlos Guestrin
31
0
0
09 Apr 2025
PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
Priyanshu Kumar
Devansh Jain
Akhila Yerukola
Liwei Jiang
Himanshu Beniwal
Thomas Hartvigsen
Maarten Sap
52
0
0
06 Apr 2025
Robustly identifying concepts introduced during chat fine-tuning using crosscoders
Julian Minder
Clement Dumas
Caden Juang
Bilal Chughtai
Neel Nanda
27
0
0
03 Apr 2025
A Survey of Scaling in Large Language Model Reasoning
Zihan Chen
Song Wang
Zhen Tan
Xingbo Fu
Zhenyu Lei
Peng Wang
Huan Liu
Cong Shen
Jundong Li
LRM
86
0
0
02 Apr 2025
A multi-agentic framework for real-time, autonomous freeform metasurface design
Robert Lupoiu
Yixuan Shao
Tianxiang Dai
Chenkai Mao
Kofi Edee
Jonathan A. Fan
AI4CE
68
0
0
26 Mar 2025
ChatBench: From Static Benchmarks to Human-AI Evaluation
Serina Chang
Ashton Anderson
Jake M. Hofman
ELM
AI4MH
57
2
0
22 Mar 2025
SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection
Haoyi Li
Angela Yifei Yuan
Soyeon Caren Han
Christopher Leckie
43
0
0
19 Mar 2025
SOSecure: Safer Code Generation with RAG and StackOverflow Discussions
Manisha Mukherjee
Vincent J. Hellendoorn
SILM
60
1
0
17 Mar 2025
Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs using Semantic Space
Zhiliang Chen
Xinyuan Niu
Chuan-Sheng Foo
Bryan Kian Hsiang Low
50
1
0
14 Mar 2025
RigoChat 2: an adapted language model to Spanish using a bounded dataset and reduced hardware
Gonzalo Santamaría Gómez
Guillem García Subies
Pablo Gutiérrez Ruiz
Mario González Valero
Natàlia Fuertes
...
Nuria Aldama García
David Betancur Sánchez
Kateryna Sushkova
Marta Guerrero Nieto
Á. Jiménez
51
0
0
11 Mar 2025
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs
Ling Team
B. Zeng
C. Huang
Chao Zhang
Changxin Tian
...
Zhaoxin Huan
Zujie Wen
Zhenhang Sun
Zhuoxuan Du
Z. He
MoE
ALM
109
2
0
07 Mar 2025
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
Yisen Li
Lingfeng Yang
Wenxuan Shen
Pan Zhou
Yao Wan
Weiwei Lin
D. Z. Chen
67
0
0
03 Mar 2025
Rethinking LLM Bias Probing Using Lessons from the Social Sciences
Kirsten N. Morehouse
S. Swaroop
Weiwei Pan
43
0
0
28 Feb 2025
Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles
Kuang Wang
X. Li
S. M. I. Simon X. Yang
Li Zhou
Feng Jiang
H. Li
42
0
0
26 Feb 2025
Dataset Featurization: Uncovering Natural Language Features through Unsupervised Data Reconstruction
Michal Bravansky
Vaclav Kubon
Suhas Hariharan
Robert Kirk
62
0
0
24 Feb 2025
FADE: Why Bad Descriptions Happen to Good Features
Bruno Puri
Aakriti Jain
Elena Golimblevskaia
Patrick Kahardipraja
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
70
0
0
24 Feb 2025
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
Jiaxi Li
Xingxing Zhang
Xun Wang
Xiaolong Huang
Li Dong
Liang Wang
Si-Qing Chen
Wei Lu
Furu Wei
SyDa
88
0
0
23 Feb 2025
Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs
Jonathan Rystrøm
Hannah Rose Kirk
Scott A. Hale
44
2
0
23 Feb 2025
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
97
0
0
17 Feb 2025
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
Dylan Zhang
Justin Wang
Tianran Sun
36
0
0
17 Feb 2025
TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents
Geon Lee
Wenchao Yu
Kijung Shin
Wei Cheng
Haifeng Chen
AI4TS
LLMAG
54
3
0
17 Feb 2025
Idiosyncrasies in Large Language Models
Mingjie Sun
Yida Yin
Zhiqiu Xu
J. Zico Kolter
Zhuang Liu
35
4
0
17 Feb 2025
SafeDialBench: A Fine-Grained Safety Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks
Hongye Cao
Yanming Wang
Sijia Jing
Ziyue Peng
Zhixin Bai
...
Yang Gao
Fanyu Meng
Xi Yang
Chao Deng
Junlan Feng
AAML
41
0
0
16 Feb 2025
DeepThink: Aligning Language Models with Domain-Specific User Intents
Yang Li
Mingxuan Luo
Yeyun Gong
Chen Lin
Jian Jiao
Yi Liu
Kaili Huang
LRM
ALM
ELM
52
0
0
08 Feb 2025
fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving
Hanfei Yu
Xingqi Cui
H. M. Zhang
H. Wang
Hao Wang
MoE
52
0
0
07 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
115
3
0
06 Feb 2025
Why human-AI relationships need socioaffective alignment
Hannah Rose Kirk
Iason Gabriel
Chris Summerfield
Bertie Vidgen
Scott A. Hale
40
6
0
04 Feb 2025
Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models
Haoran Ye
T. Zhang
Yuhang Xie
Liyuan Zhang
Yuanyi Ren
Xin Zhang
Guojie Song
PILM
74
0
0
04 Feb 2025
Evaluation of Large Language Models via Coupled Token Generation
N. C. Benz
Stratis Tsirtsis
Eleni Straitouri
Ivi Chatzi
Ander Artola Velasco
Suhas Thejaswi
Manuel Gomez Rodriguez
46
0
0
03 Feb 2025
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
Lester James Validad Miranda
Yizhong Wang
Yanai Elazar
Sachin Kumar
Valentina Pyatkin
Faeze Brahman
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
45
8
0
08 Jan 2025
A Statistical Framework for Ranking LLM-Based Chatbots
Siavash Ameli
Siyuan Zhuang
Ion Stoica
Michael W. Mahoney
ELM
38
1
0
24 Dec 2024
WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models
Huawen Feng
Pu Zhao
Qingfeng Sun
Can Xu
Fangkai Yang
...
Qianli Ma
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
Qi Zhang
AAML
ALM
62
0
0
23 Dec 2024
Clio: Privacy-Preserving Insights into Real-World AI Use
Alex Tamkin
Miles McCain
Kunal Handa
Esin Durmus
Liane Lovitt
...
Wes Mitchell
Shan Carter
Jack Clark
Jared Kaplan
Deep Ganguli
74
12
0
18 Dec 2024
Lightweight Safety Classification Using Pruned Language Models
Mason Sawtell
Tula Masterman
Sandi Besen
Jim Brown
84
2
0
18 Dec 2024
QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs
Mohammad Aflah Khan
Neemesh Yadav
Sarah Masud
Md. Shad Akhtar
71
0
0
16 Dec 2024
Smaller Language Models Are Better Instruction Evolvers
Tingfeng Hui
Lulu Zhao
Guanting Dong
Yaqi Zhang
Hua Zhou
Sen Su
ALM
79
1
0
15 Dec 2024
Multi-Bin Batching for Increasing LLM Inference Throughput
Ozgur Guldogan
Jackson Kunde
Kangwook Lee
Ramtin Pedarsani
LRM
59
2
0
03 Dec 2024
Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan
Zhuang Wang
Zhen Jia
Can Karakus
Luca Zancato
Tri Dao
Ravi Netravali
Yida Wang
90
4
0
28 Nov 2024
From Jack of All Trades to Master of One: Specializing LLM-based Autoraters to a Test Set
M. Finkelstein
Dan Deutsch
Parker Riley
Juraj Juraska
Geza Kovacs
Markus Freitag
71
0
0
23 Nov 2024
Steering Language Model Refusal with Sparse Autoencoders
Kyle O'Brien
David Majercak
Xavier Fernandes
Richard Edgar
Jingya Chen
Harsha Nori
Dean Carignan
Eric Horvitz
Forough Poursabzi-Sangdeh
LLMSV
56
10
0
18 Nov 2024