2111.01582
LMdiff: A Visual Diff Tool to Compare Language Models
2 November 2021
Hendrik Strobelt
Benjamin Hoover
Arvind Satyanarayan
Sebastian Gehrmann
VLM
Papers citing
"LMdiff: A Visual Diff Tool to Compare Language Models"
14 / 14 papers shown
Title
SimCT: A Simple Consistency Test Protocol in LLMs Development Lifecycle
Fufangchen Zhao
Guoqiang Jin
Rui Zhao
Jiangheng Huang
Fei Tan
29
1
0
24 Jul 2024
Interactive Prompt Debugging with Sequence Salience
Ian Tenney
Ryan Mullins
Bin Du
Shree Pandya
Minsuk Kahng
Lucas Dixon
LRM
24
1
0
11 Apr 2024
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts
Adam Joseph Coscia
Alex Endert
VLM
22
9
0
07 Mar 2024
Consistency Matters: Explore LLMs Consistency From a Black-Box Perspective
Fufangchen Zhao
Guoqiang Jin
Jiaheng Huang
Rui Zhao
Fei Tan
25
1
0
27 Feb 2024
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
Minsuk Kahng
Ian Tenney
Mahima Pushkarna
Michael Xieyang Liu
James Wexler
Emily Reif
Krystal Kallarackal
Minsuk Chang
Michael Terry
Lucas Dixon
46
21
0
16 Feb 2024
Visual Analytics for Generative Transformer Models
Raymond Li
Ruixin Yang
Wen Xiao
Ahmed AbuRaed
Gabriel Murray
Giuseppe Carenini
19
1
0
21 Nov 2023
Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
M. Boubdir
Edward Kim
B. Ermiş
Marzieh Fadaee
Sara Hooker
ALM
31
18
0
22 Oct 2023
R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
Daniel D. Johnson
Daniel Tarlow
Christian J. Walder
21
6
0
01 Mar 2023
Visual Comparison of Language Model Adaptation
R. Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
23
16
0
17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
Hendrik Strobelt
Albert Webson
Victor Sanh
Benjamin Hoover
Johanna Beyer
Hanspeter Pfister
Alexander M. Rush
VLM
8
134
0
16 Aug 2022
Mediators: Conversational Agents Explaining NLP Model Behavior
Nils Feldhus
A. Ravichandran
Sebastian Möller
25
16
0
13 Jun 2022
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
246
283
0
02 Feb 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,808
0
14 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018