LMdiff: A Visual Diff Tool to Compare Language Models

2 November 2021
Hendrik Strobelt
Benjamin Hoover
Arvind Satyanarayan
Sebastian Gehrmann
    VLM
arXiv: 2111.01582

Papers citing "LMdiff: A Visual Diff Tool to Compare Language Models"

14 / 14 papers shown
SimCT: A Simple Consistency Test Protocol in LLMs Development Lifecycle
Fufangchen Zhao
Guoqiang Jin
Rui Zhao
Jiangheng Huang
Fei Tan
29
1
0
24 Jul 2024
Interactive Prompt Debugging with Sequence Salience
Ian Tenney
Ryan Mullins
Bin Du
Shree Pandya
Minsuk Kahng
Lucas Dixon
LRM
24
1
0
11 Apr 2024
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts
Adam Joseph Coscia
Alex Endert
VLM
22
9
0
07 Mar 2024
Consistency Matters: Explore LLMs Consistency From a Black-Box Perspective
Fufangchen Zhao
Guoqiang Jin
Jiaheng Huang
Rui Zhao
Fei Tan
25
1
0
27 Feb 2024
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
Minsuk Kahng
Ian Tenney
Mahima Pushkarna
Michael Xieyang Liu
James Wexler
Emily Reif
Krystal Kallarackal
Minsuk Chang
Michael Terry
Lucas Dixon
46
21
0
16 Feb 2024
Visual Analytics for Generative Transformer Models
Raymond Li
Ruixin Yang
Wen Xiao
Ahmed AbuRaed
Gabriel Murray
Giuseppe Carenini
19
1
0
21 Nov 2023
Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
M. Boubdir
Edward Kim
B. Ermiş
Marzieh Fadaee
Sara Hooker
ALM
31
18
0
22 Oct 2023
R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
Daniel D. Johnson
Daniel Tarlow
Christian J. Walder
21
6
0
01 Mar 2023
Visual Comparison of Language Model Adaptation
R. Sevastjanova
E. Cakmak
Shauli Ravfogel
Ryan Cotterell
Mennatallah El-Assady
VLM
23
16
0
17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
Hendrik Strobelt
Albert Webson
Victor Sanh
Benjamin Hoover
Johanna Beyer
Hanspeter Pfister
Alexander M. Rush
VLM
8
134
0
16 Aug 2022
Mediators: Conversational Agents Explaining NLP Model Behavior
Nils Feldhus
A. Ravichandran
Sebastian Möller
25
16
0
13 Jun 2022
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann
Tosin P. Adewumi
Karmanya Aggarwal
Pawan Sasanka Ammanamanchi
Aremu Anuoluwapo
...
Nishant Subramani
Wei-ping Xu
Diyi Yang
Akhila Yerukola
Jiawei Zhou
VLM
246
283
0
02 Feb 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
267
1,808
0
14 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018