Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models

3 November 2023
Sean Xie
Soroush Vosoughi
Saeed Hassanpour

Papers citing "Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models"

6 / 6 papers shown
The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model
Brenden Smith
Dallin Baker
Clayton Chase
Myles Barney
Kaden Parker
Makenna Allred
Peter Hu
Alex Evans
Nancy Fulda
14
0
0
04 Jul 2024
Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning
Yuansheng Xie
Soroush Vosoughi
Saeed Hassanpour
14
2
0
30 Mar 2022
Framework for Evaluating Faithfulness of Local Explanations
S. Dasgupta
Nave Frost
Michal Moshkovitz
FAtt
106
60
0
01 Feb 2022
Interactively Providing Explanations for Transformer Language Models
Felix Friedrich
P. Schramowski
Christopher Tauchmann
Kristian Kersting
LRM
31
6
0
02 Sep 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI
FaML
225
3,658
0
28 Feb 2017