2210.10041
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
18 October 2022
Shuo Xie
Jiahao Qiu
Ankita Pasad
Li Du
Qing Qu
Hongyuan Mei
Papers citing "Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning"
20 / 20 papers shown
LLM Internal States Reveal Hallucination Risk Faced With a Query
Ziwei Ji
Delong Chen
Etsuko Ishii
Samuel Cahyawijaya
Yejin Bang
Bryan Wilie
Pascale Fung
HILM
LRM
26
18
0
03 Jul 2024
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Yeqi Gao
Yuzhou Gu
Zhao-quan Song
28
0
0
09 May 2024
Where does In-context Translation Happen in Large Language Models
Suzanna Sia
David Mueller
Kevin Duh
LRM
30
0
0
07 Mar 2024
The Expressibility of Polynomial based Attention Scheme
Zhao-quan Song
Guangyi Xu
Junze Yin
21
5
0
30 Oct 2023
Neural Collapse in Multi-label Learning with Pick-all-label Loss
Pengyu Li
Xiao Li
Yutong Wang
Qing Qu
8
6
0
24 Oct 2023
Generalized Neural Collapse for a Large Number of Classes
Jiachen Jiang
Jinxin Zhou
Peng Wang
Qing Qu
Dustin Mixon
Chong You
Zhihui Zhu
AI4CE
21
20
0
09 Oct 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao-quan Song
Weixin Wang
Junze Yin
15
25
0
14 Sep 2023
GradientCoin: A Peer-to-Peer Decentralized Large Language Models
Yeqi Gao
Zhao-quan Song
Junze Yin
15
18
0
21 Aug 2023
What Do Self-Supervised Speech Models Know About Words?
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
16
26
0
30 Jun 2023
Quantifying the Variability Collapse of Neural Networks
Jing-Xue Xu
Haoxiong Liu
23
4
0
06 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
P. Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
20
17
0
01 Jun 2023
Task-Specific Skill Localization in Fine-tuned Language Models
A. Panigrahi
Nikunj Saunshi
Haoyu Zhao
Sanjeev Arora
MoMe
19
66
0
13 Feb 2023
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
19
109
0
08 Nov 2022
Nearest Class-Center Simplification through Intermediate Layers
Ido Ben-Shaul
S. Dekel
32
26
0
21 Jan 2022
An Unconstrained Layer-Peeled Perspective on Neural Collapse
Wenlong Ji
Yiping Lu
Yiliang Zhang
Zhun Deng
Weijie J. Su
122
65
0
06 Oct 2021
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
221
291
0
24 Feb 2021
Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training
Cong Fang
Hangfeng He
Qi Long
Weijie J. Su
FAtt
114
162
0
29 Jan 2021
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
248
340
0
01 Jan 2021
The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives
Elena Voita
Rico Sennrich
Ivan Titov
182
181
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018