A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
arXiv:2010.03648 · 7 October 2020
Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora
Papers citing "A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks" (21 of 21 papers shown)
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao, Tina Behnia, V. Vakilian, Christos Thrampoulidis
55 · 8 · 0 · 20 Feb 2025

Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru, Yuxin Xie, Xianwei Zhuang, Yuguo Yin, Zhihui Guo, Zhiming Liu, Qianli Ren, Yuexian Zou
83 · 2 · 0 · 10 Feb 2025

Semantic Captioning: Benchmark Dataset and Graph-Aware Few-Shot In-Context Learning for SQL2Text
Ali Al-Lawati, Jason Lucas, Prasenjit Mitra
LMTD · 43 · 0 · 0 · 06 Jan 2025

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
91 · 14 · 0 · 11 Oct 2024

Investigating the Impact of Model Complexity in Large Language Models
Jing Luo, Huiyuan Wang, Weiran Huang
34 · 0 · 0 · 01 Oct 2024

An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei, Yun Cheng Wang, Bin Wang, C.-C. Jay Kuo
20 · 41 · 0 · 10 Mar 2023

On the Provable Advantage of Unsupervised Pretraining
Jiawei Ge, Shange Tang, Jianqing Fan, Chi Jin
SSL · 33 · 16 · 0 · 02 Mar 2023

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma
AI4CE · 34 · 49 · 0 · 25 Oct 2022

Emergent Abilities of Large Language Models
Jason W. Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, ..., Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, J. Dean, W. Fedus
ELM · ReLM · LRM · 48 · 2,333 · 0 · 15 Jun 2022

AANG: Automating Auxiliary Learning
Lucio Dery, Paul Michel, M. Khodak, Graham Neubig, Ameet Talwalkar
34 · 9 · 0 · 27 May 2022

The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen, Yuanzhi Li
SSL · 27 · 34 · 0 · 12 May 2022

Empirical Evaluation and Theoretical Analysis for Representation Learning: A Survey
Kento Nozawa, Issei Sato
AI4TS · 19 · 4 · 0 · 18 Apr 2022

Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining?
Subhabrata Dutta, Jeevesh Juneja, Dipankar Das, Tanmoy Chakraborty
9 · 15 · 0 · 24 Mar 2022

Understanding Contrastive Learning Requires Incorporating Inductive Biases
Nikunj Saunshi, Jordan T. Ash, Surbhi Goel, Dipendra Kumar Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, A. Krishnamurthy
SSL · 19 · 109 · 0 · 28 Feb 2022

Self-Supervised Representation Learning: Introduction, Advances and Challenges
Linus Ericsson, H. Gouk, Chen Change Loy, Timothy M. Hospedales
SSL · OOD · AI4TS · 29 · 270 · 0 · 18 Oct 2021

On the Surrogate Gap between Contrastive and Supervised Losses
Han Bao, Yoshihiro Nagano, Kento Nozawa
SSL · UQCV · 39 · 19 · 0 · 06 Oct 2021

Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph
Nuttapong Chairatanakul, Noppayut Sriwatanasakdi, Nontawat Charoenphakdee, Xin Liu, T. Murata
16 · 4 · 0 · 09 Sep 2021

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig
VLM · SyDa · 31 · 3,828 · 0 · 28 Jul 2021

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei, Sang Michael Xie, Tengyu Ma
22 · 96 · 0 · 17 Jun 2021

Can Pretext-Based Self-Supervised Learning Be Boosted by Downstream Data? A Theoretical Analysis
Jiaye Teng, Weiran Huang, Haowei He
SSL · 26 · 11 · 0 · 05 Mar 2021

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Z. Assylbekov
29 · 8 · 0 · 02 Mar 2021